ONNX Runtime Session Module
Session objects handle model loading, input/output inspection, and inference execution.
Runtime configuration
onnxruntime.configure(opts)
assert(onnxruntime.configure({
log_severity_level = 2,
log_id = "my-runtime",
use_global_thread_pools = false,
global_intra_op_num_threads = 0,
global_inter_op_num_threads = 0,
}))
Notes:
- It must be called before creating any session
- Calling it again after an active session already exists returns an error
Supported fields:
- log_severity_level
- log_id
- use_global_thread_pools
- global_intra_op_num_threads
- global_inter_op_num_threads
Create a session
onnxruntime.session(model_path[, opts])
session, err = onnxruntime.session(model_path, opts)
Loads an ONNX model from a file path.
onnxruntime.session_from_bytes(model_bytes[, opts])
session, err = onnxruntime.session_from_bytes(model_bytes, opts)
Creates a session from in-memory model bytes.
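For example, a model can be read into a string and loaded without a second disk access (the file path and the plain io.open read are illustrative; any byte source works):

```lua
local ort = require("onnxruntime")

-- Read the model file into a Lua string; an embedded blob or a
-- downloaded buffer would work the same way.
local f = assert(io.open("models/demo/model.onnx", "rb"))
local model_bytes = f:read("*a")
f:close()

local session, err = ort.session_from_bytes(model_bytes, {
  providers = "cpu",
})
assert(session, err)
```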
Session options
General fields
- providers (or provider): accepts a string or an array of strings; the currently handled provider strings are "cpu" and "coreml", and aliases such as CPUExecutionProvider and CoreMLExecutionProvider are also accepted
- fallback_to_cpu: boolean, default true
- intra_op_num_threads
- inter_op_num_threads
- log_id
- session_log_severity_level
- session_log_verbosity_level
- optimized_model_path
- profile_file_prefix
- free_dimension_overrides
- config_entries
- graph_optimization_level: one of "disable", "basic", "extended", or "all"
- execution_mode: one of "sequential" or "parallel"
- deterministic_compute
- disable_per_session_threads
- enable_cpu_mem_arena
- enable_mem_pattern
- custom_op_libraries
Additional notes:
- free_dimension_overrides must be an array table, and each item must look like { by = "name"|"denotation", key = "...", value = integer }
- config_entries must be a string-to-string table
- custom_op_libraries may be a single path string, an array of paths, or handles returned by load_custom_op_library(); arrays may mix paths and handles
- If providers is omitted, or the provider list is empty, the current implementation adds the CPU provider by default
- When "coreml" is present and fallback_to_cpu = true, session creation can fall back to CPU if CoreML provider initialization fails
- If you explicitly pass providers = {"coreml", "cpu"}, that order means CoreML first, CPU second
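A sketch combining several of the fields above (the model path, the "batch" dimension name, and the config entry key are placeholders, not values the binding requires):

```lua
local ort = require("onnxruntime")

local session = assert(ort.session("models/demo/model.onnx", {
  providers = {"coreml", "cpu"},      -- CoreML first, CPU second
  fallback_to_cpu = true,
  graph_optimization_level = "all",
  execution_mode = "sequential",
  -- Pin a symbolic batch dimension; "batch" is a placeholder name
  -- that must match a free dimension in the actual model.
  free_dimension_overrides = {
    { by = "name", key = "batch", value = 1 },
  },
  -- config_entries must map strings to strings; the key below is
  -- only an illustration.
  config_entries = { ["session.example_key"] = "1" },
}))
```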
CoreML provider fields
When "coreml" is included in providers, these fields are also supported:
- coreml_compute_units: prefer "all", "cpu_only", "cpu_and_gpu", or "cpu_and_neural_engine"; the parser also accepts aliases such as CPUOnly, CPUAndGPU, CPUAndNeuralEngine, and MLComputeUnits...
- coreml_create_mlprogram
- coreml_require_static_input_shapes
- coreml_enable_on_subgraph
- coreml_flags
- coreml_use_cpu_only
- coreml_use_cpu_and_gpu
- coreml_only_enable_device_with_ane
Additional notes:
- coreml_flags, coreml_use_cpu_only, coreml_use_cpu_and_gpu, and coreml_only_enable_device_with_ane are legacy compatibility fields
- New and legacy fields can be mixed, but session creation fails immediately if they express conflicting meanings
- coreml_only_enable_device_with_ane cannot be combined with mutually exclusive settings such as coreml_compute_units = "cpu_only" or "cpu_and_gpu"
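To illustrate the conflict rule (the model path is a placeholder):

```lua
local ort = require("onnxruntime")

-- Valid: new-style fields that agree with each other.
local session = assert(ort.session("models/demo/model.onnx", {
  providers = {"coreml", "cpu"},
  coreml_compute_units = "cpu_and_neural_engine",
  coreml_create_mlprogram = true,
}))

-- Invalid: the legacy ANE-only flag contradicts "cpu_only", so
-- session creation is expected to fail immediately.
local s, err = ort.session("models/demo/model.onnx", {
  providers = {"coreml"},
  coreml_compute_units = "cpu_only",
  coreml_only_enable_device_with_ane = true,
})
assert(s == nil and err ~= nil)
```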
Session object methods
Basic inspection
- session:input_names()
- session:output_names()
- session:overridable_initializer_names()
- session:input_count()
- session:output_count()
- session:overridable_initializer_count()
Type information
- session:input_info(name_or_index)
- session:output_info(name_or_index)
- session:overridable_initializer_info(name_or_index)
The returned table may contain:
- name
- onnx_type
- is_sparse
- data_type
- type
- has_shape
- shape
- symbolic_shape
- element
- key_type
- value
Notes:
- tensor / sparse tensor entries include
data_type,shape, andsymbolic_shape - sequence / optional entries include nested
element - map entries include
key_typeand nestedvalue
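The type information can drive generic pre-processing. A sketch that prints each input's shape, assuming 1-based indexing as is conventional in Lua (the model path is illustrative, and how unknown dimensions are represented in shape follows the binding):

```lua
local ort = require("onnxruntime")
local session = assert(ort.session("models/demo/model.onnx"))

for i = 1, session:input_count() do
  local info = session:input_info(i)
  if info.has_shape then
    -- shape holds concrete dimensions; symbolic_shape keeps the
    -- dimension names from the model.
    print(info.name, info.data_type, table.concat(info.shape, "x"))
  else
    print(info.name, info.onnx_type)
  end
end
```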
Memory information
- session:memory_info_for_inputs()
- session:memory_info_for_outputs()
The result supports both numeric-order and name-based access. Individual entries commonly contain:
- name
- id
- mem_type
- allocator_type
- device_type
- device_mem_type
- vendor_id
Metadata and lifecycle
- session:metadata()
- session:close()
- session:end_profiling()
- session:profiling_start_time_ns()
- session:set_ep_dynamic_options(opts)
- session:register_custom_op_library(path_or_handle)
Notes:
- end_profiling() returns the profiling output file path
- set_ep_dynamic_options() stringifies all keys and values before passing them to ORT
- register_custom_op_library() rebuilds the underlying session using the current session options plus the new library
- path_or_handle may be either a file path or a handle returned by load_custom_op_library()
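A profiling lifecycle sketch; the assumption that setting profile_file_prefix is what enables profiling, and the model path itself, are illustrative:

```lua
local ort = require("onnxruntime")
local session = assert(ort.session("models/demo/model.onnx", {
  -- Assumed to enable profiling with this filename prefix.
  profile_file_prefix = "demo_profile",
}))

-- ... run inference here ...

print("profiling started at", session:profiling_start_time_ns())
local profile_path = session:end_profiling()  -- documented to return the file path
print("profile written to", profile_path)

session:close()  -- release the underlying ORT session explicitly
```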
Run inference
session:run(inputs[, output_names[, run_options]])
outputs, err = session:run({
input_ids = input_tensor,
attention_mask = mask_tensor,
}, {
"logits",
}, run_options)
session:run_into(inputs, outputs[, run_options])
outputs, err = session:run_into({
x = input_tensor,
}, {
y = reused_output_tensor,
}, run_options)
session:run_with_iobinding(binding[, run_options])
outputs, err = session:run_with_iobinding(binding, run_options)
Input rules:
- inputs can be a positional array or a dictionary keyed by input name
- Positional arrays follow model input order and can continue with overridable initializers
- Dictionary keys must match input names or overridable initializer names
- Optional inputs can be omitted, or passed as onnxruntime.optional(nil, type_info)
Output rules:
- The return value is a table
- Each output can be accessed both by numeric index and by output name
- If run_into() reuses an existing output tensor, the returned table contains the same object
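The same call can also be made positionally; a sketch assuming a model whose inputs are, in order, x and bias, and whose first output is named "y":

```lua
local ort = require("onnxruntime")
local session = assert(ort.session("models/demo/model.onnx"))
local x = assert(ort.tensor("float32", {1, 2}, {1.0, 2.0}))
local bias = assert(ort.tensor("float32", {1, 2}, {0.5, -0.5}))

-- Positional form: tensors follow the model's declared input order.
-- Omitting output_names requests every model output.
local outputs = assert(session:run({ x, bias }))

-- The result supports both access styles; assuming the first output
-- is named "y", both expressions reach the same entry.
print(outputs[1], outputs.y)
```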
Run options
onnxruntime.run_options([opts])
local run_options = assert(onnxruntime.run_options({
tag = "session-run",
log_severity_level = 2,
log_verbosity_level = 1,
}))
Supported fields:
- tag
- log_severity_level
- log_verbosity_level
Object methods:
- run_options:tag([value])
- run_options:log_severity_level([value])
- run_options:log_verbosity_level([value])
- run_options:terminate()
- run_options:reset_terminate()
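A sketch of the accessor and termination methods; the getter/setter convention (call with a value to set, without to read) is assumed from the [value] signatures:

```lua
local ort = require("onnxruntime")
local run_options = assert(ort.run_options({ tag = "cancellable" }))

-- Assumed getter/setter pairs per the [value] signatures above.
run_options:log_severity_level(1)
print(run_options:tag())

-- Ask any run using these options to stop as soon as possible, then
-- clear the flag so the options can be reused for later runs.
run_options:terminate()
run_options:reset_terminate()
```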
IOBinding
session:create_io_binding()
binding, err = session:create_io_binding()
binding:bind_input(name, value)
Binds an input value. Empty optionals are not accepted here.
binding:bind_output(name[, spec_or_tensor])
Supports three forms:
- binding:bind_output("y") — binds to CPU memory; retrieve the result later with get_outputs()
- binding:bind_output("y", existing_tensor) — writes directly into an existing tensor
- binding:bind_output("y", {type = "float32", shape = {1, 2}}) — creates and returns a new output tensor through the API
It also supports:
binding:bind_output("y", {mode = "device"})
Other methods
- binding:clear_inputs()
- binding:clear_outputs()
- binding:synchronize_inputs()
- binding:synchronize_outputs()
- binding:get_outputs()
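Putting the pieces together in one flow; the model path and the input/output names "x" and "y" are illustrative:

```lua
local ort = require("onnxruntime")
local session = assert(ort.session("models/demo/model.onnx"))
local binding = assert(session:create_io_binding())

local x = assert(ort.tensor("float32", {1, 2}, {1.0, 2.0}))
binding:bind_input("x", x)
binding:bind_output("y")   -- CPU memory, fetched via get_outputs()

local outputs = assert(session:run_with_iobinding(binding))
binding:synchronize_outputs()  -- relevant when outputs live on a device
local y = binding:get_outputs()[1]
print(y)

-- Unbind everything so the binding can be reused for the next run.
binding:clear_inputs()
binding:clear_outputs()
```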
Example
local ort = require("onnxruntime")
local session = assert(ort.session(XXT_HOME_PATH.."/models/demo/model.onnx", {
providers = {"coreml", "cpu"},
fallback_to_cpu = true,
coreml_compute_units = "all",
}))
local x = assert(ort.tensor("float32", {1, 2}, {1.0, 2.0}))
local bias = assert(ort.tensor("float32", {1, 2}, {0.5, -0.5}))
local run_options = assert(ort.run_options({tag = "demo"}))
local outputs = assert(session:run({
x = x,
bias = bias,
}, {"y"}, run_options))
print(outputs.y:to_table()[1])