ONNX Runtime Session Module

Session objects handle model loading, input/output inspection, and inference execution.

Runtime configuration

onnxruntime.configure(opts)

assert(onnxruntime.configure({
  log_severity_level = 2,
  log_id = "my-runtime",
  use_global_thread_pools = false,
  global_intra_op_num_threads = 0,
  global_inter_op_num_threads = 0,
}))

Notes:

  • configure() must be called before any session is created
  • Calling it again while a session exists returns an error

Supported fields:

  • log_severity_level
  • log_id
  • use_global_thread_pools
  • global_intra_op_num_threads
  • global_inter_op_num_threads

Create a session

onnxruntime.session(model_path[, opts])

session, err = onnxruntime.session(model_path, opts)

Loads an ONNX model from a file path.

onnxruntime.session_from_bytes(model_bytes[, opts])

session, err = onnxruntime.session_from_bytes(model_bytes, opts)

Creates a session from in-memory model bytes.
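
This pairs naturally with standard Lua file I/O. A minimal sketch; the model path and the "cpu" provider choice are placeholder assumptions:

local ort = require("onnxruntime")

-- Read the whole model file into a Lua string, then create the session
-- from those bytes instead of a path.
local f = assert(io.open("models/demo/model.onnx", "rb"))
local model_bytes = f:read("*a")
f:close()

local session, err = ort.session_from_bytes(model_bytes, {providers = "cpu"})
assert(session, err)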

Session options

General fields

  • providers or provider Accepts a string or an array of strings. The currently handled provider strings are "cpu" and "coreml"; aliases such as CPUExecutionProvider and CoreMLExecutionProvider are also accepted
  • fallback_to_cpu Boolean, default true
  • intra_op_num_threads
  • inter_op_num_threads
  • log_id
  • session_log_severity_level
  • session_log_verbosity_level
  • optimized_model_path
  • profile_file_prefix
  • free_dimension_overrides
  • config_entries
  • graph_optimization_level One of "disable", "basic", "extended", or "all"
  • execution_mode One of "sequential" or "parallel"
  • deterministic_compute
  • disable_per_session_threads
  • enable_cpu_mem_arena
  • enable_mem_pattern
  • custom_op_libraries

Additional notes:

  • free_dimension_overrides must be an array table, and each item must look like { by = "name"|"denotation", key = "...", value = integer }
  • config_entries must be a string-to-string table
  • custom_op_libraries may be a single path string, an array of paths, or handles returned by load_custom_op_library(); arrays may mix paths and handles
  • If providers is omitted, or the provider list is empty, the current implementation adds the CPU provider by default
  • When "coreml" is present and fallback_to_cpu = true, session creation can fall back to CPU if CoreML provider initialization fails
  • If you explicitly pass providers = {"coreml", "cpu"}, that order means CoreML first, CPU second
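
A hedged sketch tying the field shapes above together; the model path, the "batch" dimension name, and the config key are illustrative placeholders, not values this module requires:

local session = assert(onnxruntime.session("model.onnx", {
  providers = {"cpu"},
  intra_op_num_threads = 4,
  graph_optimization_level = "all",
  execution_mode = "sequential",
  -- Pin a symbolic dimension to a concrete size.
  free_dimension_overrides = {
    { by = "name", key = "batch", value = 1 },
  },
  -- String-to-string table passed through to ORT session configuration.
  config_entries = {
    ["session.use_env_allocators"] = "1",
  },
}))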

CoreML provider fields

When "coreml" is included in providers, these fields are also supported:

  • coreml_compute_units Prefer "all", "cpu_only", "cpu_and_gpu", or "cpu_and_neural_engine"; the parser also accepts aliases such as CPUOnly, CPUAndGPU, CPUAndNeuralEngine, and MLComputeUnits...
  • coreml_create_mlprogram
  • coreml_require_static_input_shapes
  • coreml_enable_on_subgraph
  • coreml_flags
  • coreml_use_cpu_only
  • coreml_use_cpu_and_gpu
  • coreml_only_enable_device_with_ane

Additional notes:

  • coreml_flags, coreml_use_cpu_only, coreml_use_cpu_and_gpu, and coreml_only_enable_device_with_ane are legacy compatibility fields
  • New and legacy fields can be mixed, but session creation fails immediately if they express conflicting meanings
  • coreml_only_enable_device_with_ane cannot be combined with mutually exclusive settings such as coreml_compute_units = "cpu_only" or "cpu_and_gpu"
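
For example, requesting an ML Program that may use the Neural Engine, with CPU fallback; the model path and compute-unit choice are illustrative:

local session = assert(onnxruntime.session("model.onnx", {
  providers = {"coreml", "cpu"},
  fallback_to_cpu = true,
  coreml_compute_units = "cpu_and_neural_engine",
  coreml_create_mlprogram = true,
}))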

Session object methods

Basic inspection

  • session:input_names()
  • session:output_names()
  • session:overridable_initializer_names()
  • session:input_count()
  • session:output_count()
  • session:overridable_initializer_count()
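
These methods can be used to enumerate a loaded model's interface. A sketch, assuming a session created as shown earlier:

print("inputs:", session:input_count(), "outputs:", session:output_count())
for i, name in ipairs(session:input_names()) do
  print("input", i, name)
end
for i, name in ipairs(session:output_names()) do
  print("output", i, name)
end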

Type information

  • session:input_info(name_or_index)
  • session:output_info(name_or_index)
  • session:overridable_initializer_info(name_or_index)

The returned table may contain:

  • name
  • onnx_type
  • is_sparse
  • data_type
  • type
  • has_shape
  • shape
  • symbolic_shape
  • element
  • key_type
  • value

Notes:

  • tensor / sparse tensor entries include data_type, shape, and symbolic_shape
  • sequence / optional entries include nested element
  • map entries include key_type and nested value
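
A sketch of inspecting the first input's type information; which fields are actually present depends on the model, per the notes above:

local info = assert(session:input_info(1))
print(info.name, info.onnx_type)
if info.has_shape then
  for i, dim in ipairs(info.shape) do
    -- symbolic_shape may carry a name for a dynamic dimension
    print(i, dim, info.symbolic_shape and info.symbolic_shape[i])
  end
end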

Memory information

  • session:memory_info_for_inputs()
  • session:memory_info_for_outputs()

The result supports both numeric-order and name-based access. Individual entries commonly contain:

  • name
  • id
  • mem_type
  • allocator_type
  • device_type
  • device_mem_type
  • vendor_id
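
A sketch using the numeric-order and name-based access described above; exact field values depend on the execution provider in use:

local mem = assert(session:memory_info_for_inputs())
local first = mem[1]
print(first.name, first.device_type, first.mem_type, first.allocator_type)
-- name-based access to the same entry
print(mem[first.name].device_type)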

Metadata and lifecycle

  • session:metadata()
  • session:close()
  • session:end_profiling()
  • session:profiling_start_time_ns()
  • session:set_ep_dynamic_options(opts)
  • session:register_custom_op_library(path_or_handle)

Notes:

  • end_profiling() returns the profiling output file path
  • set_ep_dynamic_options() stringifies all keys and values before passing them to ORT
  • register_custom_op_library() rebuilds the underlying session using the current session options plus the new library
  • path_or_handle may be either a file path or a handle returned by load_custom_op_library()
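
A hedged lifecycle sketch; metadata() is assumed to return a plain table of model metadata, and end_profiling() is only meaningful if profiling was enabled via profile_file_prefix:

for k, v in pairs(assert(session:metadata())) do
  print(k, v)
end

local profile_path = assert(session:end_profiling())
print("profile written to:", profile_path)

session:close()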

Run inference

session:run(inputs[, output_names[, run_options]])

outputs, err = session:run({
  input_ids = input_tensor,
  attention_mask = mask_tensor,
}, {
  "logits",
}, run_options)

session:run_into(inputs, outputs[, run_options])

outputs, err = session:run_into({
  x = input_tensor,
}, {
  y = reused_output_tensor,
}, run_options)

session:run_with_iobinding(binding[, run_options])

outputs, err = session:run_with_iobinding(binding, run_options)

Input rules:

  • inputs can be a positional array or a dictionary keyed by input name
  • Positional arrays follow model input order and can continue with overridable initializers
  • Dictionary keys must match input names or overridable initializer names
  • Optional inputs can be omitted, or passed as onnxruntime.optional(nil, type_info)

Output rules:

  • The return value is a table
  • Each output can be accessed both by numeric index and by output name
  • If run_into() reuses an existing output tensor, the returned table contains the same object

Run options

onnxruntime.run_options([opts])

local run_options = assert(onnxruntime.run_options({
  tag = "session-run",
  log_severity_level = 2,
  log_verbosity_level = 1,
}))

Supported fields:

  • tag
  • log_severity_level
  • log_verbosity_level

Object methods:

  • run_options:tag([value])
  • run_options:log_severity_level([value])
  • run_options:log_verbosity_level([value])
  • run_options:terminate()
  • run_options:reset_terminate()
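
The accessor methods read the current value when called without an argument and set it when given one (an assumption based on the [value] signatures above):

local ro = assert(onnxruntime.run_options({tag = "batch-1"}))
print(ro:tag())        -- read the current tag
ro:tag("batch-2")      -- set a new tag
ro:terminate()         -- ask in-flight runs using these options to stop
ro:reset_terminate()   -- clear the terminate flag so the options can be reused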

IOBinding

session:create_io_binding()

binding, err = session:create_io_binding()

binding:bind_input(name, value)

Binds an input value. Empty optionals are not accepted here.

binding:bind_output(name[, spec_or_tensor])

Supports three forms:

  • binding:bind_output("y") Binds to CPU memory and retrieves it later with get_outputs()
  • binding:bind_output("y", existing_tensor) Writes directly into an existing tensor
  • binding:bind_output("y", {type = "float32", shape = {1, 2}}) Creates and returns a new output tensor through the API

It also supports:

  • binding:bind_output("y", {mode = "device"})

Other methods

  • binding:clear_inputs()
  • binding:clear_outputs()
  • binding:synchronize_inputs()
  • binding:synchronize_outputs()
  • binding:get_outputs()
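
Putting the binding calls together; the input/output names, tensor shape, and run_options are placeholders from the earlier examples:

local binding = assert(session:create_io_binding())
assert(binding:bind_input("x", input_tensor))

-- Ask the API to allocate a new output tensor matching this spec.
local y = assert(binding:bind_output("y", {type = "float32", shape = {1, 2}}))

local outputs = assert(session:run_with_iobinding(binding, run_options))

binding:clear_inputs()
binding:clear_outputs()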

Example

local ort = require("onnxruntime")

local session = assert(ort.session(XXT_HOME_PATH.."/models/demo/model.onnx", {
  providers = {"coreml", "cpu"},
  fallback_to_cpu = true,
  coreml_compute_units = "all",
}))

local x = assert(ort.tensor("float32", {1, 2}, {1.0, 2.0}))
local bias = assert(ort.tensor("float32", {1, 2}, {0.5, -0.5}))
local run_options = assert(ort.run_options({tag = "demo"}))

local outputs = assert(session:run({
  x = x,
  bias = bias,
}, {"y"}, run_options))

print(outputs.y:to_table()[1])