
ONNX Runtime Module

The onnxruntime module loads and runs ONNX models directly on device, and is suited to text, embedding, classification, detection, and general-purpose tensor-inference workflows.

Only supported on iOS 13 and later

Load the module

local ort = require("onnxruntime")

Unlike coreml, this is not a built-in global module; it is loaded on demand.

After require("onnxruntime") succeeds, it also injects two native copy bridges into the built-in coreml APIs:

  • coreml.multi_array_from_ort_tensor(tensor[, data_type])
  • multi_array:to_ort_tensor([data_type])

Both conversions are native-layer copies and do not go through Lua tables.

Module-level functions

Runtime and basic information

  • onnxruntime.version()
  • onnxruntime.providers()
  • onnxruntime.configure(opts)

Notes:

  • providers() returns the execution providers actually available in the current ORT runtime
  • configure() sets global runtime defaults and must be called before creating any session
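As a minimal sketch of the startup order these notes imply (the configure() option name shown is an assumption; the actual supported keys are not documented here):

```lua
local ort = require("onnxruntime")

print(ort.version())

-- List the execution providers the current runtime actually reports.
for _, provider in ipairs(ort.providers()) do
  print(provider)
end

-- configure() sets global defaults and must run before any session exists.
ort.configure({
  log_level = "warning",  -- hypothetical key, for illustration only
})
```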

Tensors, images, and numeric helpers

  • onnxruntime.tensor(type, shape[, data])
  • onnxruntime.tensor_from_bytes(type, shape, bytes)
  • onnxruntime.tensor_from_cv_mat(mat[, opts])
  • onnxruntime.tensor_from_quad(mat, quad[, opts])
  • onnxruntime.tensor_from_quads(mat, quads[, opts])
  • onnxruntime.tensor_from_image(image[, opts])
  • onnxruntime.tensor_from_images(images[, opts])
  • onnxruntime.image_from_tensor(tensor[, opts])
  • onnxruntime.clamp(tensor, min, max)
  • onnxruntime.sigmoid(tensor)
  • onnxruntime.exp(tensor)
  • onnxruntime.where(condition, x, y)
  • onnxruntime.matmul(lhs, rhs)
  • onnxruntime.concat(tensors[, axis])
  • onnxruntime.stack(tensors[, axis])

Notes:

  • clamp(), sigmoid(), exp(), and matmul() are module-level equivalents of the same-named tensor methods, with the tensor passed as the first argument
  • where() supports mixing scalars, booleans, and tensors, and applies broadcasting rules
  • Image preprocessing, OpenCV bridges, and image_from_tensor() details are documented in the Tensor Module
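A short sketch of the tensor helpers above, assuming row-major flat data tables (the exact data-layout convention is not spelled out in this section):

```lua
local ort = require("onnxruntime")

-- A 2x3 float tensor from a flat Lua array.
local a = ort.tensor("float32", {2, 3}, {1, 2, 3, 4, 5, 6})

local s = ort.sigmoid(a)        -- same as a:sigmoid()
local c = ort.clamp(a, 0, 4)    -- values limited to [0, 4]

-- matmul: (2x3) x (3x2) -> (2x2)
local b = ort.tensor("float32", {3, 2}, {1, 0, 0, 1, 1, 1})
local m = ort.matmul(a, b)

-- where() mixes a bool condition tensor with scalar branches,
-- broadcasting the scalars over the condition's shape.
local cond = ort.tensor("bool", {2, 3},
                        {true, false, true, false, true, false})
local masked = ort.where(cond, 1.0, 0.0)
```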

Detection, decoding, and post-processing helpers

  • onnxruntime.nms(boxes, scores[, opts])
  • onnxruntime.box_points(rotated_boxes)
  • onnxruntime.xywh_to_xyxy(boxes)
  • onnxruntime.xyxy_to_xywh(boxes)
  • onnxruntime.rotated_iou(lhs_box, rhs_box)
  • onnxruntime.rotated_nms(boxes, scores[, opts])
  • onnxruntime.create_decoder(schema)
  • onnxruntime.decode_yolo(output[, opts])
  • onnxruntime.decode_yolo_obb(output[, opts])
  • onnxruntime.decode_matrix_candidates(output, schema[, opts])
  • onnxruntime.decode_dense_detection(output, opts)
  • onnxruntime.records_from_boxes(boxes, scores, class_ids[, keep_indices])
  • onnxruntime.obb_records_from_rows(rows, scores, class_ids[, angles[, keep_indices[, opts]]])
  • onnxruntime.points_to_records(points[, opts])
  • onnxruntime.threshold_masks(masks, threshold)
  • onnxruntime.crop_masks_by_boxes(masks, boxes)
  • onnxruntime.resize_masks(masks, width, height[, opts])
  • onnxruntime.mask_iou(lhs_mask, rhs_mask)
  • onnxruntime.mask_to_polygon(mask[, opts])
  • onnxruntime.proto_masks(proto, coeffs, boxes, image_width, image_height[, opts])
  • onnxruntime.project_masks(proto, coeffs, boxes, image_width, image_height[, opts])
  • onnxruntime.db_postprocess(score_map[, opts])
  • onnxruntime.tracker([opts])
  • onnxruntime.reshape_keypoints(points[, keypoint_count[, keypoint_dim|opts]])
  • onnxruntime.scale_boxes(boxes, transform)
  • onnxruntime.clip_boxes(boxes, clip_width, clip_height)
  • onnxruntime.scale_points(points, transform[, opts])
  • onnxruntime.scale_keypoints(points, transform[, opts])
  • onnxruntime.clip_keypoints(points, clip_width, clip_height[, opts])
  • onnxruntime.ctc_greedy_decode(logits[, opts])
  • onnxruntime.sample_logits(logits[, opts])

Notes:

  • tensor_from_quad() / tensor_from_quads() need require("image.cv") to have been called first; they are useful when OCR pipelines crop quadrilateral regions directly into tensors
  • box_points() takes a rotated-box tensor shaped [5], [1, 5], or [N, 5]; it does not take five separate scalar arguments
  • create_decoder() returns a decoder object with :decode(), :task(), and :schema()
  • tracker() returns a tracker object with :update(), :reset(), :state(), and :close()
  • records_from_boxes(), obb_records_from_rows(), and points_to_records() reshape tensor outputs into Lua-friendly record tables
  • proto_masks() and project_masks() currently share the same implementation; project_masks() is an alias
  • mask_iou() computes the intersection-over-union between two masks directly, and also accepts a third opts argument with compare_size = true, or explicit width / height
  • db_postprocess() is intended for DB / DBNet-style text-detection post-processing; each returned detection contains score, points, and box
  • decode_dense_detection() requires opts.strides, and it must be a non-empty array of positive integers; it also requires decode_width and decode_height, and currently supports only box_encoding = "grid_center_log_wh"
  • ctc_greedy_decode() supports blank_index, merge_repeated, apply_softmax, return_probabilities, and charset
  • ctc_greedy_decode() always returns indices; text appears only when charset is provided; confidence appears only when apply_softmax or return_probabilities is enabled; probabilities and probability_confidence appear only when return_probabilities is enabled
  • nms() and rotated_nms() return int64 tensors with 1-based indices
  • sample_logits() supports argmax, temperature, top_k, top_p, min_p, and seed
  • sample_logits() returns a scalar index for 1D logits, and an int64 tensor for batched logits
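Putting a few of these helpers together, a typical detection post-processing pass might look like the sketch below. The decode_yolo() result fields (boxes, scores, class_ids) and the option names are assumptions for illustration; `output` stands for a raw model-output tensor obtained from a session run:

```lua
local ort = require("onnxruntime")

-- Decode raw YOLO head output into candidate detections.
-- (conf_threshold and the returned field names are assumed here.)
local det = ort.decode_yolo(output, { conf_threshold = 0.25 })

-- nms() returns an int64 tensor of 1-based keep indices.
local keep = ort.nms(det.boxes, det.scores, { iou_threshold = 0.45 })

-- Reshape the surviving tensors into Lua-friendly record tables.
local records = ort.records_from_boxes(det.boxes, det.scores,
                                       det.class_ids, keep)
for _, r in ipairs(records) do
  print(r.class_id, r.score)
end
```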

Structured values

  • onnxruntime.value(value)
  • onnxruntime.optional(value, type_info)
  • onnxruntime.sequence(items)
  • onnxruntime.map(key_type, value_type, pairs)
  • onnxruntime.sparse_tensor(type, dense_shape, indices, values)
  • onnxruntime.sparse_tensor_from_dense(tensor)

These are useful when model inputs or outputs are not plain tensors — for example optionals, sequences, maps, or sparse tensors.

The current behavior can be summarized like this:

  • onnxruntime.value(x): if x is already an ORT tensor / value / sequence / map / sparse tensor, it is returned as-is; if x is a Lua table, it becomes a sequence; otherwise the scalar is wrapped as a tensor
  • onnxruntime.optional(value, type_info): the second argument is required; type_info may be a string or a type-info table such as the result of session:input_info(...) / output_info(...); an empty optional is written as onnxruntime.optional(nil, type_info)
  • onnxruntime.map(key_type, value_type, pairs): at the moment, key_type only supports "string" or "int64"
  • onnxruntime.sparse_tensor(type, dense_shape, indices, values): currently only numeric / bool sparse tensors are supported, built in COO form; indices may be flat or nested
  • onnxruntime.sparse_tensor_from_dense(tensor): currently does not support string tensors

Common object methods:

  • value:type() / value:has_value() / value:get()
  • sequence:length() / sequence:get(i) / sequence:items()
  • map:get(key) / map:set(key, value) / map:keys() / map:pairs()
  • sparse_tensor:dense_shape() / sparse_tensor:values() / sparse_tensor:indices() / sparse_tensor:format() / sparse_tensor:to_dense()
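A small sketch of these constructors and methods; the pairs-table shape passed to map() and the string form of type_info are assumptions:

```lua
local ort = require("onnxruntime")

-- Scalars are wrapped as tensors; Lua tables become sequences.
local v = ort.value(3.14)
print(v:type())

local seq = ort.sequence({
  ort.tensor("float32", {2}, {1, 2}),
  ort.tensor("float32", {2}, {3, 4}),
})
print(seq:length())            -- number of items in the sequence

-- key_type currently must be "string" or "int64".
local m = ort.map("string", "float32", { cat = 0.9, dog = 0.1 })
print(m:get("cat"))

-- An empty optional requires explicit type info
-- (exact string format is an assumption here).
local opt = ort.optional(nil, "float32")
print(opt:has_value())         -- an empty optional reports no value
```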

Sessions and inference

  • onnxruntime.session(model_path[, opts])
  • onnxruntime.session_from_bytes(model_bytes[, opts])
  • onnxruntime.run_options([opts])
  • onnxruntime.load_custom_op_library(path)

Session objects handle model loading, input/output inspection, inference execution, and IOBinding. See the Session Module.
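A minimal session sketch, assuming a session:run() call that accepts a name-keyed input table (the run() signature and the input/output names are assumptions; see the Session Module for the actual API):

```lua
local ort = require("onnxruntime")

local session = assert(ort.session(
  XXT_HOME_PATH.."/models/demo/model.onnx",
  { providers = { "coreml", "cpu" } }
))

-- Build a dummy input; name and shape are model-specific placeholders.
local input = ort.tensor("float32", {1, 3, 224, 224})
local outputs = assert(session:run({ input = input }))
```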

Supported element types

The tensor APIs currently support these type names:

  • "float32" / "float"
  • "float16"
  • "bfloat16"
  • "uint8"
  • "uint16"
  • "uint32"
  • "uint64"
  • "int8"
  • "int16"
  • "int32"
  • "int64"
  • "double" / "float64"
  • "bool"
  • "string"

Notes:

  • tensor_from_bytes() and copy_from_bytes() only support numeric and bool tensors
  • bytes() is not available for string tensors
  • tensor:to("string") currently only supports string -> string
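A brief illustration of the type-name aliases and the raw-byte restrictions noted above; the tensor:bytes() accessor is inferred from the notes rather than documented in this section:

```lua
local ort = require("onnxruntime")

-- "float" and "float32" name the same element type.
local t = ort.tensor("float32", {4}, {1, 2, 3, 4})

-- Raw-byte round trips work for numeric and bool tensors only.
local raw = t:bytes()
local t2 = ort.tensor_from_bytes("float32", {4}, raw)

-- String tensors are constructed from Lua strings and cannot
-- go through bytes() / tensor_from_bytes().
local s = ort.tensor("string", {2}, {"a", "b"})
```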

Provider notes

onnxruntime.providers() returns what the runtime reports as available, but the provider strings currently recognized by the session option parser are:

  • "cpu"
  • "coreml"

Notes:

  • provider / providers also accept aliases such as CPUExecutionProvider and CoreMLExecutionProvider; they are normalized internally to "cpu" and "coreml"
  • If no provider is specified, or the provider list is empty, session creation appends the CPU provider automatically
  • If the provider list contains "coreml" and fallback_to_cpu = true, the implementation may also append CPU as a fallback path
  • If you pass providers = {"coreml", "cpu"}, the session tries CoreML first and CPU second
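For example, the alias normalization and CPU fallback described above combine like this (`model_path` is a placeholder for a real model file):

```lua
local ort = require("onnxruntime")

-- "CoreMLExecutionProvider" is normalized to "coreml" internally.
-- With fallback_to_cpu = true, CPU may be appended as a fallback
-- path behind CoreML.
local session = assert(ort.session(model_path, {
  providers = { "CoreMLExecutionProvider" },
  fallback_to_cpu = true,
}))
```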

Working with CoreML

If you want to reuse the coreml tokenizer or an MLMultiArray preprocessing pipeline, a common pattern is:

local ort = require("onnxruntime")

local tokenizer = assert(coreml.new_text_tokenizer({
    type = "wordpiece",
    vocab_path = XXT_HOME_PATH.."/models/demo/vocab.txt",
    context_length = 52,
}))

local input_ids = assert(tokenizer:encode("hello", {
    output = "ort_tensor",
}))

Or convert an existing MLMultiArray directly into an ORT tensor:

local ort = require("onnxruntime")
local tensor = assert(multi_array:to_ort_tensor("int64"))