
ONNX Runtime Module

The onnxruntime module loads and runs ONNX models directly on device, and is suited to text, embedding, classification, detection, and general-purpose tensor-inference workflows.

Only supported on iOS 13 and later

Load the module

local ort = require("onnxruntime")

Unlike coreml, this is not a built-in global module; it is loaded on demand.

After require("onnxruntime") succeeds, it also injects two native copy bridges into the built-in coreml APIs:

  • coreml.multi_array_from_ort_tensor(tensor[, data_type])
  • multi_array:to_ort_tensor([data_type])

Both conversions are native-layer copies and do not go through Lua tables.

Module-level functions

Runtime and basic information

  • onnxruntime.version()
  • onnxruntime.providers()
  • onnxruntime.configure(opts)

Notes:

  • providers() returns the execution providers actually available in the current ORT runtime
  • configure() sets global runtime defaults and must be called before creating any session
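As a minimal sketch of the startup order these notes imply (the configure() option name shown is an assumption; the actual supported keys are not documented here):

```lua
local ort = require("onnxruntime")

print(ort.version())

-- List the execution providers the current runtime actually reports.
for _, provider in ipairs(ort.providers()) do
  print(provider)
end

-- configure() sets global defaults and must run before any session exists.
ort.configure({
  log_level = "warning",  -- hypothetical key, for illustration only
})
```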

Tensors, images, and numeric helpers

  • onnxruntime.tensor(type, shape[, data])
  • onnxruntime.tensor_from_bytes(type, shape, bytes)
  • onnxruntime.tensor_from_cv_mat(mat[, opts])
  • onnxruntime.tensor_from_quad(mat, quad[, opts])
  • onnxruntime.tensor_from_quads(mat, quads[, opts])
  • onnxruntime.tensor_from_image(image[, opts])
  • onnxruntime.tensor_from_images(images[, opts])
  • onnxruntime.image_from_tensor(tensor[, opts])
  • onnxruntime.clamp(tensor, min, max)
  • onnxruntime.sigmoid(tensor)
  • onnxruntime.exp(tensor)
  • onnxruntime.where(condition, x, y)
  • onnxruntime.matmul(lhs, rhs)
  • onnxruntime.concat(tensors[, axis])
  • onnxruntime.stack(tensors[, axis])

Notes:

  • clamp(), sigmoid(), exp(), and matmul() are module-level equivalents of the same-named tensor methods, with the tensor passed as the first argument
  • where() supports mixing scalars, booleans, and tensors, and applies broadcasting rules
  • Image preprocessing, OpenCV bridges, and image_from_tensor() details are documented in the Tensor Module
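A short sketch of the tensor helpers above, assuming row-major flat data tables (the exact data-layout convention is not spelled out in this section):

```lua
local ort = require("onnxruntime")

-- A 2x3 float tensor from a flat Lua array.
local a = ort.tensor("float32", {2, 3}, {1, 2, 3, 4, 5, 6})

local s = ort.sigmoid(a)        -- same as a:sigmoid()
local c = ort.clamp(a, 0, 4)    -- values limited to [0, 4]

-- matmul: (2x3) x (3x2) -> (2x2)
local b = ort.tensor("float32", {3, 2}, {1, 0, 0, 1, 1, 1})
local m = ort.matmul(a, b)

-- where() mixes a bool condition tensor with scalar branches,
-- broadcasting the scalars over the condition's shape.
local cond = ort.tensor("bool", {2, 3},
                        {true, false, true, false, true, false})
local masked = ort.where(cond, 1.0, 0.0)
```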

Detection, decoding, and post-processing helpers

  • onnxruntime.nms(boxes, scores[, opts])
  • onnxruntime.box_points(rotated_boxes)
  • onnxruntime.xywh_to_xyxy(boxes)
  • onnxruntime.xyxy_to_xywh(boxes)
  • onnxruntime.rotated_iou(lhs_box, rhs_box)
  • onnxruntime.rotated_nms(boxes, scores[, opts])
  • onnxruntime.create_decoder(schema)
  • onnxruntime.decode_yolo(output[, opts])
  • onnxruntime.decode_yolo_obb(output[, opts])
  • onnxruntime.decode_matrix_candidates(output, schema[, opts])
  • onnxruntime.decode_dense_detection(output, opts)
  • onnxruntime.records_from_boxes(boxes, scores, class_ids[, keep_indices])
  • onnxruntime.obb_records_from_rows(rows, scores, class_ids[, angles[, keep_indices[, opts]]])
  • onnxruntime.points_to_records(points[, opts])
  • onnxruntime.threshold_masks(masks, threshold)
  • onnxruntime.crop_masks_by_boxes(masks, boxes)
  • onnxruntime.resize_masks(masks, width, height[, opts])
  • onnxruntime.mask_iou(lhs_mask, rhs_mask)
  • onnxruntime.mask_to_polygon(mask[, opts])
  • onnxruntime.proto_masks(proto, coeffs, boxes, image_width, image_height[, opts])
  • onnxruntime.project_masks(proto, coeffs, boxes, image_width, image_height[, opts])
  • onnxruntime.db_postprocess(score_map[, opts])
  • onnxruntime.tracker([opts])
  • onnxruntime.reshape_keypoints(points[, keypoint_count[, keypoint_dim|opts]])
  • onnxruntime.scale_boxes(boxes, transform)
  • onnxruntime.clip_boxes(boxes, clip_width, clip_height)
  • onnxruntime.scale_points(points, transform[, opts])
  • onnxruntime.scale_keypoints(points, transform[, opts])
  • onnxruntime.clip_keypoints(points, clip_width, clip_height[, opts])
  • onnxruntime.ctc_greedy_decode(logits[, opts])
  • onnxruntime.sample_logits(logits[, opts])

Notes:

  • tensor_from_quad() / tensor_from_quads() need require("image.cv") to have been called first; they are useful when OCR pipelines crop quadrilateral regions directly into tensors
  • box_points() takes a rotated-box tensor shaped [5], [1, 5], or [N, 5]; it does not take five separate scalar arguments
  • create_decoder() returns a decoder object with :decode(), :task(), and :schema()
  • tracker() returns a tracker object with :update(), :reset(), :state(), and :close()
  • records_from_boxes(), obb_records_from_rows(), and points_to_records() reshape tensor outputs into Lua-friendly record tables
  • proto_masks() and project_masks() currently share the same implementation; project_masks() is an alias
  • mask_iou() computes the intersection-over-union between two masks directly, and also accepts a third opts argument with compare_size = true, or explicit width / height
  • db_postprocess() is intended for DB / DBNet-style text-detection post-processing; each returned detection contains score, points, and box
  • decode_dense_detection() requires opts.strides, and it must be a non-empty array of positive integers; it also requires decode_width and decode_height, and currently supports only box_encoding = "grid_center_log_wh"
  • ctc_greedy_decode() supports blank_index, merge_repeated, apply_softmax, return_probabilities, and charset
  • ctc_greedy_decode() always returns indices; text appears only when charset is provided; confidence appears only when apply_softmax or return_probabilities is enabled; probabilities and probability_confidence appear only when return_probabilities is enabled
  • nms() and rotated_nms() return int64 tensors with 1-based indices
  • sample_logits() supports argmax, temperature, top_k, top_p, min_p, and seed
  • sample_logits() returns a scalar index for 1D logits, and an int64 tensor for batched logits
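Putting a few of these helpers together, a typical detection post-processing pass might look like the sketch below. The decode_yolo() result fields (boxes, scores, class_ids) and the option names are assumptions for illustration; `output` stands for a raw model-output tensor obtained from a session run:

```lua
local ort = require("onnxruntime")

-- Decode raw YOLO head output into candidate detections.
-- (conf_threshold and the returned field names are assumed here.)
local det = ort.decode_yolo(output, { conf_threshold = 0.25 })

-- nms() returns an int64 tensor of 1-based keep indices.
local keep = ort.nms(det.boxes, det.scores, { iou_threshold = 0.45 })

-- Reshape the surviving tensors into Lua-friendly record tables.
local records = ort.records_from_boxes(det.boxes, det.scores,
                                       det.class_ids, keep)
for _, r in ipairs(records) do
  print(r.class_id, r.score)
end
```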

Structured values

  • onnxruntime.value(value)
  • onnxruntime.optional(value, type_info)
  • onnxruntime.sequence(items)
  • onnxruntime.map(key_type, value_type, pairs)
  • onnxruntime.sparse_tensor(type, dense_shape, indices, values)
  • onnxruntime.sparse_tensor_from_dense(tensor)

These are useful when model inputs or outputs are not plain tensors — for example optionals, sequences, maps, or sparse tensors.

The current behavior can be summarized like this:

  • onnxruntime.value(x): if x is already an ORT tensor / value / sequence / map / sparse tensor, it is returned as-is; if x is a Lua table, it becomes a sequence; otherwise the scalar is wrapped as a tensor
  • onnxruntime.optional(value, type_info): the second argument is required; type_info may be a string or a type-info table such as the result of session:input_info(...) / output_info(...); an empty optional is written as onnxruntime.optional(nil, type_info)
  • onnxruntime.map(key_type, value_type, pairs): at the moment, key_type only supports "string" or "int64"
  • onnxruntime.sparse_tensor(type, dense_shape, indices, values): currently only numeric / bool sparse tensors are supported, built in COO form; indices may be flat or nested
  • onnxruntime.sparse_tensor_from_dense(tensor): currently does not support string tensors

Common object methods:

  • value:type() / value:has_value() / value:get()
  • sequence:length() / sequence:get(i) / sequence:items()
  • map:get(key) / map:set(key, value) / map:keys() / map:pairs()
  • sparse_tensor:dense_shape() / sparse_tensor:values() / sparse_tensor:indices() / sparse_tensor:format() / sparse_tensor:to_dense()
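A small sketch of these constructors and methods; the pairs-table shape passed to map() and the string form of type_info are assumptions:

```lua
local ort = require("onnxruntime")

-- Scalars are wrapped as tensors; Lua tables become sequences.
local v = ort.value(3.14)
print(v:type())

local seq = ort.sequence({
  ort.tensor("float32", {2}, {1, 2}),
  ort.tensor("float32", {2}, {3, 4}),
})
print(seq:length())            -- number of items in the sequence

-- key_type currently must be "string" or "int64".
local m = ort.map("string", "float32", { cat = 0.9, dog = 0.1 })
print(m:get("cat"))

-- An empty optional requires explicit type info
-- (exact string format is an assumption here).
local opt = ort.optional(nil, "float32")
print(opt:has_value())         -- an empty optional reports no value
```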

Sessions and inference

  • onnxruntime.session(model_path[, opts])
  • onnxruntime.session_from_bytes(model_bytes[, opts])
  • onnxruntime.run_options([opts])
  • onnxruntime.load_custom_op_library(path)

Session objects handle model loading, input/output inspection, inference execution, and IOBinding. See the Session Module.
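A minimal session sketch, assuming a session:run() call that accepts a name-keyed input table (the run() signature and the input/output names are assumptions; see the Session Module for the actual API):

```lua
local ort = require("onnxruntime")

local session = assert(ort.session(
  XXT_HOME_PATH.."/models/demo/model.onnx",
  { providers = { "coreml", "cpu" } }
))

-- Build a dummy input; name and shape are model-specific placeholders.
local input = ort.tensor("float32", {1, 3, 224, 224})
local outputs = assert(session:run({ input = input }))
```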

Supported element types

The tensor APIs currently support these type names:

  • "float32" / "float"
  • "float16"
  • "bfloat16"
  • "uint8"
  • "uint16"
  • "uint32"
  • "uint64"
  • "int8"
  • "int16"
  • "int32"
  • "int64"
  • "double" / "float64"
  • "bool"
  • "string"

Notes:

  • tensor_from_bytes() and copy_from_bytes() only support numeric and bool tensors
  • bytes() is not available for string tensors
  • tensor:to("string") currently only supports string -> string
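A brief illustration of the type-name aliases and the raw-byte restrictions noted above; the tensor:bytes() accessor is inferred from the notes rather than documented in this section:

```lua
local ort = require("onnxruntime")

-- "float" and "float32" name the same element type.
local t = ort.tensor("float32", {4}, {1, 2, 3, 4})

-- Raw-byte round trips work for numeric and bool tensors only.
local raw = t:bytes()
local t2 = ort.tensor_from_bytes("float32", {4}, raw)

-- String tensors are constructed from Lua strings and cannot
-- go through bytes() / tensor_from_bytes().
local s = ort.tensor("string", {2}, {"a", "b"})
```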

Provider notes

onnxruntime.providers() returns what the runtime reports as available, but the provider strings currently recognized by the session option parser are:

  • "cpu"
  • "coreml"

Notes:

  • provider / providers also accept aliases such as CPUExecutionProvider and CoreMLExecutionProvider; they are normalized internally to "cpu" and "coreml"
  • If no provider is specified, or the provider list is empty, session creation appends the CPU provider automatically
  • If the provider list contains "coreml" and fallback_to_cpu = true, the implementation may also append CPU as a fallback path
  • If you pass providers = {"coreml", "cpu"}, the session tries CoreML first and CPU second
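For example, the alias normalization and CPU fallback described above combine like this (`model_path` is a placeholder for a real model file):

```lua
local ort = require("onnxruntime")

-- "CoreMLExecutionProvider" is normalized to "coreml" internally.
-- With fallback_to_cpu = true, CPU may be appended as a fallback
-- path behind CoreML.
local session = assert(ort.session(model_path, {
  providers = { "CoreMLExecutionProvider" },
  fallback_to_cpu = true,
}))
```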

Working with CoreML

If you want to reuse the coreml tokenizer or an MLMultiArray preprocessing pipeline, a common pattern is:

local ort = require("onnxruntime")

local tokenizer = assert(coreml.new_text_tokenizer({
    type = "wordpiece",
    vocab_path = XXT_HOME_PATH.."/models/demo/vocab.txt",
    context_length = 52,
}))

local input_ids = assert(tokenizer:encode("hello", {
    output = "ort_tensor",
}))

Or convert an existing MLMultiArray directly into an ORT tensor:

local ort = require("onnxruntime")
local tensor = assert(multi_array:to_ort_tensor("int64"))