Methods on a general-purpose CoreML request

coreml_model_request_object is the inference object returned by coreml.new_model_request(...) / coreml.session(...).
It sends prepared input features into the model and returns results organized by output name.

Its responsibilities are intentionally narrow:

  • submit inference
  • read asynchronous results
  • inspect input/output signatures
  • inspect the current request configuration

It does not perform tokenization, image preprocessing, or business-level post-processing; those should be composed in Lua.
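For instance, business-level post-processing such as picking the top class from a logits table can live entirely in Lua, composed around the request object. A minimal sketch (the `argmax` helper and the sample logits are illustrative, not part of the API):

```lua
-- Pick the 1-based index of the largest value in a flat Lua table.
-- Plain Lua post-processing, composed around the request object.
local function argmax(t)
  local best_i, best_v = 1, t[1]
  for i = 2, #t do
    if t[i] > best_v then
      best_i, best_v = i, t[i]
    end
  end
  return best_i
end

-- e.g. an output converted with multi_array_output = "table"
local logits = { 0.1, 2.7, -0.3 }
print(argmax(logits))  -- 2
```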

These methods are available in versions released after 20260319.

Inference methods

:predict(inputs[, opts])

result, err = request:predict(inputs)

or

submitted, err = request:predict(inputs, {
  async = is_async,
  multi_array_output = "table",  -- or "MLMultiArray"
  uses_cpu_only = cpu_only_for_this_call,
})

Runs inference for a single sample.

  • inputs must be a table keyed by input name
  • In synchronous mode, the result table is returned directly
  • In asynchronous mode, only true is returned; use :is_done() and :results() later
  • multi_array_output controls whether MLMultiArray outputs stay as native tensor objects or are converted to Lua tables
  • uses_cpu_only affects only this call and does not change the object's default configuration
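Both modes can be sketched as follows. The model path and the `input_ids` input name are illustrative and must match your model's actual signature:

```lua
-- Hypothetical model path; adjust to your model's compiled .mlmodelc.
local req = assert(coreml.new_model_request(XXT_HOME_PATH.."/models/demo.mlmodelc"))

-- Synchronous mode: the result table comes back directly.
local out, err = req:predict({ input_ids = ids })
if not out then error(err) end

-- Asynchronous mode: only `true` comes back; poll with :is_done(),
-- then read the result with :results().
local submitted = assert(req:predict({ input_ids = ids }, { async = true }))
```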

:run(inputs[, opts])

run() is an alias of predict().

:predict_batch(batch_inputs[, opts])

batch_result, err = request:predict_batch({
  { input_ids = ids1 },
  { input_ids = ids2 },
}, {
  async = false,
  multi_array_output = "MLMultiArray",
})

Runs batch inference. Requires iOS 12+.

  • batch_inputs must be an array; each item is a single-sample input table keyed by input name
  • In synchronous mode, returns a batch result array; each element still follows the single-sample output structure
  • In asynchronous mode, returns true; call :results() later to retrieve the batch result
  • opts uses the same fields as predict()
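An asynchronous batch round-trip might look like this sketch (the `input_ids` name and the `text_features` output are illustrative):

```lua
-- Submit a batch of two samples asynchronously; only `true` comes back.
assert(req:predict_batch({
  { input_ids = ids1 },
  { input_ids = ids2 },
}, { async = true }))

-- Later, once :is_done() is true: each element of the batch result
-- follows the single-sample output structure.
local batch_out = assert(req:results())
print(batch_out[1].text_features)
print(batch_out[2].text_features)
```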

:run_batch(batch_inputs[, opts])

run_batch() is an alias of predict_batch().

:results([opts])

result, err = request:results()

or

result, err = request:results({
  multi_array_output = "table",  -- or "MLMultiArray"
})

Reads the result of the most recent asynchronous inference.

  • If the last async call came from predict(), this returns a single-sample result table
  • If the last async call came from predict_batch(), this returns a batch result array
  • multi_array_output defaults to the setting used by the latest inference call
  • If the async job is still running, it returns nil, "not yet"
  • If there is no successful completed result to read, it returns nil, "unknown"

:is_done()

done = request:is_done()

Checks whether the most recent asynchronous inference has completed. This only makes sense after predict(..., { async = true }) or predict_batch(..., { async = true }).
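The two methods compose naturally into a polling helper. A minimal sketch; the `sleep_fn` pause function and the 50 ms interval are illustrative choices, not part of the API:

```lua
-- Poll :is_done() until the async job finishes or a deadline passes,
-- then read the result with :results().
-- `sleep_fn` is any function that pauses for the given seconds.
local function wait_for_result(request, timeout_s, sleep_fn)
  local waited = 0
  while not request:is_done() do
    if waited >= timeout_s then
      return nil, "timeout"
    end
    sleep_fn(0.05)
    waited = waited + 0.05
  end
  return request:results()
end
```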

Runtime configuration and metadata

:metadata()

metadata = request:metadata()

Returns the model's built-in metadata.

:uses_cpu_only()

cpu_only = request:uses_cpu_only()

Returns the default CPU-only configuration saved when the request was created.

:compute_units()

compute_units = request:compute_units()

Returns the current compute_units string recorded by this request.

  • On iOS 12+, this returns the lowercased string recorded at creation time
  • Common results include "all", "cpu_only", "cpu", "cpu_and_gpu", "gpu", "cpu_and_neural_engine", "ane", and "neural_engine"
  • If the request was created with uses_cpu_only = true, this returns "cpu_only"
  • On iOS 11, it returns nil
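Since the return value is nil on iOS 11 and a lowercased string otherwise, callers should branch on both cases. A small sketch:

```lua
-- compute_units() is nil on iOS 11, a lowercased string on iOS 12+.
local units = req:compute_units()
if units == nil then
  print("iOS 11: compute units not recorded")
elseif units == "cpu_only" then
  print("request was created with uses_cpu_only = true")
else
  print("compute units: " .. units)
end
```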

Input and output signature methods

:input_count() / :output_count()

Return the number of input / output features.

:input_features() / :output_features()

Return name-keyed feature description tables.

The current implementation exposes a compact set of fields:

  • type
  • optional
  • For multi_array features only, shape and data_type
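Walking the signature therefore needs a guard before reading the multi_array-only fields. A sketch; the `"multi_array"` string value for the `type` field is an assumption inferred from the field naming above:

```lua
-- Iterate the name-keyed input signature; `shape` and `data_type`
-- exist only for multi_array features, so guard before reading them.
for name, desc in pairs(req:input_features()) do
  if desc.type == "multi_array" then
    print(name, desc.type, table.concat(desc.shape, "x"), desc.data_type)
  else
    print(name, desc.type, desc.optional)
  end
end
```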

:input_info(name_or_index) / :output_info(name_or_index)

Read a single input / output description by name or by 1-based index.

  • The returned fields mostly match input_features() / output_features()
  • An extra name field is included
  • Numeric indexing follows the same ordering used by input_names() / output_names()

:input_names() / :output_names()

Return stable ordered lists of input / output names.

  • Names are sorted lexicographically
  • The order of output_names() matches the numeric index order in synchronous inference results
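Because the name list is sorted and its order matches numeric indexing, `out[i]` and `out[names[i]]` should refer to the same output. A sketch (the `input_ids` input name is illustrative):

```lua
-- Numeric and name-based access point at the same outputs.
local names = req:output_names()
local out = assert(req:predict({ input_ids = ids }))
for i, name in ipairs(names) do
  assert(out[i] == out[name])
end
```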

:class_labels()

Returns the model-declared class labels. Requires iOS 14+.

Lifecycle and type checks

:close()

Destroys the underlying request state. The object should not be used again after closing.

:is_model_request() / :is_session()

Object-level type-check helpers. The two names are equivalent aliases.

Notes

  • predict() / run() always return a table
  • Results support both numeric indexing and output-name indexing
  • predict_batch() / run_batch() return a batch result array; each item still supports both numeric and output-name access
  • In the newer general-purpose request APIs, MLMultiArray outputs are kept as native tensor objects by default for further processing
  • To inspect values or maintain compatibility with older table-based scripts, pass multi_array_output = "table"

Result access examples:

out[1]
out.text_features
batch_out[1].text_features

Example

local req = assert(coreml.new_model_request(XXT_HOME_PATH.."/models/demo.mlmodelc"))

local out = assert(req:predict({
  input_ids = ids,
}, {
  multi_array_output = "MLMultiArray",
}))

print(out[1])
print(out.text_features)
print(req:output_names())