Batch Transform

inference-server supports both SageMaker Real-Time Inference and Batch Transform.

The following additional plugin hooks may be implemented to define Batch Transform execution parameters. Implement one or more of them as required.

batch_strategy() → inference_server.BatchStrategy

Return the default Batch Transform invocation strategy for this model.

Default: inference_server.BatchStrategy.MULTI_RECORD

If users do not specify a strategy when creating a Batch Transform job, the strategy returned by this hook will be used.

A model may support one or more invocation strategies, depending on how it implements the server hooks.
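As a minimal sketch, a plugin whose model handles only one record per request might override the default as shown here. The `BatchStrategy` enum in this snippet is a local stand-in so that the example runs on its own; a real plugin would use `inference_server.BatchStrategy` and register the function through the package's plugin hook mechanism, which is not shown. The `SINGLE_RECORD` member is an assumption that mirrors SageMaker's `SingleRecord` strategy.

```python
import enum


class BatchStrategy(enum.Enum):
    """Local stand-in for inference_server.BatchStrategy (members assumed)."""

    MULTI_RECORD = "MultiRecord"
    SINGLE_RECORD = "SingleRecord"  # assumption: not listed in the text above


def batch_strategy() -> BatchStrategy:
    """Return the default strategy for a model that takes one record per request."""
    return BatchStrategy.SINGLE_RECORD
```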

max_concurrent_transforms() → int

Return the optimal maximum number of concurrent invocations for this model.

Default: 1

If users do not specify a maximum number of concurrent transforms when creating a Batch Transform job, the value returned by this hook will be used.
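The fallback behaviour described above can be illustrated with a short sketch. The `effective_value` helper and the return value of 4 are hypothetical and exist only for illustration; they are not part of inference-server or SageMaker.

```python
from typing import Callable, Optional


def max_concurrent_transforms() -> int:
    """Hook sketch: this hypothetical model handles four invocations at once."""
    return 4


def effective_value(user_value: Optional[int], hook: Callable[[], int]) -> int:
    """Hypothetical helper: prefer the job's own setting, else use the hook."""
    return user_value if user_value is not None else hook()


# No value given on the job: the hook's value applies.
print(effective_value(None, max_concurrent_transforms))
# Value given on the job: it takes precedence over the hook.
print(effective_value(8, max_concurrent_transforms))
```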

max_payload_in_mb() → int

Return the maximum allowed size, in MB, of a single record submitted by a Batch Transform job to the model.

Default: 6 (MB)

The value of max_payload_in_mb() × max_concurrent_transforms() should be ≤ 100 MB.
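A plugin author may want to check this limit before deploying. The following is a minimal sketch; `check_batch_limits` and `MAX_TOTAL_PAYLOAD_MB` are hypothetical names introduced here, not part of the inference-server API.

```python
MAX_TOTAL_PAYLOAD_MB = 100  # limit stated above


def check_batch_limits(max_payload_in_mb: int, max_concurrent_transforms: int) -> int:
    """Return the combined payload in MB, or raise if it exceeds the limit."""
    total = max_payload_in_mb * max_concurrent_transforms
    if total > MAX_TOTAL_PAYLOAD_MB:
        raise ValueError(
            f"{max_payload_in_mb} MB x {max_concurrent_transforms} transforms "
            f"= {total} MB, which exceeds {MAX_TOTAL_PAYLOAD_MB} MB"
        )
    return total


# The defaults (6 MB payload, 1 concurrent transform) are well within the limit.
print(check_batch_limits(6, 1))  # 6
```

With the default payload size of 6 MB, up to 16 concurrent transforms (96 MB) stay within the limit, while 17 (102 MB) would not.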