Batch Transform
inference-server supports both SageMaker Real-Time Inference and Batch Transform.
The following additional plugin hooks may be implemented to define Batch Transform execution parameters. Implement one or more of them as required.
- batch_strategy() → inference_server.BatchStrategy
Return the default Batch Transform invocation strategy for this model
Default:
inference_server.BatchStrategy.MULTI_RECORD
If users do not specify a strategy when creating a Batch Transform job, the strategy returned by this hook will be used.
A model may support one or more invocation strategies depending on its implementation of the server hooks.
- max_concurrent_transforms() → int
Return the optimal maximum number of concurrent invocations for this model
Default:
1
If users do not specify a maximum number of concurrent transforms when creating a Batch Transform job, the value returned by this hook will be used.
- max_payload_in_mb() → int
Return the maximum allowed size in MB of a single record submitted by a Batch Transform job to the model
Default:
6 (MB)
The value of max_payload_in_mb() × max_concurrent_transforms() should be ≤ 100 MB.
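As a rough illustration of how the three hooks relate, the sketch below defines them as plain Python functions and checks the 100 MB constraint on their product. It is deliberately self-contained and makes several assumptions: the `BatchStrategy` enum is a local stand-in for `inference_server.BatchStrategy`, the `SINGLE_RECORD` member and both string values are assumed, the helper names `total_payload_in_mb` and `within_payload_limit` are hypothetical, and registering the functions as plugin hooks with inference-server is not shown.

```python
from enum import Enum


class BatchStrategy(Enum):
    """Local stand-in for inference_server.BatchStrategy (assumed members and values)."""

    MULTI_RECORD = "MultiRecord"
    SINGLE_RECORD = "SingleRecord"


def batch_strategy() -> BatchStrategy:
    # Example choice: one record per invocation instead of the MULTI_RECORD default.
    return BatchStrategy.SINGLE_RECORD


def max_concurrent_transforms() -> int:
    # Example value; the documented default is 1.
    return 4


def max_payload_in_mb() -> int:
    # Example value; the documented default is 6 (MB).
    return 20


# Upper bound on max_payload_in_mb() x max_concurrent_transforms(), in MB.
MAX_TOTAL_PAYLOAD_MB = 100


def total_payload_in_mb() -> int:
    """Hypothetical helper: the product that should stay within the limit."""
    return max_payload_in_mb() * max_concurrent_transforms()


def within_payload_limit() -> bool:
    """Hypothetical helper: True when the product is at most 100 MB."""
    return total_payload_in_mb() <= MAX_TOTAL_PAYLOAD_MB
```

With the example values, the product is 20 × 4 = 80 MB, which is within the limit. Raising `max_concurrent_transforms()` to 6 would give 120 MB and exceed it, so one of the two values would need to be reduced.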
See also
- SageMaker Inference Options
https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-model.html#deploy-model-options
- SageMaker Batch Transform execution parameters