Batch Transform
inference-server supports both SageMaker Real-Time Inference and Batch Transform.
The following additional plugin hooks may be implemented to define Batch Transform execution parameters. Implement one or more of them as required.
- batch_strategy() → inference_server.BatchStrategy
Return the default Batch Transform invocation strategy for this model.
Default:
inference_server.BatchStrategy.MULTI_RECORD
If users do not specify a strategy when creating a Batch Transform job, the strategy returned by this hook will be used.
A model may support one or more invocation strategies, depending on its implementation of the server hooks.
- max_concurrent_transforms() → int
Return the optimal maximum number of concurrent invocations for this model.
Default:
1
If users do not specify a maximum number of concurrent transforms when creating a Batch Transform job, the value returned by this hook will be used.
- max_payload_in_mb() → int
Return the maximum allowed size, in MB, of a single record submitted by a Batch Transform job to the model.
Default:
6 (MB)
The value of max_payload_in_mb() × max_concurrent_transforms() should be ≤ 100 MB.
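The three hooks above could be sketched as follows. This is a minimal, self-contained illustration rather than the library's own code: the BatchStrategy enum is a local stand-in for inference_server.BatchStrategy (its SINGLE_RECORD member and the string values are assumptions), the hook functions are left unregistered because plugin registration is not covered in this section, and check_batch_limits is a hypothetical helper that only restates the 100 MB rule.

```python
import enum


class BatchStrategy(enum.Enum):
    """Local stand-in for inference_server.BatchStrategy (assumption)."""

    MULTI_RECORD = "MultiRecord"
    SINGLE_RECORD = "SingleRecord"  # assumed member, not shown in this section


# In a real plugin these functions would be registered as inference-server
# plugin hooks. Registration is omitted here to keep the sketch runnable
# without the library installed.
def batch_strategy() -> BatchStrategy:
    """Default invocation strategy used when the job does not specify one."""
    return BatchStrategy.SINGLE_RECORD


def max_concurrent_transforms() -> int:
    """Preferred maximum number of concurrent invocations for this model."""
    return 4


def max_payload_in_mb() -> int:
    """Maximum size in MB of a single record sent to the model."""
    return 20


def check_batch_limits(payload_mb: int, concurrency: int, limit_mb: int = 100) -> int:
    """Hypothetical helper: enforce payload x concurrency <= limit_mb."""
    total = payload_mb * concurrency
    if total > limit_mb:
        raise ValueError(
            f"max_payload_in_mb() x max_concurrent_transforms() = {total} MB "
            f"exceeds the {limit_mb} MB limit"
        )
    return total


# 20 MB x 4 concurrent transforms stays within the 100 MB limit.
total_mb = check_batch_limits(max_payload_in_mb(), max_concurrent_transforms())
print(f"Combined payload budget: {total_mb} MB")
```

Running a check like this in a unit test is one way to catch a hook combination that exceeds the limit before a Batch Transform job is created.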
See also
- SageMaker Inference Options
https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-model.html#deploy-model-options
- SageMaker Batch Transform execution parameters