inference_server
Pluggable Python HTTP web service (WSGI) for real-time AI/ML model inference compatible with Amazon SageMaker
- class BatchStrategy(value)
Bases: Enum
Enumeration of Batch Transform invocation strategies
Specifies the number of records to include in a mini-batch for an HTTP inference request. A record is a single unit of input data that inference can be made on. For example, a single line in a CSV file is a record.
- MULTI_RECORD = 'MultiRecord'
Batch Transform job to invoke the model with multiple records per request
- SINGLE_RECORD = 'SingleRecord'
Batch Transform job to invoke the model with a single record per request
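The difference between the two strategies can be sketched as follows. This is an illustration only: the `BatchStrategy` enum below is a local stand-in that mirrors the documented members rather than an import from `inference_server`, and the `split_into_requests` helper and its `max_records` parameter are hypothetical names introduced for this example. The actual mini-batch size is decided by the Batch Transform job, not by this code.

```python
import enum
from typing import Iterable, Iterator, List


class BatchStrategy(enum.Enum):
    """Local stand-in mirroring the documented enumeration members."""

    MULTI_RECORD = "MultiRecord"
    SINGLE_RECORD = "SingleRecord"


def split_into_requests(
    records: Iterable[str],
    strategy: BatchStrategy,
    max_records: int = 2,  # hypothetical cap on records per mini-batch
) -> Iterator[str]:
    """Yield one request body per HTTP inference request.

    Each record is one CSV line. SINGLE_RECORD sends every line on its own;
    MULTI_RECORD joins up to ``max_records`` lines into one body.
    """
    if strategy is BatchStrategy.SINGLE_RECORD:
        for record in records:
            yield record
        return

    batch: List[str] = []
    for record in records:
        batch.append(record)
        if len(batch) == max_records:
            yield "\n".join(batch)
            batch = []
    if batch:  # flush the final, possibly smaller, mini-batch
        yield "\n".join(batch)


rows = ["1.0,2.0", "3.0,4.0", "5.0,6.0"]
single = list(split_into_requests(rows, BatchStrategy.SINGLE_RECORD))
multi = list(split_into_requests(rows, BatchStrategy.MULTI_RECORD))
print(len(single), len(multi))
```

With three CSV lines, the single-record strategy produces three request bodies, while the multi-record strategy with a cap of two produces two bodies, the first containing two lines. The string values of the enum members can also be used to look up a member, for example `BatchStrategy("MultiRecord")`.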
- class MIMEAccept(values: Accept | Iterable[tuple[str, float]] | None = ())
Bases: Accept
Like Accept but with special methods and behavior for mimetypes.