TorchServe
TorchServe is a flexible, easy-to-use tool for serving PyTorch models for inference at scale. The current version is 0.12.0; it is maintained as part of the PyTorch ecosystem, with releases roughly quarterly.
pip install torchserve torch-model-archiver
Common errors
error torchserve: error: unrecognized arguments: --model-store model_store
cause The `torchserve` command being invoked does not come from a complete, up-to-date TorchServe installation, so the launcher does not recognize the flag.
fix
Install torchserve and torch-model-archiver: pip install torchserve torch-model-archiver
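A quick way to check which executable is actually being picked up, a minimal sketch using only the Python standard library (the exact --version output may vary between releases):
import shutil
import subprocess

# Which torchserve executable (if any) is on PATH; None means the CLI is missing
path = shutil.which("torchserve")
print("torchserve CLI:", path)

if path:
    # Ask the launcher for its version; a stale or shadowed install shows up here
    subprocess.run([path, "--version"], check=False)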
error ModuleNotFoundError: No module named 'torchserve'
cause The torchserve package is not installed in the current environment, or the code imports the wrong module name: the pip package installs the `ts` module, not a `torchserve` module.
fix
Install torchserve (pip install torchserve) and import handler utilities from the `ts` package.
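A quick diagnostic sketch: handler utilities live in the `ts` package, so importing from there confirms whether the installation is usable in the current environment:
# The `ts` package is installed by `pip install torchserve`
try:
    from ts.torch_handler.base_handler import BaseHandler
    print("torchserve (ts package) is importable:", BaseHandler.__name__)
except ImportError as exc:
    print("torchserve is not installed in this environment:", exc)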
error Connection refused: POST http://localhost:8080/predictions/my_model
cause TorchServe is not running, or it is listening on a different port.
fix
Start TorchServe with torchserve --start --model-store model_store and check the log for the inference port (default 8080).
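Before retrying the request, it can help to hit TorchServe's health endpoint from Python; a minimal sketch using the `requests` library against the default inference port:
import requests

# TorchServe's inference API exposes a health check at GET /ping
try:
    resp = requests.get("http://localhost:8080/ping", timeout=5)
    print(resp.status_code, resp.text)  # expect 200 and a "Healthy" status
except requests.ConnectionError:
    print("TorchServe is not reachable on port 8080 -- is it running?")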
Warnings
breaking The default backend changed from synchronous to asynchronous in v0.9.0. If your code relies on an immediate callback or a synchronous response, enable it explicitly.
fix Use the `--sync` flag or set `sync: true` in config.properties.
gotcha The torchserve package provides the `torchserve` command but not the `torch-model-archiver` command; the latter comes from the separate torch-model-archiver package.
fix Install both packages: pip install torchserve torch-model-archiver
gotcha TorchServe uses a custom handler API. Calling a plain `nn.Module.forward()` directly will not work; the model must be wrapped in a handler class with `preprocess`, `inference`, and `postprocess` methods.
fix Implement a handler class inheriting from `ts.torch_handler.base_handler.BaseHandler` (see the sketch after this list).
deprecated The `--model-name` flag in `torch-model-archiver` is deprecated in favor of a positional argument.
fix Use `torch-model-archiver my_model --version 1.0 --model-file model.py ...` instead of `--model-name my_model`.
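A minimal handler sketch for the gotcha above. The class name and the byte-decoding step are illustrative, and only `preprocess`, `inference`, and `postprocess` come from the BaseHandler contract:
import torch
from ts.torch_handler.base_handler import BaseHandler

class MyModelHandler(BaseHandler):
    # BaseHandler.initialize() loads the serialized model into self.model,
    # so only the three request-processing hooks are overridden here.

    def preprocess(self, data):
        # `data` is a list of requests; each raw payload sits under "data" or "body".
        payloads = [row.get("data") or row.get("body") for row in data]
        # Illustrative decoding step: real handlers turn bytes (an image, JSON, ...)
        # into properly shaped tensors here before batching them.
        tensors = [torch.frombuffer(bytearray(p), dtype=torch.uint8).float() for p in payloads]
        return torch.stack(tensors)

    def inference(self, batch, *args, **kwargs):
        with torch.no_grad():
            return self.model(batch)

    def postprocess(self, outputs):
        # Return one JSON-serializable result per request in the batch.
        return outputs.argmax(dim=1).tolist()
In the archiver call, `--handler` can then point at this handler file instead of a built-in handler name such as image_classifier.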
Imports
- torchserve (installs the `ts` Python package)
from ts.torch_handler.base_handler import BaseHandler
Quickstart
# Create a model archive and write it into the model store
mkdir -p model_store
torch-model-archiver --model-name my_model --version 1.0 --model-file model.py --serialized-file model.pt --handler image_classifier --export-path model_store
# Start TorchServe
torchserve --start --model-store model_store --models my_model=my_model.mar
# Send inference request
curl -X POST http://localhost:8080/predictions/my_model -T input.jpg
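The same inference request from Python, a minimal sketch with the `requests` library, reusing the model name and input file from the curl example:
import requests

# POST the raw image bytes to the inference API (default port 8080)
with open("input.jpg", "rb") as f:
    resp = requests.post("http://localhost:8080/predictions/my_model", data=f)

print(resp.status_code)
print(resp.json())  # prediction returned by the handler's postprocess()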