The simplest way to serve AI/ML models in production
- Write once, run anywhere: Package and test model code, weights, and dependencies with a model server that behaves the same in development and production.
- Fast developer loop: Implement your model with fast feedback from a live reload server, and skip Docker and Kubernetes configuration with Truss' done-for-you model serving environment.
- Support for all Python frameworks: From `transformers` and `diffusers` to `PyTorch` and `TensorFlow` to `XGBoost` and `sklearn`, Truss supports models created with any framework, even entirely custom models.
See Trusses for popular models including:
- 🦅 Falcon 40B
- 🧙 WizardLM
- 🎨 Stable Diffusion
- 🗣 Whisper
and dozens more examples.
Install Truss with:
```sh
pip install --upgrade truss
```
As a quick example, we'll package a text classification pipeline from the open-source `transformers` package.
To get started, create a Truss with the following terminal command:
```sh
truss init text-classification
```
This will create an empty Truss at `./text-classification`.
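Among the scaffolded files, two matter for this walkthrough (a simplified view; the full scaffold contains additional files):

```
text-classification/
├── config.yaml        # serving configuration, including dependencies
└── model/
    └── model.py       # model implementation: load() and predict()
```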
The model serving code goes in `./text-classification/model/model.py` in your newly created Truss.
```python
from typing import List

from transformers import pipeline


class Model:
    def __init__(self, **kwargs) -> None:
        self._model = None

    def load(self):
        self._model = pipeline("text-classification")

    def predict(self, model_input: str) -> List:
        return self._model(model_input)
```
There are two functions to implement:

- `load()` runs once when the model is spun up and is responsible for initializing `self._model`.
- `predict()` runs each time the model is invoked and handles the inference. It can use any JSON-serializable type as input and output.
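To sanity-check this lifecycle before serving, you can exercise the class directly. A minimal sketch, assuming you run it from inside `./text-classification` with `transformers` and `torch` installed locally; the example input and output are illustrative:

```python
# Hypothetical local smoke test: mirrors how the server uses the class --
# construct once, load once, then call predict() per request.
from model.model import Model

model = Model()
model.load()  # downloads and initializes the text-classification pipeline
print(model.predict("Truss is awesome!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.9998}]
```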
The pipeline model relies on Transformers and PyTorch. These dependencies must be specified in the Truss config.
In `./text-classification/config.yaml`, find the line `requirements`. Replace the empty list with:
```yaml
requirements:
  - torch==2.0.1
  - transformers==4.30.0
```
No other configuration needs to be changed.
You can run a Truss server locally with:

```sh
cd ./text-classification
truss run-image
```
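Once the server is up, you can send a test request. A minimal sketch, assuming the server listens on the default local port 8080 and exposes the standard `/v1/models/model:predict` route; check the `truss run-image` output for the actual URL:

```python
import requests

# Hypothetical request against the local Truss server (port and route
# are assumptions based on the default setup).
resp = requests.post(
    "http://localhost:8080/v1/models/model:predict",
    json="Truss is awesome!",
)
print(resp.json())
```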
Truss is backed by Baseten and built in collaboration with ML engineers worldwide. Special thanks to Stephan Auerhahn @ stability.ai and Daniel Sarfati @ Salad Technologies for their contributions.
We enthusiastically welcome contributions in accordance with our contributors' guide and code of conduct.