Commit e4a711c: Add TRT-LLM backend to the doc
krishung5 committed Aug 7, 2024 (1 parent: 30fa78a)
Showing 1 changed file with 8 additions and 0 deletions: README.md
@@ -123,6 +123,14 @@ to load and serve models. The
[vllm_backend](https://github.com/triton-inference-server/vllm_backend) repo
contains the documentation and source for the backend.

**TensorRT-LLM**: The TensorRT-LLM backend allows you to serve
[TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) models with Triton Inference Server.
Check out the
[Triton TRT-LLM user guide](https://github.com/triton-inference-server/server/blob/main/docs/getting_started/trtllm_user_guide.md)
for more information. The
[tensorrtllm_backend](https://github.com/triton-inference-server/tensorrtllm_backend)
repo contains the documentation and source for the backend.
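As a rough sketch of what using this backend looks like, a model in the Triton model repository selects it with `backend: "tensorrtllm"` in its `config.pbtxt`. The snippet below is a minimal, hypothetical configuration (the model name, batch size, and engine path are placeholders; see the tensorrtllm_backend repo for the authoritative templates and the full parameter list):

```
# Hypothetical config.pbtxt for a TRT-LLM model (values are illustrative)
name: "tensorrt_llm"
backend: "tensorrtllm"
max_batch_size: 64

# Decoupled transaction policy enables streaming token-by-token responses
model_transaction_policy {
  decoupled: true
}

# Path to the prebuilt TensorRT-LLM engine directory (placeholder)
parameters: {
  key: "gpt_model_path"
  value: { string_value: "/path/to/engines" }
}
```

In practice the backend's sample model repository pairs a configuration like this with preprocessing and postprocessing models (for tokenization) in an ensemble; the user guide linked above walks through building the engines and launching the server.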

**Important Note!** Not all the above backends are supported on every platform
supported by Triton. Look at the
[Backend-Platform Support Matrix](docs/backend_platform_support_matrix.md)
