Switch the order of vLLM and TRT-LLM
krishung5 committed Aug 27, 2024
1 parent: 1d36963 · commit: 73d3923
Showing 2 changed files with 9 additions and 9 deletions.

README.md: 16 changes (8 additions & 8 deletions)

```diff
@@ -115,14 +115,6 @@ random forest models. The
 [fil_backend](https://github.com/triton-inference-server/fil_backend) repo
 contains the documentation and source for the backend.
 
-**vLLM**: The vLLM backend is designed to run
-[supported models](https://vllm.readthedocs.io/en/latest/models/supported_models.html)
-on a [vLLM engine](https://github.com/vllm-project/vllm/blob/main/vllm/engine/async_llm_engine.py).
-This backend depends on [python_backend](https://github.com/triton-inference-server/python_backend)
-to load and serve models. The
-[vllm_backend](https://github.com/triton-inference-server/vllm_backend) repo
-contains the documentation and source for the backend.
-
 **TensorRT-LLM**: The TensorRT-LLM backend allows you to serve
 [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) models with Triton Server.
 Check out the
@@ -131,6 +123,14 @@ for more information. The
 [tensorrtllm_backend](https://github.com/triton-inference-server/tensorrtllm_backend)
 repo contains the documentation and source for the backend.
 
+**vLLM**: The vLLM backend is designed to run
+[supported models](https://vllm.readthedocs.io/en/latest/models/supported_models.html)
+on a [vLLM engine](https://github.com/vllm-project/vllm/blob/main/vllm/engine/async_llm_engine.py).
+This backend depends on [python_backend](https://github.com/triton-inference-server/python_backend)
+to load and serve models. The
+[vllm_backend](https://github.com/triton-inference-server/vllm_backend) repo
+contains the documentation and source for the backend.
+
 **Important Note!** Not all the above backends are supported on every platform
 supported by Triton. Look at the
 [Backend-Platform Support Matrix](docs/backend_platform_support_matrix.md)
```
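
For context on the TensorRT-LLM backend described in this diff: a model served through it is a normal Triton model repository entry whose `config.pbtxt` names the backend and points at a compiled TensorRT-LLM engine. The snippet below is a minimal illustrative sketch, not a configuration taken from this commit; field names such as `gpt_model_path` and the placeholder paths should be checked against the tensorrtllm_backend repo before use.

```
# config.pbtxt (illustrative sketch; real configs in the
# tensorrtllm_backend repo require many more fields)
backend: "tensorrtllm"
max_batch_size: 8

# Decoupled transactions let Triton stream tokens back as they are generated.
model_transaction_policy {
  decoupled: true
}

# Placeholder path to the compiled TensorRT-LLM engine directory.
parameters: {
  key: "gpt_model_path"
  value: { string_value: "/models/tensorrt_llm/1" }
}
```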
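
And for the vLLM backend: since it runs on top of python_backend, serving a model amounts to placing a `model.json` of vLLM engine arguments in the model's version directory. The layout and values below are an illustrative sketch with placeholder names (e.g. `vllm_model`, `facebook/opt-125m`); the vllm_backend repo documents the authoritative format.

```
model_repository/
└── vllm_model/          # placeholder model name
    ├── config.pbtxt     # declares backend: "vllm"
    └── 1/
        └── model.json   # vLLM engine arguments, e.g.:
```

```json
{
  "model": "facebook/opt-125m",
  "disable_log_requests": true,
  "gpu_memory_utilization": 0.5
}
```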

docs/backend_platform_support_matrix.md: 2 changes (1 addition & 1 deletion)

```diff
@@ -53,8 +53,8 @@ each backend on different platforms.
 | Python[^1] | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU |
 | DALI | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | :heavy_check_mark: GPU[^2] <br/> :heavy_check_mark: CPU[^2] |
 | FIL | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | Unsupported |
-| vLLM | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | Unsupported |
 | TensorRT-LLM | :heavy_check_mark: GPU <br/> :x: CPU | :heavy_check_mark: GPU <br/> :x: CPU |
+| vLLM | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | Unsupported |
 
 
 ## Windows 10
```
