Switch the order of vLLM and TRT-LLM
krishung5 committed Aug 27, 2024
1 parent: 1d36963 · commit: 73d3923
Showing 2 changed files with 9 additions and 9 deletions.

README.md: 16 changes (8 additions & 8 deletions)

```diff
@@ -115,14 +115,6 @@ random forest models. The
 [fil_backend](https://github.com/triton-inference-server/fil_backend) repo
 contains the documentation and source for the backend.
 
-**vLLM**: The vLLM backend is designed to run
-[supported models](https://vllm.readthedocs.io/en/latest/models/supported_models.html)
-on a [vLLM engine](https://github.com/vllm-project/vllm/blob/main/vllm/engine/async_llm_engine.py).
-This backend depends on [python_backend](https://github.com/triton-inference-server/python_backend)
-to load and serve models. The
-[vllm_backend](https://github.com/triton-inference-server/vllm_backend) repo
-contains the documentation and source for the backend.
-
 **TensorRT-LLM**: The TensorRT-LLM backend allows you to serve
 [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) models with Triton Server.
 Check out the
@@ -131,6 +123,14 @@ for more information. The
 [tensorrtllm_backend](https://github.com/triton-inference-server/tensorrtllm_backend)
 repo contains the documentation and source for the backend.
 
+**vLLM**: The vLLM backend is designed to run
+[supported models](https://vllm.readthedocs.io/en/latest/models/supported_models.html)
+on a [vLLM engine](https://github.com/vllm-project/vllm/blob/main/vllm/engine/async_llm_engine.py).
+This backend depends on [python_backend](https://github.com/triton-inference-server/python_backend)
+to load and serve models. The
+[vllm_backend](https://github.com/triton-inference-server/vllm_backend) repo
+contains the documentation and source for the backend.
+
 **Important Note!** Not all the above backends are supported on every platform
 supported by Triton. Look at the
 [Backend-Platform Support Matrix](docs/backend_platform_support_matrix.md)
```
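
For context on the TensorRT-LLM backend described in this diff: a model served through it is a normal Triton model repository entry whose `config.pbtxt` names the backend and points at a compiled TensorRT-LLM engine. The snippet below is a minimal illustrative sketch, not a configuration taken from this commit; field names such as `gpt_model_path` and the placeholder paths should be checked against the tensorrtllm_backend repo before use.

```
# config.pbtxt (illustrative sketch; real configs in the
# tensorrtllm_backend repo require many more fields)
backend: "tensorrtllm"
max_batch_size: 8

# Decoupled transactions let Triton stream tokens back as they are generated.
model_transaction_policy {
  decoupled: true
}

# Placeholder path to the compiled TensorRT-LLM engine directory.
parameters: {
  key: "gpt_model_path"
  value: { string_value: "/models/tensorrt_llm/1" }
}
```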
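
And for the vLLM backend: since it runs on top of python_backend, serving a model amounts to placing a `model.json` of vLLM engine arguments in the model's version directory. The layout and values below are an illustrative sketch with placeholder names (e.g. `vllm_model`, `facebook/opt-125m`); the vllm_backend repo documents the authoritative format.

```
model_repository/
└── vllm_model/          # placeholder model name
    ├── config.pbtxt     # declares backend: "vllm"
    └── 1/
        └── model.json   # vLLM engine arguments, e.g.:
```

```json
{
  "model": "facebook/opt-125m",
  "disable_log_requests": true,
  "gpu_memory_utilization": 0.5
}
```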

docs/backend_platform_support_matrix.md: 2 changes (1 addition & 1 deletion)

```diff
@@ -53,8 +53,8 @@ each backend on different platforms.
 | Python[^1] | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU |
 | DALI | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | :heavy_check_mark: GPU[^2] <br/> :heavy_check_mark: CPU[^2] |
 | FIL | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | Unsupported |
-| vLLM | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | Unsupported |
 | TensorRT-LLM | :heavy_check_mark: GPU <br/> :x: CPU | :heavy_check_mark: GPU <br/> :x: CPU |
+| vLLM | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | Unsupported |
 
 
 ## Windows 10
```
