Have you compared with FasterTransformer #264
Have you compared with https://github.com/NVIDIA/FasterTransformer and NVIDIA/FasterTransformer#506?
Answered by
zhuohan123
Jun 26, 2023
Thanks for the question. Yes, we have compared the performance with FasterTransformer in our research paper (will be released soon). We can achieve up to a 22x speedup compared to FasterTransformer. The main gain comes from PagedAttention and continuous batching implemented in vLLM.
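To give a rough intuition for why continuous batching helps, here is a toy sketch that only counts decoding steps. It is not vLLM's scheduler and does not model PagedAttention, prefill, or GPU memory. The function names, the request lengths, and the batch size are all made up for illustration.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Toy model: each fixed batch runs until its longest request finishes."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])
    return steps


def continuous_batching_steps(lengths, batch_size):
    """Toy model: a finished request frees its slot for the next waiting one."""
    waiting = deque(lengths)
    running = []  # tokens still to generate for each active request
    steps = 0
    while waiting or running:
        # Admit waiting requests into any free slots before the next step.
        while waiting and len(running) < batch_size:
            running.append(waiting.popleft())
        steps += 1  # one decoding step produces one token per active request
        # Drop requests that just produced their last token.
        running = [r - 1 for r in running if r > 1]
    return steps


# Hypothetical output lengths (in tokens) for eight requests.
lengths = [2, 8, 3, 8, 2, 2, 3, 3]
static_steps = static_batching_steps(lengths, batch_size=2)
continuous_steps = continuous_batching_steps(lengths, batch_size=2)
print(static_steps, continuous_steps)
```

In this toy setting, short requests no longer wait for the longest request in their batch, so the same work finishes in fewer steps. Real speedups depend on the workload and on memory management, which this sketch ignores.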
Answer selected by
zhuohan123