Can vLLM split Llama-2-70B across 16 GPUs?

Llama-2-70B has 8 key-value heads. Can it be split onto 16 GPUs?

Replies: 1 comment

- I also have the same question. #930
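The constraint behind the question is arithmetic: Llama-2-70B uses grouped-query attention with 64 query heads but only 8 key-value heads, so a tensor-parallel size of 16 does not evenly divide the KV heads (8 / 16 = 0.5 heads per GPU), and depending on the vLLM version the divisibility check may reject it unless KV heads are replicated across GPUs. Below is a minimal sketch of an offline-inference launch; the model ID and the divisibility behavior of the installed vLLM version are assumptions, not something this thread confirms. TP=8 is shown because it matches the 8 KV heads exactly.

```python
# Hedged sketch: tensor-parallel inference with vLLM.
# Assumptions (not confirmed by this thread): the Hugging Face model ID
# below and how the installed vLLM version handles KV-head divisibility.
from vllm import LLM, SamplingParams

# Llama-2-70B attention layout (from the model config):
#   64 query heads, 8 key-value heads (grouped-query attention).
# A tensor-parallel size that divides 8 (e.g. 2, 4, 8) shards the KV
# heads evenly; TP=16 would leave 0.5 KV heads per GPU, so it can only
# work if vLLM replicates each KV head across a pair of GPUs.
llm = LLM(
    model="meta-llama/Llama-2-70b-hf",
    tensor_parallel_size=8,  # matches the 8 KV heads exactly
)

outputs = llm.generate(
    ["What is tensor parallelism?"],
    SamplingParams(max_tokens=64, temperature=0.0),
)
print(outputs[0].outputs[0].text)
```

If all 16 GPUs must be used, one workaround that sidesteps the KV-head question entirely is to run two independent TP=8 replicas behind a load balancer rather than a single TP=16 instance.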