Running a model on multiple GPUs? How to do it? Does vLLM allow it easily? #6500
martinenkoEduard
started this conversation in
General
Replies: 1 comment
-
You can pass this flag: --tensor-parallel-size 2
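For context, the same setting is exposed in vLLM's Python API as `tensor_parallel_size`. Below is a minimal sketch of offline inference sharded across two GPUs; the model name is only a placeholder, so substitute whatever model you actually want to run. When launching the OpenAI-compatible server instead, the equivalent is the --tensor-parallel-size 2 command-line flag mentioned above.

```python
# Minimal sketch: shard one model across two GPUs with tensor parallelism.
# The model name is a placeholder; use any model you have access to.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # placeholder model
    tensor_parallel_size=2,                      # split weights across 2 GPUs
)

params = SamplingParams(max_tokens=32)
outputs = llm.generate(["Hello, my name is"], params)
print(outputs[0].outputs[0].text)
```

Tensor parallelism partitions each layer's weights evenly across the GPUs, so every card holds a shard of the model and they communicate on each forward pass; in practice, identical (or at least equal-memory) GPUs are strongly recommended, because the smallest card becomes the limit.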
-
Running a model on multiple GPUs? How do I do it?
Can you show a simple example?
What are the restrictions? Do the GPUs have to be identical,
or is it possible, for instance, to have
one RTX 3070 and one RTX 3080? What about memory sharing?
Do Mistral and Llama support these features?
Do I need to install special drivers or a special PyTorch version?
Also, can you share a rig configuration with multiple GPUs for local LLM deployment?