Estimate GPU Type or Total VRAM Required using HF Repo ID #8084
Unanswered · stikkireddy asked this question in Q&A · Replies: 0 comments
Hey team, I'm trying to estimate GPU usage given a repo ID from Hugging Face, to determine whether to use an A10, 4xA10, 8xA10, A100, 2xA100, 4xA100, or 8xA100. No consumer cards.
I am able to estimate the model weights using this:
But I am trying to understand, given that I also have the
Is there a way to estimate the memory required, given that I know the desired context length (ideally the default from config.json) and the max number of output tokens, assuming I use the default settings provided in the args? My estimate is missing the KV cache, intermediate activations, and ~1-2 GB of misc overhead. Any help/guidance?
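For what it's worth, a rough back-of-envelope can be sketched from the fields a typical config.json exposes (`num_hidden_layers`, `hidden_size`, `num_attention_heads`, `num_key_value_heads`). This is a minimal sketch, not a definitive calculator: the function names are my own, the Llama-2-7B-style numbers below are assumed for illustration, and activations/fragmentation are folded into a flat overhead term rather than modeled.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bytes_per_elem=2, batch_size=1):
    """KV cache: one K and one V tensor per layer, per cached token."""
    return 2 * num_layers * seq_len * num_kv_heads * head_dim * bytes_per_elem * batch_size

def estimate_vram_gib(param_count, num_layers, num_kv_heads, head_dim,
                      context_len, bytes_per_param=2, overhead_gib=2.0):
    """Rough total: weights + KV cache + flat misc overhead (activations not modeled)."""
    weights = param_count * bytes_per_param
    kv = kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_len,
                        bytes_per_elem=bytes_per_param)
    return (weights + kv) / 2**30 + overhead_gib

# Illustrative Llama-2-7B-style values (assumptions, not read from a real config):
# num_hidden_layers=32, num_key_value_heads=32,
# head_dim = hidden_size // num_attention_heads = 4096 // 32 = 128
total = estimate_vram_gib(param_count=7_000_000_000, num_layers=32,
                          num_kv_heads=32, head_dim=128, context_len=4096)
print(f"~{total:.1f} GiB")  # weights dominate; KV cache here is exactly 2 GiB
```

At fp16 this lands around 17 GiB for a 7B model at a 4K context, which is why such models clear a 24 GB A10 but leave little headroom; scaling `context_len` or `batch_size` grows only the KV-cache term, linearly.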