Add initial tests for repo subcommand #21
Conversation
…when no mode is specified, unify server/profile code to helper functions, add more argparse default info
Co-authored-by: Francesco Petrini <[email protected]>
…triton_cli into rmccormick-test
def test_repo_add_vllm(self, model, source):
    self.repo_clear()
    self.repo_add(model, source)
    # TODO: Parse repo to find model, with vllm backend in config
Sounds like we should re-tool this to accept a search target parameter so it can detect any arbitrary backend.
Some kind of repo validator would be great. For well-defined backends, it can validate more strictly:
all backends:
- config.pbtxt contains 'backend: "<backend>"'
- contains version folder
vllm:
- version folder contains model.json
# NOTE: would probably hold back on spending time on trt-llm validation right now,
# I think there are several competing simplification efforts of the backend going on.
# This validation logic may not hold true for long.
trt-llm:
- contains expected/necessary composing models
- sets gpt_model_path/required params
onnx:
- version folder contains some *.onnx
...
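The layout rules above could be sketched as a small validator. This is a minimal sketch, not the triton_cli implementation; the function name and the decision to return a list of problem strings are assumptions for illustration:

```python
from pathlib import Path


def validate_model_repo_entry(repo_dir: str, model: str, backend: str) -> list:
    """Check common repo-layout rules for one model; return a list of problems found."""
    problems = []
    model_dir = Path(repo_dir) / model
    config = model_dir / "config.pbtxt"

    # All backends: config.pbtxt names the expected backend.
    if not config.is_file():
        problems.append("missing config.pbtxt")
    elif f'backend: "{backend}"' not in config.read_text():
        problems.append(f'config.pbtxt does not contain backend: "{backend}"')

    # All backends: at least one numeric version folder exists.
    versions = [d for d in model_dir.glob("*") if d.is_dir() and d.name.isdigit()]
    if not versions:
        problems.append("no version folder found")

    # Backend-specific checks.
    if backend == "vllm" and not any((v / "model.json").is_file() for v in versions):
        problems.append("vllm: no version folder contains model.json")
    if backend in ("onnx", "onnxruntime") and not any(
        list(v.glob("*.onnx")) for v in versions
    ):
        problems.append("onnx: no version folder contains a *.onnx file")

    return problems
```

A stricter version could parse the config with the model config protobuf instead of string matching, but string matching matches the loose check described above.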
If you have some ideas, feel free to add a ticket. I probably won't implement this for initial testing.
This repo subcommand may actually be a great place to experiment with an "ensemble" generator/helper/tool in the future - I think @oandreeva-nv was interested in something like this and has a ticket somewhere.
For all outputs -> all inputs cases, I think it may be straightforward. For some inputs -> some outputs cases, it may be a cumbersome thing to define on a CLI. If there was some alternative way to define ensembles via a simple config (json, DAG, etc.) - then maybe the CLI would just consume that to generate the triton configs.
A repo validator would be nice! I think we can leverage our client's tritonclient.grpc.model_config_pb2
module to validate the major things in the config file (e.g., syntax is valid, backend field exists, etc.) and then take the nice JSON form it dumps to perform more complex validation with ease (e.g., specified backend is valid, dims are correctly formed, etc.).
I think this is the ticket you are referring to. For my own clarity, "all outputs -> all inputs" refers to checking the config.pbtxt file for the ensemble model and verifying the inputs/outputs specified in the ensemble_scheduling steps exist within the submodels, correct?
I think this is the ticket you are referring to.
Yep that's the one. There are actually two features here - validating an ensemble config is correct (and providing perhaps a more useful error when incorrect, compared to what core outputs), and generating or simplifying the generation of an ensemble config.
For my own clarity, "all outputs -> all inputs" refers to checking the config.pbtxt file for the ensemble model and verifying the inputs/outputs specified in the ensemble_scheduling steps exist within the submodels, correct?
For this statement, I was thinking in terms of generating ensemble configs, rather than validating them. I meant that if there was an unambiguous mapping that could be inferred when connecting Model A -> Model B (ex: Model A has 1 output, Model B has 1 input), we could just do it. If there is an ambiguous mapping (Model A has 3 outputs, Model B has 2 inputs - which get mapped?), we probably can't do much.
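The "only infer when unambiguous" rule could be sketched like this (a toy helper, not part of triton_cli; it also tries exact tensor-name matching as a second unambiguous case):

```python
def infer_tensor_map(a_outputs, b_inputs):
    """Connect Model A outputs to Model B inputs only when the mapping is unambiguous.

    Returns {b_input: a_output} if exactly one pairing is defensible, else None.
    """
    # Trivial 1:1 case: a single output feeding a single input.
    if len(a_outputs) == 1 and len(b_inputs) == 1:
        return {b_inputs[0]: a_outputs[0]}
    # Name-based match: every B input has an identically named A output.
    if set(b_inputs) <= set(a_outputs):
        return {name: name for name in b_inputs}
    # Ambiguous (e.g. 3 outputs -> 2 inputs with no name overlap): give up.
    return None
```

In the ambiguous case the CLI could fall back to the config-driven approach (JSON/DAG) described above instead of guessing.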
LGTM 🎊
Will follow up with some of the TODOs and other tests separately. I want to get some baseline automated tests running and keep PRs on the smaller side for easier reviews.