Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial tests for repo subcommand #21

Merged
merged 12 commits into from
Jan 10, 2024
Merged

Add initial tests for repo subcommand #21

merged 12 commits into from
Jan 10, 2024

Conversation

rmccorm4
Copy link
Collaborator

@rmccorm4 rmccorm4 commented Jan 10, 2024

Will follow-up with some of the TODOs and other tests separately. Want to get some baseline automated tests running and keep PRs on the smaller-side for easier reviews.

tests/test_cli.py Outdated Show resolved Hide resolved
def test_repo_add_vllm(self, model, source):
self.repo_clear()
self.repo_add(model, source)
# TODO: Parse repo to find model, with vllm backend in config
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like we should re-tool this to accept a search target parameter so it can detect any arbitrary backend.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some kind of repo validator would be great. For well-defined backends, it can validate more strictly:

all backends:
- config.pbtxt contains 'backend: "<backend>"'
- contains version folder

vllm:
- version folder contains model.json

# NOTE: would probably hold back on spending time on trt-llm validation right now,
# I think there are several competing simplification efforts of the backend going on.
# This validation logic may not hold true for long.
trt-llm
- contains expected/necessary composing models
- sets gpt_model_path/required params

onnx:
- version folder contains some *.onnx

...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have some ideas, feel free to add a ticket. I probably won't implement this for initial testing.

Copy link
Collaborator Author

@rmccorm4 rmccorm4 Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This repo subcommand may actually be a great place to experiment with an "ensemble" generator/helper/tool in the future - I think @oandreeva-nv was interested in something like this and has a ticket somewhere

For all outputs -> all inputs cases, I think it may be straightforward. For some inputs -> some outputs cases, it may be a cumbersome thing to define on a CLI. If there was some alternative way to define ensembles via a simple config (json, DAG, etc.) - then maybe the CLI would just consume that to generate the triton configs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A repo validator would be nice! I think we can leverage our client's tritonclient.grpc.model_config_pb2 module to validate the major things in the config file (e.g., syntax is valid, backend field exists, etc.) and then take the nice JSON form it dumps to perform more complex validation with ease (e.g., specified backend is valid, dims are correctly formed, etc.).

I think this this the ticket you are referring to. For my own clarity, "all outputs -> all inputs" refers to checking the config.pbtxt file for the ensemble model and verifying the inputs/outputs specified in the ensemble_scheduling steps exist within the submodels, correct?

Copy link
Collaborator Author

@rmccorm4 rmccorm4 Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this this the ticket you are referring to.

Yep that's the one. There are actually two features here - validating an ensemble config is correct (and providing perhaps a more useful error when incorrect, compared to what core outputs), and generating or simplifying the generation of an ensemble config.

For my own clarity, "all outputs -> all inputs" refers to checking the config.pbtxt file for the ensemble model and verifying the inputs/outputs specified in the ensemble_scheduling steps exist within the submodels, correct?

For this statement, I was thinking in terms of generating ensemble configs, rather than validating the. I meant if there was an unambiguous mapping that could be inferred when connecting Model A -> Model B (ex: Model A has 1 output, Model B has 1 input), we could just do it. If there is an ambiguous mapping (ModelA has 3 outputs, ModelB has 2 inputs, which get mapped?), we probably can't do much.

src/triton_cli/main.py Outdated Show resolved Hide resolved
Base automatically changed from rmccormick-vllm to main January 10, 2024 20:03
@fpetrini15
Copy link
Collaborator

LGTM 🎊

@rmccorm4 rmccorm4 merged commit cfb4bbb into main Jan 10, 2024
3 checks passed
@rmccorm4 rmccorm4 deleted the rmccormick-test branch January 10, 2024 23:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants