
config the vLLM engineArgs in config.pbtxt #63

Closed · wants to merge 17 commits

Conversation
Conversation

activezhao

Previously, we got the vLLM EngineArgs from vllm_engine_args.json; now we can get them from config.pbtxt.
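For illustration, engine args carried as Triton model parameters in config.pbtxt might look like the sketch below. This is a minimal, hypothetical example: the parameter keys (model, tensor_parallel_size, gpu_memory_utilization) mirror vLLM's EngineArgs field names, but the exact keys and value encoding used by this PR (and by the follow-up #72) may differ.

```
# Hypothetical config.pbtxt sketch: vLLM EngineArgs expressed as Triton
# model parameters instead of a separate vllm_engine_args.json.
name: "vllm_model"
backend: "vllm"
max_batch_size: 0
model_transaction_policy { decoupled: true }

parameters: {
  key: "model"                    # assumed key; mirrors EngineArgs.model
  value: { string_value: "facebook/opt-125m" }
}
parameters: {
  key: "tensor_parallel_size"     # assumed key; mirrors EngineArgs.tensor_parallel_size
  value: { string_value: "1" }
}
parameters: {
  key: "gpu_memory_utilization"   # assumed key; mirrors EngineArgs.gpu_memory_utilization
  value: { string_value: "0.9" }
}
```

Note that Triton model parameters are string-valued, so numeric engine args would be serialized as strings here and parsed back by the backend's model.py.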

activezhao changed the title from "change the configuration of engineArgs in config.pbtxt" to "config the vLLM engineArgs in config.pbtxt" on Oct 19, 2023
fpetrini15 and others added 8 commits on October 23, 2023:

* Initial Commit
* Mount model repo so changes reflect, parameter tweaking, README file
* Image name error
* Incorporating review comments. Separate docker and model repo builds, add README, restructure repo
* Tutorial restructuring. Using static model configurations
* Bump triton container and update README
* Remove client script
* Incorporating review comments
* Modify WIP line in vLLM tutorial
* Remove trust_remote_code parameter from falcon model
* Removing Mistral
* Incorporating Feedback
* Change input/output names
* Pre-commit format
* Different perf_analyzer example, config file format fixes
* Deep dive changes to Triton tools section
* Remove unused variable
* Added Llama2 tutorial for TensorRT-LLM backend (triton-inference-server#65)
* Updated vLLM tutorial's README to use vllm container

Co-authored-by: dyastremsky <[email protected]>
@oandreeva-nv
Collaborator

Hi @activezhao, may I kindly ask you to rebase your PR on top of the main branch and send us a CLA: https://github.com/triton-inference-server/server/blob/main/CONTRIBUTING.md#contributor-license-agreement-cla

@activezhao
Author

activezhao commented Nov 22, 2023

> Hi @activezhao, may I kindly ask you to rebase your PR on top of the main branch and send us a CLA: https://github.com/triton-inference-server/server/blob/main/CONTRIBUTING.md#contributor-license-agreement-cla

Hi @oandreeva-nv, OK, I will do it.

However, I find that the structure of Quick_Deploy/vLLM has changed a lot; will this PR still be OK?

@activezhao
Author

activezhao commented Nov 22, 2023

> Hi @activezhao, may I kindly ask you to rebase your PR on top of the main branch and send us a CLA: https://github.com/triton-inference-server/server/blob/main/CONTRIBUTING.md#contributor-license-agreement-cla

Hi @oandreeva-nv Because the rebase involves too many files, I opened a new PR, #72, instead, and I have sent the CLA email.

Could you please close this PR and do the code review in the new PR?

Thanks.

@oandreeva-nv
Collaborator

Closing this PR per @activezhao's request.
