Enable LoRA support for HPU #170
Conversation
@kzawora-intel please review this PR.
@SanjuCSudhakaran @hlahkar please check review comments. Thanks.
I ran some multi-LoRA tests on this branch and wanted to share a few additional issues I encountered, in case it helps. [1] Error message with `VLLM_SKIP_WARMUP=false pytest tests/lora/test_llama_hpu.py`: `============================= HABANA PT BRIDGE CONFIGURATION ===========================`
Thank you @JHLEE17 for identifying the issues. We had identified the same issues internally as well, and are currently working on a fix.
@afierka-intel @madamczykhabana please help review this PR.
Force-pushed from 6600367 to 08721a7.
I have a few clarification questions and some suggestions for reorganizing the code a bit.
Thank you for applying all my suggestions. The code looks good to me now. I'll merge the PR once you resolve the merge conflicts and all static checks pass.
Squashed commit of the following: commit 7beaeba commit 549bffb commit 2769fd8 commit 1911f44 commit e154e3c commit 220460d commit 1256be5 commit 03d6bc3 commit 4b7468c commit b7d2d86 commit 712a7ed commit 1ee15b4 commit 5c6a312 commit ccb0569 commit c10afb4 commit 6b3a039 commit 4ef5a6d commit 301579d commit ed98772 commit 55c82ba commit d7dddc9 commit 7cc2b99 commit e120246
...Also update the test reference for test_multilora_hpu.py
...to fix the accuracy mismatch between tp_size = 1 and tp_size > 1
Force-pushed from 4221f2e to 78436a6.
Thank you for resolving conflicts! All static checks and internal e2e tests passed. Approving the PR.
@JHLEE17 The PR is merged to habana_main. The 1x, 2x, and 4x tests in tests/lora/test_llama_hpu.py and tests/lora/test_multilora_hpu.py are all passing, with the following caveats:
This PR enables LoRA support in HPU.
* Implemented custom BGMV for LoRA modules using the index-select operator.
* Support for both single-card and multi-card scenarios has been tested.

---------

Co-authored-by: Himangshu Lahkar <[email protected]>
Co-authored-by: Himangshu Lahkar <[email protected]>
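To illustrate the idea behind the description above, here is a minimal pure-Python sketch of a BGMV (batched grouped matrix-vector) computation as used in multi-LoRA serving, where an index-select step picks each token's adapter weights. All names here (`bgmv`, `lora_a`, `lora_b`, `indices`) are illustrative assumptions, not the actual vLLM/HPU kernel API, and the LoRA scaling factor and residual add are omitted for brevity.

```python
def matvec(mat, vec):
    # mat: list of rows, vec: list of floats -> mat @ vec
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]

def bgmv(xs, lora_a, lora_b, indices):
    """Batched grouped matrix-vector multiply (illustrative sketch).

    xs:      list of per-token input vectors
    lora_a:  stacked LoRA A matrices (rank x hidden), one per adapter
    lora_b:  stacked LoRA B matrices (hidden x rank), one per adapter
    indices: adapter index per token -- this gather is the
             "index-select" step the PR implements on HPU
    """
    outputs = []
    for x, idx in zip(xs, indices):
        a = lora_a[idx]  # index-select: pick this token's A matrix
        b = lora_b[idx]  # index-select: pick this token's B matrix
        outputs.append(matvec(b, matvec(a, x)))  # B @ (A @ x)
    return outputs

# Two tokens, each routed to a different rank-1 adapter:
xs = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[[1.0, 0.0]], [[0.0, 2.0]]]        # A per adapter (1 x 2)
lora_b = [[[1.0], [0.0]], [[0.0], [3.0]]]    # B per adapter (2 x 1)
print(bgmv(xs, lora_a, lora_b, [0, 1]))      # [[1.0, 0.0], [0.0, 6.0]]
```

In production kernels the gather is done over stacked weight tensors with `index_select` (so the whole batch runs as a few dense ops), rather than with a Python loop; the loop here is only to make the per-token adapter selection explicit.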