[BOUNTY] Add Qwen1.5 0.5B #37

Merged: 13 commits into tenstorrent:mkordic/rc_20240830 on Sep 3, 2024

Conversation

@JushBJJ
Contributor Author

JushBJJ commented Apr 14, 2024

With all these patches, we internally have Qwen1.5 0.5B working on a Grayskull e150 (thanks @JonathanALevine). We're waiting for @marty1885 to test all of this on his e75. Should we convert this out of a draft PR?

@nvukobratTT @milank94

@JushBJJ
Contributor Author

JushBJJ commented Apr 14, 2024

A lot of optimizations will need to be made to speed this model up; stay tuned. I gotta do my uni assignments, which are due tonight 😭

@nvukobratTT
Contributor

Great work @JushBJJ @marty1885 @JonathanALevine! Let us know once you get silicon results out of the e75 :)))

We're working through some internal checks; as soon as we finalize those on our side, we'll complete code reviews and start with PR approvals.

Once again, great progress and thanks for hunting these bounties! :)))

@marty1885

marty1885 commented Apr 15, 2024

I think there are some problems with the e75 - I ran inference for an hour, but it never finished. I hope this doesn't affect the bounty, as it works on the e150. It feels more like a low-level compilation bug.

NVM. IT WORKS!! It's just so slow that running it with the default settings takes forever. I can get it to finish in a reasonable time if I set max_length to something short, like 2.
[screenshot]
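Since each on-device forward pass was so slow, total generation time scales roughly linearly with max_length (one pass per generated token), which is why shrinking it helps. A minimal stdlib sketch of that relationship (`slow_forward` and the token values are hypothetical stand-ins, not the tt-buda or transformers API):

```python
def slow_forward(tokens):
    """Hypothetical stand-in for one slow on-device forward pass."""
    return tokens[-1] + 1  # pretend the model emits the next token id

def generate(prompt, max_length):
    """Greedy decode: one forward pass per new token, so runtime grows
    linearly with max_length when each pass is expensive."""
    tokens = list(prompt)
    while len(tokens) < max_length:
        tokens.append(slow_forward(tokens))
    return tokens

# max_length=2 needs a single pass; a default of several hundred would
# repeat the slow pass hundreds of times.
print(generate([1], max_length=2))  # -> [1, 2]
```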

@JushBJJ
Contributor Author

JushBJJ commented Jul 26, 2024

@nvukobratTT @milank94 Update :)

@milank94
Collaborator

@JushBJJ way to go!

As we don't have an automated CI setup for this project yet, we'll manually test the model.

Before that, there are a few miscellaneous components that should be added, such as updating the support table, adding a test case, adding the license header, etc. You can see a breakdown here: https://github.com/tenstorrent/tt-buda-demos/blob/main/CONTRIBUTING.md#adding-models

Mind adding these details so we can continue the review process?

@JushBJJ
Contributor Author

JushBJJ commented Jul 26, 2024

Sure, I'll do it for the Phi 2 bounty as well; let me know if you had any issues running it.
Also, before manually testing it, add these workarounds:
JushBJJ/tt-buda@f765838

I raised an issue (tenstorrent/tt-buda#42) because I'm not quite sure what a proper implementation would look like for DynamicCache, so it would be great if anyone on the TT team familiar with it could make a proper fix :)
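For context on why DynamicCache is awkward in an ahead-of-time compiled flow: transformers' DynamicCache grows its key/value tensors every decode step, so tensor shapes change between iterations, while a compiled graph expects fixed shapes. A toy stdlib sketch of the fixed-size alternative (all names here are hypothetical illustrations, not the actual transformers or tt-buda API):

```python
MAX_SEQ = 8  # hypothetical compile-time sequence budget

def make_static_cache():
    """Pre-allocated cache: its shape is MAX_SEQ from step 0, so a
    compiled graph sees the same tensor shape on every decode step."""
    return {"buf": [0.0] * MAX_SEQ, "len": 0}

def append_kv(cache, value):
    """Write into the next free slot instead of growing the buffer
    (a dynamic cache would append, changing the shape each step)."""
    if cache["len"] >= MAX_SEQ:
        raise ValueError("exceeded compiled sequence length")
    cache["buf"][cache["len"]] = value
    cache["len"] += 1

cache = make_static_cache()
for v in (0.1, 0.2, 0.3):
    append_kv(cache, v)
print(len(cache["buf"]), cache["len"])  # -> 8 3 (shape fixed, 3 slots used)
```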

@milank94
Collaborator

> Sure I'll do it too with the Phi 2 bounty as well, let me know if you guys had any issue running it. Also before manually testing it, add these workarounds: JushBJJ/tt-buda@f765838
>
> I raised an issue (tenstorrent/tt-buda#42) because I'm not quite sure how a proper implementation would look like for DynamicCaching so it would be great if anyone in the TT team familiar with it can make a proper fix :)

Thanks for the update. We'll get someone to review and integrate the workarounds. We can look to package this as an alpha release @staylorTT

FYI: @Shubhamsaboo for bounty tracking

@JushBJJ
Contributor Author

JushBJJ commented Jul 28, 2024

Made some changes:

  • Cleaned up the code and added batching like the other examples
  • Renamed the qwen filename
  • Adjusted parameters to reduce weird token generation
  • Added a chat version of Qwen1.5
  • Added a test case for Qwen1.5

I also tested on previous transformers versions; it seems like v4.42.0 is some magic number??? On any version below that, Qwen does not work at all.
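If the floor really is transformers v4.42.0, a temporary pin in this repo could look something like the fragment below (hypothetical: the actual table names and layout of model_demos/pyproject.toml may differ):

```toml
# Hypothetical pyproject.toml fragment: pin the transformers floor this
# model needs until tt-buda uplifts its own transformers==4.41.0 pin.
[project]
dependencies = [
    "transformers>=4.42.0",
]
```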

@milank94
Collaborator

> Made some changes:
>
> • Cleaned up code and added batching like the other examples
> • Renamed filename for qwen
> • Adjusted parameters to reduce weird token generation
> • Added chat version of qwen1.5
> • Added test case for qwen1.5
>
> I also tested on previous transformers versions, seems like v4.42.0 is some magic number??? Any version below that Qwen does not work at all.

Thanks @JushBJJ, I will have a look.

That's good to note. The latest release of Buda is on transformers==4.41.0, so we will need to uplift it in tt-buda or add it here as a temporary requirement.

@milank94 milank94 mentioned this pull request Jul 29, 2024
@JushBJJ
Contributor Author

JushBJJ commented Jul 30, 2024

See tenstorrent/tt-buda#42 (comment)

I'll rewrite Qwen's file with a custom wrapper/generate forward function
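The idea behind a custom wrapper/generate function is that user code, not transformers, owns the decode loop and threads the cache through explicitly as plain tuples, so the compiled model never sees a DynamicCache object. A hypothetical stdlib sketch of the shape of such a loop (`model_forward` is a stand-in for the compiled pybuda forward pass, not real API):

```python
def model_forward(token, past):
    """Hypothetical stand-in for the compiled forward pass: returns the
    next token id and the cache extended as a plain tuple."""
    return token + 1, past + (token,)

def generate_with_wrapper(first_token, steps):
    """Hand-rolled greedy decode loop: the wrapper owns generation and
    passes the cache back explicitly, legacy-tuple style."""
    past = ()  # plain tuple cache, never a cache object
    token = first_token
    out = [token]
    for _ in range(steps):
        token, past = model_forward(token, past)
        out.append(token)
    return out, past

tokens, past = generate_with_wrapper(1, steps=3)
print(tokens)  # -> [1, 2, 3, 4]
```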

@milank94 milank94 self-assigned this Jul 31, 2024
@JushBJJ
Contributor Author

JushBJJ commented Aug 1, 2024

This should be good now for final review and testing.

See tenstorrent/tt-buda#42 (comment) for my comment on disabling DynamicCache on new models

@JushBJJ
Contributor Author

JushBJJ commented Aug 9, 2024

Any updates on this on TT's side? @milank94

@milank94
Collaborator

milank94 commented Aug 9, 2024

> Any updates on this on TT's side? @milank94

Hey @JushBJJ -- we're currently working on integrating all of the necessary changes into a new Buda release. This should add support for this model and Phi-2, along with some additional features.

Once we have that RC ready, we'll test and merge these models.

Thanks for your patience.

@milank94
Collaborator

Hey @JushBJJ, can you switch the target branch to mkordic/rc_20240830?

@JushBJJ JushBJJ changed the base branch from main to mkordic/rc_20240830 August 31, 2024 00:38
@JushBJJ
Contributor Author

JushBJJ commented Aug 31, 2024

Done

@milank94
Collaborator

milank94 commented Sep 3, 2024

Looks great @JushBJJ - can you address the merge conflict around model_demos/pyproject.toml?

I merged the changes for #117; I think this branch just needs a rebase onto mkordic/rc_20240830.

The plan is to merge these two models into an RC branch (mkordic/rc_20240830) that we are using to stage the changes for the next TT-Buda release, then merge into main and complete the bounties.

@JushBJJ
Contributor Author

JushBJJ commented Sep 3, 2024

@milank94 Rebased it; I normally don't do it, so I hope I did it right 😅

@milank94 milank94 merged commit 2c6b89e into tenstorrent:mkordic/rc_20240830 Sep 3, 2024
anirudTT pushed a commit that referenced this pull request Sep 24, 2024
* Qwen1.5 0.5B pybuda implementation

* remove unneeded requirement

* Update env vars and compiler configs

* remove undefined device_map

* Remove misleading and unnecessary environment variables

* Refine qwen solution

* Rename qwen file

* Rename qwen filename and added qwen1.5-chat

* Add qwen1.5 test case

* Fix typo in pyproject.toml

* Disable dynamic caching

* Add extra whitespace below model title commment

* Fix typo "moadl" to "model"