[$$ BOUNTY] Add Phi-2 (2.7B) Model to TT-Buda Model Demos #21
Comments
Phi-3-mini is now released. The 3.8-billion-parameter model should easily fit in the 8 GB of LPDDR4 if quantized to 8-bit. Maybe the bounty could be updated for Phi-3.
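As a back-of-envelope check on that claim (my arithmetic, not from the thread), the weights alone at 8 bits per parameter come in well under the card's 8 GB:

```python
# Rough sanity check: does a 3.8B-parameter model fit in 8 GB of LPDDR4
# when quantized to 8 bits per weight? Ignores activation memory, the
# KV cache, and runtime overhead, so the real margin is smaller.
params = 3.8e9          # Phi-3-mini parameter count
bytes_per_weight = 1    # 8-bit quantization
weights_gb = params * bytes_per_weight / 1e9
print(f"Weights alone: {weights_gb:.1f} GB of the 8 GB available")
# → Weights alone: 3.8 GB of the 8 GB available
```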
Phi-3-mini weights are released: https://huggingface.co/collections/microsoft/phi-3-6626e15e9585a200d2d761e3
Hello, I would love to give it a shot. Could you please assign me to the issue? (Either Phi-2 or Phi-3 works.)
Sounds good, done @edgerunnergit! I recommend you check with the community members on Discord to see whether somebody is already working on it. They could help you get started, or you could even tag-team to solve this challenge.
@edgerunnergit, do you have anything to share at this point?
@EwoutH I had a rough look but couldn't actually start yet. |
Microsoft released some new Phi-3 models!
The vision model is only 4.15B params, so it should be able to run on 8 GB Grayskull cards when quantized to 8-bit. Medium is likely too large (14B), but small (7.39B) might fit using the block floating point format Grayskull supports, BFP4 (see also #59).
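To make the fit estimates above concrete, here is a rough footprint comparison for the Phi-3 sizes mentioned. It assumes the block-float formats store one shared 8-bit exponent per block of 16 mantissas, which is my reading of Grayskull's BFP formats; treat the exact overhead as an assumption:

```python
# Approximate weight footprint (GB) for the model sizes mentioned above.
# Block-float overhead assumes one shared 8-bit exponent per 16 mantissas
# (an assumption about Grayskull's BFP formats). Activations, KV cache,
# and runtime overhead are not counted.
def weights_gb(params, bits_per_value, shared_exp_bits=0, block=16):
    total_bits = params * (bits_per_value + shared_exp_bits / block)
    return total_bits / 8 / 1e9

for name, params in [("mini 3.8B", 3.8e9), ("vision 4.15B", 4.15e9),
                     ("small 7.39B", 7.39e9), ("medium 14B", 14e9)]:
    fp16 = weights_gb(params, 16)
    bfp8 = weights_gb(params, 8, shared_exp_bits=8)
    bfp4 = weights_gb(params, 4, shared_exp_bits=8)
    print(f"{name}: FP16 {fp16:.1f} GB, BFP8 {bfp8:.1f} GB, BFP4 {bfp4:.1f} GB")
```

Under these assumptions, small (7.39B) in BFP4 needs roughly 4.2 GB for weights, which is consistent with the comment that it might fit in 8 GB, while medium (14B) leaves essentially no headroom even at 4 bits.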
Got my Grayskull e75 this week, started working on phi-3-mini-4k and will share any progress. |
FYI: @jush has opened the PR for adding Phi-2 and will be able to add it to the model demo. Let me know if any of you are working on Phi-3. |
PR for Phi-2:
Claimed by @JushBJJ. This bounty will be closed once the PR is merged into main. Congrats Jush!
Awesome news! Would also really love phi-3.5-mini-instruct support as a potential next goal. It's 3.82B params (a bit bigger than the 2.78B Phi-2), but also far more capable, with really impressive results for such a small model. Great to hear that "small" LLMs are now running on Tenstorrent hardware! Very curious how fast it is (tokens/s), at which power level (watts), and at what efficiency (tokens/watt).
Background:
TT-Buda, developed by Tenstorrent, is a growing collection of model demos showcasing the capabilities of AI models running on Tenstorrent hardware. These demonstrations cover a wide range of applications, aiming to provide insights and inspiration for developers and researchers interested in advanced AI implementations.
Bounty Objective:
We are excited to announce a bounty for contributing a new AI model demonstration to the TT-Buda repository. This is an opportunity for AI enthusiasts, researchers, and developers to showcase their skills, contribute to cutting-edge AI research, and earn rewards.
Task Details:
Integrate Phi-2 (2.7B) into the TT-Buda demonstrations.
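A minimal sketch of what such a demo entry point might look like, following the PyTorchModule/run_inference pattern used by other TT-Buda model demos. This is untested: the exact pybuda API surface and output handling here are assumptions, so verify against the existing demos in model_demos before relying on it.

```python
# Hypothetical sketch of a Phi-2 demo for TT-Buda -- NOT a tested
# implementation. Assumes the pybuda PyTorchModule / run_inference
# pattern seen in other TT-Buda demos; requires Tenstorrent hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def run_phi2_demo(prompt: str = "Write a haiku about silicon."):
    import pybuda  # only available in the TT-Buda environment

    tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
    model = AutoModelForCausalLM.from_pretrained(
        "microsoft/phi-2", torch_dtype=torch.float32
    )
    inputs = tokenizer(prompt, return_tensors="pt")

    # Wrap the PyTorch model so the TT compiler can place it on device.
    tt_module = pybuda.PyTorchModule("pt_phi2", model)
    output_q = pybuda.run_inference(
        tt_module, inputs=[(inputs["input_ids"], inputs["attention_mask"])]
    )

    # Greedy-decode a single next token from the returned logits.
    logits = output_q.get()[0].value()
    next_token = logits[0, -1].argmax(-1)
    print(tokenizer.decode(next_token))

if __name__ == "__main__":
    run_phi2_demo()
```

A full submission would add generation-loop logic and follow the repo's demo layout; the single-token decode here is just to keep the sketch short.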
Requirements:
Contribution Guidelines:
- Develop your model demo in the model_demos folder, following the naming convention: model_yourModelName.
- Follow the contribution standards outlined in the CONTRIBUTING.md file.
Evaluation Criteria:
Rewards:
Contributions will be evaluated by the Tenstorrent team, and the best contribution will be eligible for a $500 cash bounty.
Get Started with Grayskull DevKit
Dive into AI development with the Grayskull DevKit, your gateway to exploring Tenstorrent's hardware. Paired with TT-Buda and TT-Metalium software approaches, it offers a solid foundation for AI experimentation. Secure your kit here.
Connect on Discord
Join our Discord to talk AI, share your journey, and get support from the Tenstorrent community and team. Let's innovate together!