Awesome Llama Resources

Llama2 is a part open source commercial model released from Meta, including 7B/13B/70B and chat models with 4096 context window.

Models

[Original Model] 202307 Meta Released Llama2
- Github
- Meta's llama-recipes: provide examples for finetuning at SingleGPU/Multiple GPU and the recipe to convert model to HuggingFace transformers's LLama2 model definition
- Paper: Llama 2: Open Foundation and Fine-Tuned Chat Models
- Download Applications
[Togehter AI] 202307 TogetherAI released Llama2-7B context window with 32k context window based on Meta's research Extending Context Window of Large Language Models via Positional Interpolation
- Together.ai's blog about Preparing for the era of 32K context: Early learnings and explorations
Codellama: Meta finetuned Llama2 for code generation usage. Support C++/ Java/ PHP/ Type Script/ C#/ Bash/ Python generation. Include models 7B/13B/34B，and 3 kind of variation (Generatl/python/instruction). Extend maximum context window from 4,096 tokens to 100k(like claude2).

Demo

Llama2 70B Chatbot at HuggingFace
A16z's Llama2-chatbot: provide a streamlit chatbot app for LLaMA2

Finetune Method/ Scripts

Finetune with PEFT
Finetune together.ai 32k context window model: script to finetune on booksum/mqa dataset
- Llama-2-7B-32K-Instruct — and fine-tuning for Llama-2 models with Together API: Together AI show their 32k context instruct 7b model.
Finetune with QLora at 13b model: a colab about finetuning llama2
HuggingFace SFT training script
Pytorch-lightening's script to finetune Llama2 on custom dataset
Instuction-tune Llama2: HuggingFace's Tech Lead Philschmid introduced how to instruct finetune Llama2
Finetune LLaMA2 7-70B on Amazon SageMaker: Philschmid introduce preparing datasets/using QLoRA/Deploy model on Amazon SageMaker
Finetune LLaMa2 with QLoRA at colab
Fine-tune Llama 2 with DPO by huggingface
Fine-tune Llama2 on specific usage like SQL Gen/Functional Representation: Anyscale's member used their lib ray to demo finetune Llama2 70B.Their scripts

Porting

Karpathy's Llama2.c: Karpathy's weekend project to build a LLama2 at C
web-llm: Bringing large-language models and chat to web browsers
HuggingFace release Swift Transformers to help run LLM on Apple Device: Provide Swift based Swift Transformers Lib, a swift chat app and a exporters for exporting model to coreml.
pyllama: LLaMA: Open and Efficient Foundation Language Models

Tutorial

Meta's started guide to use Llama
Llama2.c for dummies: a description about Karpathy's LLama2 line by line
NeurIPS 2023 LLM Efficiency Challenge Quickstart Guide: A competition focused on training 1 LLM for 24 hours on 1 GPU – the team with the best LLM gets to present their results at NeurIPS 2023.
Huggingface share how to train and deploy an open source LLM?

Prompt

LLama2 prompt template

For specific usage Model/ Finetuned model

Huggingface trend about llama2
Chinese-Llama-2-7b: finetune on a chinese and english instruction dataset with 10 millions size
Chinese-LLaMA-Alpaca
Finetuned on code with qLoRA
ToolLLaMA: An open source project to train LLaMa on ToolBench, to make LLaMa support function call
Llama2-Code-Interpreter: make Llama2 use Code Execution, Debug, Save Code, Reuse it, Access to Internet
Llama2-Medical-Chatbot: A medical bot built using Llama2 and Sentence Transformers
Finetune LLaMA 7B with Traditional Chinese instruction datasets
Taiwan-LLaMa: NTU's MiuLab finetune 13B Llama2 with 5B traditional chinese tokens and 490k instruction dataset.
Finetuning LLaMa + Text-to-SQL: LlamaIndex show how to fine-tune LLaMa 2 7B on a Text-to-SQL dataset

Multimodal LLM

LLaSM: Large Language and Speech Model: Support chinese/english voice chat model based on whisper features
LLaVA : Large Language-and-Vision Assistant
Chinese-LLaVA: support vision input and chinese text input/output

Toolkits

[TogetherAI] OpenChatKit: Together.ai's open toolkit for LLM finetune/moderation
LLaMA2-Accessory:An Open-source Toolkit for LLM Development
- LLaMA-Adapter: Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
text-generation-webui:A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, OPT, and GALACTICA.
text-generation-inference: Huggingface's Large Language Model Text Generation Inference.
FlexFlow Serve: Low-Latency, High-Performance LLM Serving: An open-source compiler and distributed system for low latency, high performance LLM serving.
LLM-As-Chatbot:Use lots of open sourced instruction-following fine-tuned LLM models as a Chatbot service.

Optimiztion(Latency/Size)

Optimizing LLM latency: A great blog about exploration of inference tools for open source LLMs
Series Quantized LLama2 Model from The Bloke with GPTQ/GGML
- TheBloke/llama-2-7B-Guanaco-QLoRA-GPTQ
- OpenAssistant-Llama2-13B-Orca-8K-3319-GPTQ
Quantization
- GPTQ: Accurate Post Training Quantization for generative pre-trained transformers
  - AutoGPTQ: An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.
Together AI's Medusa to accelerate decoding
NVIDIA TensorRT-LLM Supercharges Large Language Model Inference on NVIDIA H100 GPUs:TensorRT-LLM is an open-source library that accelerates and optimizes inference performance on the latest LLMs on NVIDIA Tensor Core GPUs.
20231130 Pytorch Team use pytorch tool to accelerate

Optimization(Reasoning)

LLM Reasoners: LLM Reasoners is a library to enable LLMs to conduct complex reasoning, with advanced reasoning algorithms.
Deepminds LLM as Optimizers

Use

Run Llama 2 on your own Mac using LLM and Homebrew
Deploy Llama2 7B/13B/70B model on AWS SageMaker: Based on Hugging Face LLM DLC(Deep Learning Container) which is powered by huggingface's text generation inference. HuggingFace's text generation inference is a Rust, Python and gRPC server for text generation inference. Used in production at HuggingFace to power Hugging Chat, the Inference API and Inference Endpoint.

Other Resources

Llama 2 の情報まとめ
LLaMA2-Every Resource you need
LLaMA-efficient-tuning: Easy-to-use fine-tuning framework using PEFT (PT+SFT+RLHF with QLoRA) (LLaMA-2, BLOOM, Falcon, Baichuan)
awesome-llm and aigc
Finetune Falcon-7B on Your GPU with TRL and QLoRA: A blog about tuning falcon-7b on your consumer GPU
A Definitive Guide to QLoRA: Fine-tuning Falcon-7b with PEFT
Amazon sagemaker generativeai: Fine-tune Falcon-40B with QLoRA
Llama with FlashAttention2: Reduces VRAM usage, especially during training.Full finetune Llama 2 7b:51.3->40.3GiB
Anti-hype LLM reading list: A reading list about LLM.

Move on to production

Patterns for Building LLM-based Systems & Products: Amazon's LLM Engineer Eugene Yan wrote a blog about patterns of LLM based system
Finetuning an LLM: RLHF and alternatives
Github:A developer’s guide to prompt engineering and LLMs: Github engineer shares their experiences to to prompt engineering for their copilot product.
The Rise and Potential of Large Language Model Based Agents: A Survey: A survey from Fudan NLP Group about LLM based Agents. Their github repo https://github.com/WooooDyy/LLM-Agent-Paper-List

Evaluation

🤗Open LLM Leaderboard: A huggingface space which track, rank and evaluate LLMs and chatbots as they are released.

Calculation

How is LLaMa.cpp possible: The post showed why Llama is limited by memory bound with some calculations of the transformers parameters.

Some theory

Why we should train smaller LLMs on more tokens
- harms law on hugging face for calculating the model size/dataset size's compute overhead
LLMSurvey: A Survey of Large Language Models
Open challenges in LLM research: Chip Huyen's post about LLM's challenge
Stanford CS324 - Large Language Models: The fundamentals about the modeling, theory, ethics, and systems aspects of large language models.
- CS221:Artificial Intelligence: Principles and Techniques
Why you(Propbably) Don't Need to Fine-tune an LLM: Finetuning maynot reduce hallucinating. You could use few-shot prompting/ Retrieval Augmented Generation(RAG)
Challenges and Applications of Large Language Models

Some basics

Some Intuition on Attention and the Transformer: A post introduces the big deal about attention/what are query,key and value
Intro to transformers

Name		Name	Last commit message	Last commit date
Latest commit History 43 Commits
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Awesome Llama Resources

Models

Demo

Finetune Method/ Scripts

Porting

Tutorial

Prompt

For specific usage Model/ Finetuned model

Multimodal LLM

Toolkits

Optimiztion(Latency/Size)

Optimization(Reasoning)

Use

Other Resources

Move on to production

Evaluation

Calculation

Some theory

Some basics

About

Releases

Packages

License

MIBlue119/awesome-llama-resources

Folders and files

Latest commit

History

Repository files navigation

Awesome Llama Resources

Models

Demo

Finetune Method/ Scripts

Porting

Tutorial

Prompt

For specific usage Model/ Finetuned model

Multimodal LLM

Toolkits

Optimiztion(Latency/Size)

Optimization(Reasoning)

Use

Other Resources

Move on to production

Evaluation

Calculation

Some theory

Some basics

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Packages