Welcome to the 🐶 InstructLab Project

Quick Links: Documentation | FAQ | Hugging Face | Calendar | Slack | YouTube | X

InstructLab is an open source, accessible, and model-agnostic AI project that facilitates contributions to existing large language models (LLMs). Our community's mission is to enable anyone to shape the future of generative AI.

Get started

Why InstructLab

Many projects are rapidly embracing and extending permissively-licensed AI models, but they face three main challenges:

  • Direct contributions to LLMs are not possible. They show up as forks, which are expensive for model creators to maintain, and force consumers to choose a “best-fit” model that isn’t easily extensible.
  • The barrier to entry is high. A contributor's ability to advance an idea is limited by their AI/ML expertise: they have to learn how to fork, train, and refine models before the idea can move forward.
  • There is no direct community governance or best practice around review, curation, and distribution of forked models.

InstructLab solves these problems by using Large-Scale Alignment for ChatBots (LAB) [1], a new alignment-tuning method for LLMs that leverages synthetic data.

InstructLab's model-agnostic technology gives model upstreams the ability to regularly create builds of their open-source-licensed models. This is achieved by composing new skills and knowledge into the model, as opposed to rebuilding and retraining it.
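
As a toy illustration of what "composing new skills and knowledge" could look like from a contributor's side, the sketch below models contributions as leaves in a tree. It is not InstructLab's data format or API: the function names `add_contribution` and `list_leaves`, the branch paths, and the seed examples are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List


def add_contribution(tree: Dict[str, Any], path: str,
                     examples: List[Dict[str, str]]) -> None:
    """Attach seed examples at a slash-separated path, creating branches as needed."""
    *branches, leaf = path.split("/")
    node = tree
    for name in branches:
        node = node.setdefault(name, {})
        if not isinstance(node, dict):
            raise ValueError(f"{name!r} is already a leaf, not a branch")
    node.setdefault(leaf, []).extend(examples)


def list_leaves(tree: Dict[str, Any], prefix: str = "") -> List[str]:
    """Return the full path of every leaf in the tree, in sorted order."""
    paths: List[str] = []
    for name, child in sorted(tree.items()):
        full = f"{prefix}/{name}" if prefix else name
        if isinstance(child, dict):
            paths.extend(list_leaves(child, full))
        else:
            paths.append(full)
    return paths


# Hypothetical contributions: one knowledge leaf and one skill leaf.
taxonomy: Dict[str, Any] = {}
add_contribution(taxonomy, "knowledge/astronomy/constellations",
                 [{"question": "What is Orion?", "answer": "A constellation."}])
add_contribution(taxonomy, "skills/writing/haiku",
                 [{"question": "Write a haiku about rain.",
                   "answer": "Soft rain on the roof"}])

print(list_leaves(taxonomy))
# → ['knowledge/astronomy/constellations', 'skills/writing/haiku']
```

The point of the tree shape is that a new contribution only adds a leaf; existing branches are left untouched, which mirrors the idea of adding to a model rather than rebuilding it.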

Take a look at LAB-enhanced models on the InstructLab Hugging Face page.

Additional information

To learn more about InstructLab’s origins, visit the About Taxonomy page.

[1] Shivchander Sudalairaj*, Abhishek Bhandwaldar*, Aldo Pareja*, Kai Xu, David D. Cox, Akash Srivastava*. "LAB: Large-Scale Alignment for ChatBots", arXiv preprint arXiv:2403.01081, 2024. (* denotes equal contribution)

Pinned

  1. instructlab Public

    InstructLab Command-Line Interface. Use this to chat with a model and execute the InstructLab workflow to train a model using custom taxonomy data.

    Python, 823 stars, 307 forks

  2. taxonomy Public

    Taxonomy tree that will allow you to create models tuned with your data

    Python, 183 stars, 699 forks

  3. community Public

    InstructLab community-wide collaboration space, including contributing, security, code of conduct, etc.

    Python, 71 stars, 42 forks

  4. dev-docs Public

    Developer documents for the InstructLab organization

    Makefile, 2 stars, 29 forks

Repositories

Showing 10 of 18 repositories
  • training Public

    InstructLab Training Library - Efficient Fine-Tuning with Message-Format Data

    Python, 14 stars, Apache-2.0, 41 forks, 36 open issues (3 need help), 16 pull requests, updated Oct 19, 2024
  • sdg Public

    Python library for Synthetic Data Generation

    Python, 18 stars, Apache-2.0, 33 forks, 41 open issues (1 needs help), 5 pull requests, updated Oct 18, 2024
  • instructlab Public

    InstructLab Command-Line Interface. Use this to chat with a model and execute the InstructLab workflow to train a model using custom taxonomy data.

    Python, 823 stars, Apache-2.0, 307 forks, 248 open issues (20 need help), 81 pull requests, updated Oct 18, 2024
  • .github Public

    InstructLab GitHub organization community files.

    Makefile, 2 stars, Apache-2.0, 11 forks, 0 open issues, 0 pull requests, updated Oct 18, 2024
  • dev-docs Public

    Developer documents for the InstructLab organization

    Makefile, 2 stars, Apache-2.0, 29 forks, 12 open issues, 19 pull requests, updated Oct 18, 2024
  • community Public

    InstructLab community-wide collaboration space, including contributing, security, code of conduct, etc.

    Python, 71 stars, Apache-2.0, 42 forks, 13 open issues, 4 pull requests, updated Oct 18, 2024
  • website Public
    TypeScript, 0 stars, CC-BY-4.0, 22 forks, 8 open issues, 7 pull requests, updated Oct 18, 2024
  • eval Public

    Python library for Evaluation

    Python, 5 stars, Apache-2.0, 17 forks, 4 open issues, 4 pull requests, updated Oct 18, 2024
  • ui Public

    Place to hack on UI for InstructLab

    TypeScript, 12 stars, Apache-2.0, 31 forks, 47 open issues (4 need help), 19 pull requests, updated Oct 15, 2024
  • taxonomy Public

    Taxonomy tree that will allow you to create models tuned with your data

    Python, 183 stars, Apache-2.0, 698 forks, 5 open issues, 16 pull requests, updated Oct 10, 2024