chatbot-api

See examples.ipynb for request examples.

Now supports:

Setup

conda create -n llmapi python=3.8
conda activate llmapi
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
pip install -r requirements.txt

Run

uvicorn src:app --reload

chatbot-api supports model scheduling:

  1. idle model instances are closed automatically
  2. new model instances are created when there are too many concurrent requests

You can modify sched_config.json to change the scheduling strategy and the managed model instances.

A typical config is:

{
    "idle_check_period": 120,               // check for idle models and close them every 120 seconds
    "models": {
        "blip2zh-chatglm-6b": {             // the model name must match the config filename under cfgs/
            "max_instances": 1,             // at most 1 instance will be created
            "idle_time": 3600,              // if there is no request for 1 hour, the instance is closed
            "create_threshold": {           // if 5 requests for blip2zh-chatglm-6b arrive within 5 seconds,
                "n_requests": 5,            //    1 more instance is created (not exceeding max_instances)
                "delay": 5
            }
        }
    }
}
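
The comments above are only annotations for this README; if the file is parsed as standard JSON (which does not allow comments), the actual sched_config.json should contain just the fields themselves. A minimal sketch of writing such a file from Python:

import json

# Same values as the annotated example above, without comments.
sched_config = {
    "idle_check_period": 120,
    "models": {
        "blip2zh-chatglm-6b": {
            "max_instances": 1,
            "idle_time": 3600,
            "create_threshold": {"n_requests": 5, "delay": 5},
        }
    },
}

with open("sched_config.json", "w") as f:
    json.dump(sched_config, f, indent=4)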

Format

Request format

{
  "model": "chatglm-6b",
  "messages": [{"role": "user", "content": "Hello!"}],
  "stream": true,
  "max_tokens": 1024
}
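
A minimal sketch of sending such a request from Python with the requests library, assuming the server runs on uvicorn's default http://localhost:8000 and that the chat endpoint is mounted at /chat (the actual path is shown in examples.ipynb). stream is set to false here so the whole response arrives as a single JSON object:

import requests

payload = {
    "model": "chatglm-6b",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": False,  # set to True to receive a streamed response instead
    "max_tokens": 1024,
}

# NOTE: the URL below is an assumption; check examples.ipynb for the real endpoint.
resp = requests.post("http://localhost:8000/chat", json=payload, timeout=300)
resp.raise_for_status()
print(resp.json())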

Response format

A typical response:

{
    "choices": [{"index": 0, "message": {"role": "assistant", "content": "Hello! How can I help you today?"}}]
}
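
Reading the assistant reply out of a response in this (non-streaming) format is just nested indexing, for example:

response = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "Hello! How can I help you today?"}}
    ]
}

# The reply text lives in the first choice's message.
reply = response["choices"][0]["message"]["content"]
print(reply)  # -> "Hello! How can I help you today?"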

You may refer to examples.ipynb for more examples.

About

A simple backend implemented with FastAPI for deploying a large language model chatbot locally.
