See examples.ipynb for request examples.
Currently supported models:

- llama: `cfgs/llama-7b.json`
- llama with LoRA: `cfgs/llama-7b-lora.json`
- ChatGLM: `cfgs/chatglm-6b.json`
- InstructGLM: `cfgs/chatglm-6b-alpaca-lora.json`
- blip2chatglm: `cfgs/blip2zh-chatglm-6b.json`
To set up the environment and start the server:

```bash
conda create -n llmapi python=3.8
conda activate llmapi
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
pip install -r requirements.txt
uvicorn src:app --reload
```
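Once the server is running (uvicorn binds to 127.0.0.1:8000 by default), a quick sanity check from Python might look like the sketch below; the `/docs` route assumes `src:app` is a FastAPI application:

```python
import requests  # pip install requests

# Assumption: src:app is a FastAPI app, so /docs serves the interactive API docs.
resp = requests.get("http://127.0.0.1:8000/docs")
print(resp.status_code)  # expect 200 once the server is up
```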
chatbot-api supports model scheduling:

- idle model instances are closed automatically
- new model instances are created when too many concurrent requests arrive

You can modify sched_config.json to change the scheduling strategy and model instances. A typical config is:
```jsonc
{
  "idle_check_period": 120,    // check for idle models and close them every 120 seconds
  "models": {
    "blip2zh-chatglm-6b": {    // model name must match the config filename under cfgs/
      "max_instances": 1,      // at most 1 instance will be created
      "idle_time": 3600,       // close the instance after 1 hour without requests
      "create_threshold": {    // if 5 requests for blip2zh-chatglm-6b arrive within 5 seconds,
        "n_requests": 5,       // 1 more instance is created (not exceeding max_instances)
        "delay": 5
      }
    }
  }
}
```
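To make the field semantics concrete, here is a minimal Python sketch of how such a strategy could behave. It is only an illustration of the rules above, not chatbot-api's actual implementation, and every name in it is hypothetical:

```python
import time

class SchedulerSketch:
    """Illustrative reading of sched_config.json; not the actual chatbot-api code."""

    def __init__(self, config):
        self.config = config
        self.instances = {}  # model name -> last-used timestamp per instance
        self.requests = {}   # model name -> recent request timestamps

    def on_request(self, model, now=None):
        now = now if now is not None else time.time()
        cfg = self.config["models"][model]
        thr = cfg["create_threshold"]
        log = self.requests.setdefault(model, [])
        log.append(now)
        # keep only requests that fall inside the create_threshold window
        log[:] = [t for t in log if now - t <= thr["delay"]]
        insts = self.instances.setdefault(model, [])
        # a burst of n_requests within `delay` seconds spawns one more
        # instance, never exceeding max_instances
        if len(log) >= thr["n_requests"] and len(insts) < cfg["max_instances"]:
            insts.append(now)  # stands in for loading a real model instance

    def idle_check(self, now=None):
        # meant to run every idle_check_period seconds
        now = now if now is not None else time.time()
        for model, insts in self.instances.items():
            idle = self.config["models"][model]["idle_time"]
            # instances unused for longer than idle_time are closed
            insts[:] = [t for t in insts if now - t <= idle]
```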
A typical request:

```json
{
  "model": "chatglm-6b",
  "messages": [{"role": "user", "content": "Hello!"}],
  "stream": true,
  "max_tokens": 1024
}
```
A typical response:

```json
{
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help you today?"}
    }
  ]
}
```
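For reference, a minimal Python client for the request above might look like the sketch below; the `/chat` endpoint path is an assumption, so check examples.ipynb for the actual route:

```python
import requests  # pip install requests

# Assumption: the endpoint path below is hypothetical; see examples.ipynb for the real route.
url = "http://127.0.0.1:8000/chat"
payload = {
    "model": "chatglm-6b",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": False,  # set True for streamed responses
    "max_tokens": 1024,
}
resp = requests.post(url, json=payload)
print(resp.json()["choices"][0]["message"]["content"])
```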
You may refer to examples.ipynb for more examples.