Add stable-diffusion model wrapper #438

Open
wants to merge 15 commits into
base: main
129 changes: 129 additions & 0 deletions examples/conversation_with_stablediffusion_model/README.md
@@ -0,0 +1,129 @@
# Conversation with Stable-diffusion model

This example shows how to use Stable Diffusion models in AgentScope.

In this example, you can interact in a conversational format to generate images.
Once the image is generated, the agent will respond with the local file path where the image is saved.

## Minimum Hardware Requirements

- **GPU**: NVIDIA GPU with at least 6.9GB of VRAM
- **CPU**: Modern multi-core CPU (e.g., Intel i5 or AMD Ryzen 5)
- **RAM**: Minimum 8GB
- **Storage**: At least 10GB of available hard drive space

## How to Run

Complete the following steps to run this example:

### Step 0: Install Stable Diffusion Web UI and AgentScope

- Install Stable Diffusion Web UI by following the instructions at [AUTOMATIC1111/stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui).
- Install the latest version of AgentScope from source:
```bash
git clone https://github.com/modelscope/agentscope.git
cd agentscope
pip install -e .
```

### Step 1: Download the required checkpoints

Before starting the Stable Diffusion Web UI, download at least one model checkpoint; the Web UI cannot generate images without one.
Place the checkpoint in the `stable-diffusion-webui/models/Stable-diffusion` directory.
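
As a quick sanity check before launching, the sketch below lists the checkpoint files found in that directory. It is only an illustration: the helper name `list_checkpoints` is hypothetical, and the `.safetensors`/`.ckpt` suffix filter is an assumption about how your checkpoints are packaged.

```python
from pathlib import Path
from typing import List

# Assumption: checkpoints are distributed as .safetensors or .ckpt files.
CHECKPOINT_SUFFIXES = {".safetensors", ".ckpt"}


def list_checkpoints(webui_path: str) -> List[str]:
    """Return the checkpoint file names under models/Stable-diffusion."""
    model_dir = Path(webui_path) / "models" / "Stable-diffusion"
    if not model_dir.is_dir():
        # Wrong path, or the Web UI has not been installed there yet.
        return []
    return sorted(
        p.name
        for p in model_dir.iterdir()
        if p.is_file() and p.suffix.lower() in CHECKPOINT_SUFFIXES
    )
```

Calling `list_checkpoints("YOUR-SD-WEBUI-PATH")` should return at least one file name; an empty list suggests the checkpoint was saved elsewhere.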

### Step 2: Launch the Stable Diffusion Web UI

We've provided a convenient shell script to quickly start the Stable Diffusion Web UI:
`scripts/stable_diffusion_webui/sd_setup.sh`

Activate the virtual environment first. Then run the following command in your terminal, replacing `YOUR-SD-WEBUI-PATH` with the actual path to your Stable Diffusion Web UI directory:

```bash
bash scripts/stable_diffusion_webui/sd_setup.sh -s YOUR-SD-WEBUI-PATH
```

If you prefer to start the Web UI yourself, launch it with the arguments `--api --port=7862`. For more detailed instructions on starting the Web UI, refer to [AUTOMATIC1111/stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui).
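
To confirm that the Web UI came up with the API enabled, you can probe its documentation page, which is served under `/docs` when `--api` is set. The following is a minimal sketch using only the standard library; the helper names `docs_url` and `api_is_up` are hypothetical, and the default host assumes the port used above.

```python
import urllib.error
import urllib.request


def docs_url(host: str = "127.0.0.1:7862") -> str:
    """Build the URL of the API documentation page for a Web UI host."""
    return f"http://{host}/docs"


def api_is_up(host: str = "127.0.0.1:7862", timeout: float = 3.0) -> bool:
    """Return True if the documentation page answers with HTTP 200."""
    try:
        with urllib.request.urlopen(docs_url(host), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

If `api_is_up()` returns `False`, check that the Web UI has finished loading and that it was started with `--api` on the expected port.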

### Step 3: Run the example

Run the example and input your prompt.

```bash
python conversation_with_stablediffusion_model.py
```

## Customization Options

### `model_config` Example:

```json
{
    "model_type": "sd_txt2img",
    "config_name": "sd",
    "options": {
        "sd_model_checkpoint": "Anything-V3.0-pruned",
        "sd_lora": "add_detail",
        "CLIP_stop_at_last_layers": 2
    },
    "generate_args": {
        "steps": 50,
        "n_iter": 1,
        "override_settings": {
            "CLIP_stop_at_last_layers": 3
        }
    }
}
```

### Parameter Explanation:

- `options`: Global configuration that directly affects the WebUI settings.
- `generate_args`: Parameters for an individual image generation request, such as `steps` (number of sampling steps) and `n_iter` (number of iterations, i.e. how many batches to generate).
- `override_settings`: A field inside `generate_args` that overrides WebUI settings for a single request, taking precedence over `options`.

Notes:

- `override_settings` only affects the current request, while changes made to `options` persist.
- Both parameters can set the same options, but `override_settings` has a higher priority.

As shown in the example, the final image will be generated with the following settings:

- `steps`: 50
- `n_iter`: 1
- `sd_model_checkpoint`: Anything-V3.0-pruned
- `sd_lora`: add_detail
- `CLIP_stop_at_last_layers`: 3

However, the web UI will always display the following settings:

- `sd_model_checkpoint`: Anything-V3.0-pruned
- `sd_lora`: add_detail
- `CLIP_stop_at_last_layers`: 2
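
The precedence rule can be modeled in a few lines of Python. This is a sketch of the behavior described in this section, not of how the Web UI resolves settings internally; `effective_settings` is a hypothetical helper name.

```python
def effective_settings(model_config: dict) -> dict:
    """Merge a model config into the settings used for one request.

    `options` are the persistent settings; `override_settings`, nested in
    `generate_args`, wins whenever both define the same key.
    """
    options = dict(model_config.get("options", {}))
    generate_args = dict(model_config.get("generate_args", {}))
    overrides = generate_args.pop("override_settings", {})
    # Later mappings win, so the per-request overrides are applied last.
    return {**generate_args, **options, **overrides}


config = {
    "options": {
        "sd_model_checkpoint": "Anything-V3.0-pruned",
        "sd_lora": "add_detail",
        "CLIP_stop_at_last_layers": 2,
    },
    "generate_args": {
        "steps": 50,
        "n_iter": 1,
        "override_settings": {"CLIP_stop_at_last_layers": 3},
    },
}
settings = effective_settings(config)
```

Here `settings["CLIP_stop_at_last_layers"]` is `3`, while `config["options"]` is left unchanged and still holds `2`, mirroring what the Web UI displays.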

### Available Parameter Lists:

If you've successfully enabled the Stable Diffusion Web UI API, you should be able to access its documentation at http://127.0.0.1:7862/docs (or whatever URL you're using + /docs).

- `generate_args`: {url}/docs#/default/text2imgapi_sdapi_v1_txt2img_post
- `options` and `override_settings`: {url}/docs#/default/get_config_sdapi_v1_options_get

In this project, the `options` parameter is posted to the `/sdapi/v1/options` API endpoint,
and the `generate_args` parameter is posted to the `/sdapi/v1/txt2img` API endpoint.
See https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/API for a more complete parameter reference.
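
For orientation, the two calls can be sketched with the standard library alone. This is not the wrapper's implementation (the pull request depends on the `webuiapi` package); the helper names `post_json`, `decode_images`, and `txt2img` are hypothetical, and the sketch assumes the `txt2img` response carries its results as base64 strings under an `images` key.

```python
import base64
import json
import urllib.request
from typing import Any, List


def post_json(url: str, payload: dict, timeout: float = 120.0) -> Any:
    """POST a JSON payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def decode_images(response: dict) -> List[bytes]:
    """Decode the base64-encoded images of a txt2img response."""
    return [base64.b64decode(img) for img in response.get("images", [])]


def txt2img(
    host: str,
    prompt: str,
    options: dict,
    generate_args: dict,
) -> List[bytes]:
    """Apply persistent options, then request images for one prompt."""
    base = f"http://{host}"
    if options:
        # Persistent: changes the Web UI settings until they are changed again.
        post_json(f"{base}/sdapi/v1/options", options)
    # Per request: sampling parameters plus any override_settings.
    response = post_json(
        f"{base}/sdapi/v1/txt2img",
        {"prompt": prompt, **generate_args},
    )
    return decode_images(response)
```

With a running Web UI, `txt2img("127.0.0.1:7862", "Horses on Mars", {}, {"steps": 50, "n_iter": 1})` would return the raw image bytes, which can then be written to a file.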

## A Running Example

- Conversation history with Stable Diffusion Web UI.
```bash
User input:Horses on Mars
User: Horses on Mars
Assistant: Image saved to path\agentscope\runs\run_20240920-142208_rqsvhh\file\image_20240920-142522_HTF38X.png
User input: boy eating ice-cream
User: boy eating ice-cream
Assistant: Image saved to path\agentscope\runs\run_20240920-142208_rqsvhh\file\image_20240920-142559_2xGtUs.png
```
- Image
<img src="https://img.alicdn.com/imgextra/i3/O1CN01YoMRQP26ClOHM7Kh0_!!6000000007626-0-tps-512-512.jpg" alt="Horses on Mars" width="300" />
<img src="https://img.alicdn.com/imgextra/i1/O1CN01QTO8AU1HVxaQ2rFPx_!!6000000000764-0-tps-512-512.jpg" alt="boy eating ice-cream" width="300" />
@@ -0,0 +1,48 @@
# -*- coding: utf-8 -*-
"""conversation between user and stable-diffusion agent."""
import agentscope
from agentscope.agents import DialogAgent
from agentscope.agents.user_agent import UserAgent


def main() -> None:
    """A basic conversation demo"""

    agentscope.init(
        model_configs=[
            {
                "model_type": "sd_txt2img",
                "config_name": "sd",
                "options": {
                    "sd_model_checkpoint": "xxxxxx",
                    "CLIP_stop_at_last_layers": 2,
                },
                "generate_args": {
                    "steps": 50,
                    "n_iter": 1,
                },
            },
        ],
        project="txt2img-Agent Conversation",
        save_api_invoke=True,
    )

    # Init two agents
    dialog_agent = DialogAgent(
        name="Assistant",
        sys_prompt="dreamy",  # replace by your image style prompts
        model_config_name="sd",  # replace by your model config name
    )
    user_agent = UserAgent()

    # start the conversation between user and assistant
    msg = None
    while True:
        msg = user_agent(msg)
        if msg.content == "exit":
            break
        msg = dialog_agent(msg)


if __name__ == "__main__":
    main()
14 changes: 14 additions & 0 deletions scripts/stable_diffusion_webui/model_config.json
@@ -0,0 +1,14 @@
{
    "model_type": "sd_txt2img",
    "config_name": "stable_diffusion_txt2img",
    "host": "127.0.0.1:7862",
    "options": {
        "sd_model_checkpoint": "Anything-V3.0-pruned",
        "sd_lora": "add_detail",
        "CLIP_stop_at_last_layers": 2
    },
    "generate_args": {
        "steps": 50,
        "n_iter": 1
    }
}
34 changes: 34 additions & 0 deletions scripts/stable_diffusion_webui/sd_setup.sh
@@ -0,0 +1,34 @@
#!/bin/bash

# set VENV_DIR=%~dp0%venv
# call "%VENV_DIR%\Scripts\activate.bat"

# stable_diffusion_webui_path="YOUR_PATH_TO_STABLE_DIFFUSION_WEBUI"

port=7862

while getopts ":p:s:" opt
do
    # shellcheck disable=SC2220
    case $opt in
        p) port="$OPTARG";;
        s) stable_diffusion_webui_path="$OPTARG"
        ;;
    esac
done

stable_diffusion_webui_path=${stable_diffusion_webui_path%/}
launch_py_path="$stable_diffusion_webui_path/launch.py"

# Check if the launch.py script exists
if [[ ! -f "$launch_py_path" ]]; then
    echo "The launch.py script was not found at $launch_py_path."
    echo "Please ensure you have specified the correct path to your Stable Diffusion WebUI using the -s option."
    echo "Example: ./sd_setup.sh -s /path/to/your/stable-diffusion-webui"
    echo "Alternatively, you can set the path directly in the script."
    exit 1
fi

# Quote the path so directories containing spaces work; stop if cd fails
cd "$stable_diffusion_webui_path" || exit 1

python ./launch.py --api --port="$port"
3 changes: 3 additions & 0 deletions setup.py
@@ -90,6 +90,7 @@
extra_litellm_requires = ["litellm"]
extra_zhipuai_requires = ["zhipuai"]
extra_ollama_requires = ["ollama>=0.1.7"]
extra_sd_webuiapi_requires = ["webuiapi"]

# Full requires
extra_full_requires = (
@@ -102,6 +103,7 @@
    + extra_litellm_requires
    + extra_zhipuai_requires
    + extra_ollama_requires
    + extra_sd_webuiapi_requires
)

# For online workstation
@@ -140,6 +142,7 @@
"litellm": extra_litellm_requires,
"zhipuai": extra_zhipuai_requires,
"gemini": extra_gemini_requires,
"stablediffusion": extra_sd_webuiapi_requires,
# For service functions
"service": extra_service_requires,
# For distribution mode
4 changes: 4 additions & 0 deletions src/agentscope/models/__init__.py
@@ -41,6 +41,9 @@
from .yi_model import (
    YiChatWrapper,
)
from .stablediffusion_model import (
    StableDiffusionImageSynthesisWrapper,
)

__all__ = [
    "ModelWrapperBase",
@@ -64,6 +67,7 @@
    "ZhipuAIEmbeddingWrapper",
    "LiteLLMChatWrapper",
    "YiChatWrapper",
    "StableDiffusionImageSynthesisWrapper",
]

