Update README.md
zhanghy-sketchzh authored Aug 6, 2023
1 parent b26a757 commit 50253ee
Showing 1 changed file with 16 additions and 17 deletions.
33 changes: 16 additions & 17 deletions README.md
@@ -111,40 +111,40 @@ First we need to extract all the information from the above data such as QA, tab
This data is then expressed in natural language, e.g.:

```
-{ "instruction": "concert_singer contains tables such as stadium, singer, concert, singer_in_concert. Table stadium has columns such as stadium_id. location, name, capacity, highest, lowest, average. table stadium has columns such as stadium_id, location, name, capacity, highest, lowest, average. stadium_id is the primary key. table singer has columns such as singer_id, name, country, song_name, song_release_year name, song_release_year, age, is_male. singer_id is the primary key. table concert has columns such as concert_id, concert_name, theme, stadium_id. year. Table singer_in_concert has columns such as concert_id, singer_id. concert_id is the primary key. The year of concert is the foreign key of location of stadium. The stadium_id of singer_in_concert is the foreign key of name of singer. concert is the foreign key of concert_name of concert.".
-"input": "How many singers do we have?".
-"response": "concert_singer | select count(*) from singer"}
+{"instruction": "department_management contains tables such as department, head, management. Table department has columns such as department_id, name, creation, ranking, budget_in_billions, num_employees. department_id is the primary key. Table head has columns such as head_id, name, born_state, age. head_id is the primary key. Table management has columns such as department_id, head_id, temporary_acting. department_id is the primary key. The head_id of management is the foreign key of head_id of head. The department_id of management is the foreign key of department_id of department.",
+"input": "How many heads of the departments are older than 56 ?",
+"output": "select count(*) from head where age > 56"}
```
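
As a rough illustration of how schema information can be turned into the natural-language `instruction` shown above, here is a minimal sketch. It is not the code in `dbgpt_hub/utils/sql_data_process.py`: the helper name `schema_to_instruction` and the shape of its arguments are hypothetical, chosen only to reproduce the sentence pattern of the `department_management` example.

```python
from typing import Dict, List, Tuple


# Hypothetical helper, not taken from sql_data_process.py.
def schema_to_instruction(
    db_id: str,
    tables: Dict[str, List[str]],
    primary_keys: Dict[str, str],
    foreign_keys: List[Tuple[str, str, str, str]],
) -> str:
    """Describe a database schema using the sentence pattern shown above."""
    # Joining a dict joins its keys (the table names) in insertion order.
    parts = [f"{db_id} contains tables such as {', '.join(tables)}."]
    for table, columns in tables.items():
        parts.append(f"Table {table} has columns such as {', '.join(columns)}.")
        if table in primary_keys:
            parts.append(f"{primary_keys[table]} is the primary key.")
    # Each foreign key is (table, column, referenced_table, referenced_column).
    for table, column, ref_table, ref_column in foreign_keys:
        parts.append(
            f"The {column} of {table} is the foreign key of "
            f"{ref_column} of {ref_table}."
        )
    return " ".join(parts)


# Assumed input structures for the department_management example.
record = {
    "instruction": schema_to_instruction(
        "department_management",
        {
            "department": ["department_id", "name", "creation", "ranking",
                           "budget_in_billions", "num_employees"],
            "head": ["head_id", "name", "born_state", "age"],
            "management": ["department_id", "head_id", "temporary_acting"],
        },
        {"department": "department_id", "head": "head_id",
         "management": "department_id"},
        [("management", "head_id", "head", "head_id"),
         ("management", "department_id", "department", "department_id")],
    ),
    "input": "How many heads of the departments are older than 56 ?",
    "output": "select count(*) from head where age > 56",
}
```

Keeping the schema description, the question, and the target SQL in separate fields lets the same record be reused with different prompt templates.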

The code implementation of the above data pre-processing section is as follows:

```bash
-## 生成train数据
+## Generate train data
python dbgpt_hub/utils/sql_data_process.py

-## 生成dev数据
+## Generate dev data
python dbgpt_hub/utils/sql_data_process.py \
--data_filepaths data/spider/dev.json \
--output_file dev_sql.json \

```

When fine-tuning the model, we also customize the prompt dict to optimize the input:

``` python
SQL_PROMPT_DICT = {
"prompt_input": (
-"I want you to act as a SQL terminal in front of an example database. "
-"Below is an instruction that describes a task, Write a response that appropriately completes the request.\n\n"
-"##Instruction:\n{instruction}\n\n###Input:\n{input}\n\n###Response: "
-).
+"I want you to act as a SQL terminal in front of an example database, \
+you need only to return the sql command to me.Below is an instruction that describes a task, \
+Write a response that appropriately completes the request.\n\n \
+The instruction is {instruction}, So please tell me {input}, ###Response:"
+),
"prompt_no_input": (
-"I want you to act as a SQL terminal in front of an example database. "
-"Below is an instruction that describes a task, Write a response that appropriately completes the request.\n\n"
-"####Instruction:\n{instruction}\n\n### Response: "
-).
+"I want you to act as a SQL terminal in front of an example database, \
+you need only to return the sql command to me.Below is an instruction that describes a task, \
+Write a response that appropriately completes the request.\n\n \
+The instruction is {instruction}, ###Response:"
+),
}

```
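
A prompt dict like the one above is typically filled in with `str.format`, choosing the template by whether the example has an `input` field. The following is a minimal sketch under that assumption; `PROMPT_TEMPLATES` holds shortened stand-in wording and `build_prompt` is a hypothetical helper, neither taken from the repository.

```python
# Shortened stand-in templates (assumption): the wording is illustrative only
# and is not the exact text of SQL_PROMPT_DICT.
PROMPT_TEMPLATES = {
    "prompt_input": (
        "You are a SQL terminal. Return only the SQL command.\n\n"
        "The instruction is {instruction}, So please tell me {input}, ###Response:"
    ),
    "prompt_no_input": (
        "You are a SQL terminal. Return only the SQL command.\n\n"
        "The instruction is {instruction}, ###Response:"
    ),
}


def build_prompt(example: dict) -> str:
    """Pick a template based on whether the example has a non-empty input."""
    has_input = bool(example.get("input"))
    template = PROMPT_TEMPLATES["prompt_input" if has_input else "prompt_no_input"]
    # str.format ignores keyword arguments the template does not use,
    # so passing `input` to the no-input template is harmless.
    return template.format(
        instruction=example["instruction"],
        input=example.get("input", ""),
    )


prompt = build_prompt(
    {
        "instruction": "Table head has columns such as head_id, name, age.",
        "input": "How many heads of the departments are older than 56 ?",
    }
)
```

Only the template is parsed for `{...}` placeholders, so braces inside the schema description or the question are inserted verbatim.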

### 3.3. Model fine-tuning
@@ -181,8 +181,7 @@ To evaluate model performance on the dataset, default is spider dataset.
Run the following command:

```bash
-cd eval
-python evaluation.py --plug_value --input Your_model_pred_file
+python eval/evaluation.py --plug_value --input
```

## 4. RoadMap