Vicuna-13B results #24

Open · yzc111 opened this issue Mar 8, 2024 · 12 comments

yzc111 commented Mar 8, 2024

Hello, when I reproduce the results on Vicuna-13B and Llama-2-7B, I cannot get any model output, and the code prints the warning: "Prompt exceeds max length and return an empty string as answer. If this happens too many times, it is suggested to make the prompt shorter". How should I deal with this? Thank you~

gaotianyu1350 (Member) commented

Hi,

Which config are you using? The Vicuna and Llama-2 models have a 4k context window, which limits how many passages you can include in the prompt.
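For reference, here is a minimal sketch (not part of the ALCE code; the model id and file name are illustrative assumptions) of how to check whether a rendered prompt fits in that window:

```python
# Minimal sketch, not ALCE code: count a rendered prompt's tokens
# against the ~4k context window. The model id and file name are
# illustrative assumptions.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("lmsys/vicuna-13b-v1.5")

with open("rendered_prompt.txt") as f:
    prompt = f.read()

n_tokens = len(tokenizer(prompt)["input_ids"])
max_len = 4096  # Vicuna v1.5 / Llama-2 context window

if n_tokens >= max_len:
    print(f"{n_tokens} tokens: exceeds the {max_len}-token window; "
          "reduce shot/ndoc or use a shorter (light) instruction.")
else:
    print(f"{n_tokens} tokens: fits, with {max_len - n_tokens} tokens to spare.")
```

If the count is near the limit, lowering `ndoc` or switching to the light-instruction prompt (shorter instructions) is usually enough.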


yzc111 commented Mar 12, 2024

Hi, thank you for your reply. The config is 2-shot with ndoc: 3.

gaotianyu1350 (Member) commented

Did you use the "light instruction" version as well?


yzc111 commented Mar 12, 2024

No, I just used the default setting.

gaotianyu1350 (Member) commented

Can you try this config (but change the model name): https://github.com/princeton-nlp/ALCE/blob/main/configs/asqa_alpaca-7b_shot2_ndoc3_gtr_light_inst.yaml


yzc111 commented Mar 12, 2024

OK, thanks~


yzc111 commented Mar 13, 2024

Another question: when I use this setting to reproduce the result

```yaml
prompt_file: prompts/asqa_light_inst.json
eval_file: data/asqa_eval_gtr_top100.json
shot: 2
ndoc: 3
dataset_name: asqa
tag: gtr_light_inst
model: vicuna-13b
temperature: 1.0
top_p: 0.95
```

I get QA-EM = 19.7 and MAUVE = 70.7, while the paper reports EM = 31.9 and MAUVE = 82.6. Are there any different settings in the config file?

howard-yen (Collaborator) commented

Note that there is a difference between EM and QA-EM; we report EM in the paper. Can you post the full output or the .score file? Can you also post a link to the Vicuna model you are using? There are a couple of different versions with different performance.


yzc111 commented Mar 18, 2024

Hi, this is the config we used to reproduce the result on Vicuna-13B:

```yaml
prompt_file: prompts/asqa_light_inst.json
eval_file: data/asqa_eval_gtr_top100.json
shot: 2
ndoc: 3
dataset_name: asqa
tag: gtr_light_inst
model: /work/models/vicuna-13b
temperature: 1.0
top_p: 0.95
```


yzc111 commented Apr 2, 2024

So, how can I get the EM score reported in your paper?

gaotianyu1350 (Member) commented

That is "str_em".
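If it helps, you can read it out of the `.score` file directly (a minimal sketch, assuming the `.score` file is the JSON the eval script writes; the file name below is an illustrative assumption):

```python
# Minimal sketch: read the paper's EM ("str_em") from the .score
# file, assuming it is JSON written by the eval script. The file
# name here is an illustrative assumption.
import json

with open("result/asqa-vicuna-13b-gtr_light_inst-shot2-ndoc3.json.score") as f:
    scores = json.load(f)

print("str_em:", scores["str_em"])  # the EM column reported in the paper
```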


yzc111 commented Apr 2, 2024

Fine, thanks!
