Are these processed data selected by `Llama2-7b`? #36

cafeii · 2024-10-26T15:24:12Z

Hello, thanks for your nice work. While reproducing it, I found differences between my `selected_data/mmlu/top_p0.05.jsonl` and the reference processed data `mmlu-chat_adam_sim_trainp0.05_seed3_p0.05.jsonl`. My selected file contains 13525 examples and the reference contains 13533, and only 11.6% of them match. I suspect the discrepancy comes from my use of mistral-7b for data selection, although I would not expect that alone to cause such a large difference. Was the processed data selected using Llama2-7b? If so, I will rerun the selection with that model and try to reproduce the same result.
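For anyone who wants to check the overlap themselves, here is a minimal sketch of one way to compare two JSONL selections. It assumes two examples count as matching when their full JSON records are identical after key sorting, which may not be how the 11.6% figure above was computed. The function names `load_keys` and `overlap_stats` are hypothetical and are not part of the repository.

```python
import json


def load_keys(path):
    """Return a set of canonical keys, one per non-empty JSONL line."""
    keys = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                # Assumption: the whole record identifies an example.
                # Sorting keys makes the comparison independent of field order.
                keys.add(json.dumps(json.loads(line), sort_keys=True))
    return keys


def overlap_stats(mine_path, ref_path):
    """Count unique examples in each file and how many appear in both."""
    mine = load_keys(mine_path)
    ref = load_keys(ref_path)
    shared = mine & ref
    return {
        "mine": len(mine),
        "ref": len(ref),
        "shared": len(shared),
        # Fraction of the reference selection that also appears in mine.
        "frac_of_ref": len(shared) / len(ref) if ref else 0.0,
    }


if __name__ == "__main__":
    stats = overlap_stats(
        "selected_data/mmlu/top_p0.05.jsonl",
        "mmlu-chat_adam_sim_trainp0.05_seed3_p0.05.jsonl",
    )
    print(stats)
```

Because the keys are stored in sets, duplicate lines within a file are counted once, so the reported sizes can be smaller than the raw line counts. If the records carry a stable identifier field, comparing on that field instead of the full record would be more robust to formatting differences.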
