
Are test_eval and test_llama the same data? #109

Open
Camellia-hz opened this issue Aug 9, 2024 · 6 comments

Comments
@Camellia-hz

Dear Author, hello. When I followed the data preparation in the challenge/readme documentation, I noticed that test_eval.json and test_llama.json are essentially the same data (both derived from test.json).
If I train my model on test_llama.json, generate output.json, and then evaluate it according to the documented method (using output.json and test_eval.json), wouldn't that be equivalent to assessing my model on its own training set? Is my understanding wrong?

@Camellia-hz
Author

@DevLinyan

@Camellia-hz
Author

@ChonghaoSima

@DevLinyan
Contributor

The files test_eval.json and test_llama.json contain the same data but in different formats. The evaluation can only be conducted using the specific format in test_eval.json.
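For anyone who wants to sanity-check this locally, here is a minimal sketch (not from the repo) that compares which samples the two files cover. The flat-list layout and the `id` field name are assumptions made for illustration only; adjust them to the actual schema used in the challenge data.

```python
import json

def load_ids(path):
    """Load a JSON file and collect its sample identifiers.

    Assumption: the file is a flat list of records, each carrying an "id"
    field. Adapt this to the real structure of test_eval.json / test_llama.json.
    """
    with open(path, "r") as f:
        data = json.load(f)
    return {str(item["id"]) for item in data}

llama_ids = load_ids("test_llama.json")
eval_ids = load_ids("test_eval.json")

# If the two files are the same data in different formats, the id sets
# should match exactly.
print("test_llama.json samples:", len(llama_ids))
print("test_eval.json samples:", len(eval_ids))
print("only in test_llama.json:", len(llama_ids - eval_ids))
print("only in test_eval.json:", len(eval_ids - llama_ids))
```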

@Camellia-hz
Author

Thanks for your reply. If so, is the evaluation valid? Since I am using test_llama.json to train my model, if I then use test_eval.json to evaluate it, doesn't that mean the training set and the evaluation set are the same?

@Camellia-hz
Author

@DevLinyan

@ChonghaoSima
Contributor

Not sure what you mean by "the training set and validation set are the same".

The evaluation is valid as long as you use our provided test file and submit to our official test server.
