
Labeled data is not the same size as the unlabeled data #42

Open
PirashanthR opened this issue Mar 24, 2020 · 0 comments
Labels: tokenization (Suspected tokenization error)

Comments

@PirashanthR

Hello,

I have been checking the released labeled data.
It looks like, for task 2, the unlabeled data does not match the size of the labeled data.

For instance, for the file task_2_t1_biology_0_0.deft, the labeled version has 519 lines while the unlabeled version has 475 lines.

Is there a reason for this difference?
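
For reference, here is a minimal sketch of how the mismatch can be checked across all files. It assumes the labeled and unlabeled versions share the same file name and live in two separate directories; the directory arguments and the function names are hypothetical, not part of the released data or tooling.

```python
from pathlib import Path


def count_lines(path):
    """Return the number of lines in a text file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def compare_line_counts(labeled_dir, unlabeled_dir, pattern="*.deft"):
    """List files whose labeled and unlabeled versions differ in line count.

    Assumption: both directories use identical file names for the same document.
    Returns a list of (file_name, labeled_count, unlabeled_count) tuples.
    """
    mismatches = []
    for labeled_path in sorted(Path(labeled_dir).glob(pattern)):
        unlabeled_path = Path(unlabeled_dir) / labeled_path.name
        if not unlabeled_path.exists():
            # No counterpart to compare against; skip rather than guess.
            continue
        labeled_count = count_lines(labeled_path)
        unlabeled_count = count_lines(unlabeled_path)
        if labeled_count != unlabeled_count:
            mismatches.append((labeled_path.name, labeled_count, unlabeled_count))
    return mismatches
```

Calling `compare_line_counts` with the two directory paths returns one tuple per mismatching file, which makes it easy to see whether the difference is limited to a few files or affects the whole task.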

Thanks

PirashanthR added the tokenization (Suspected tokenization error) label Mar 24, 2020