Add EOS token when concatenating documents in preprocessing loop #91

Open
siddk opened this issue Aug 31, 2021 · 1 comment
Assignees
Labels
enhancement New feature or request first issue Good first issue for familiarizing yourself with the codebase

Comments

@siddk
Contributor

siddk commented Aug 31, 2021

As per #90, we currently do not add an EOS separator between documents when concatenating them in the preprocessing loop. We should add one to facilitate unprompted generation in the future.
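For illustration only, here is a minimal sketch of what appending an EOS separator during concatenation could look like. It is not the project's actual preprocessing code: the function names `concat_with_eos` and `chunk`, the toy token ids, and the `EOS_ID` value (50256, the id of GPT-2's `<|endoftext|>` token) are all assumptions chosen for the example.

```python
from itertools import chain
from typing import Iterable, List

# Assumption: GPT-2's <|endoftext|> id. Use your own tokenizer's EOS id in practice.
EOS_ID = 50256


def concat_with_eos(docs: Iterable[List[int]], eos_id: int = EOS_ID) -> List[int]:
    """Flatten tokenized documents into one stream, appending EOS after each one."""
    return list(chain.from_iterable(doc + [eos_id] for doc in docs))


def chunk(stream: List[int], block_size: int) -> List[List[int]]:
    """Split the stream into full blocks, dropping the trailing remainder."""
    n_full = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_full)]


# Hypothetical toy input: three already-tokenized documents.
docs = [[1, 2, 3], [4, 5], [6]]
stream = concat_with_eos(docs)
blocks = chunk(stream, block_size=4)
```

With the separator in place, every document boundary is visible to the model even when a fixed-length block spans several documents.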

In the process, we should probably also add some strict tests that check preprocessing invariants like this one, among others.
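One possible shape for such a test is sketched below, assuming the preprocessing output can be viewed as a flat list of token ids in which every document is followed by exactly one EOS. The helpers `split_on_eos` and `check_eos_invariants` are hypothetical names, and the toy EOS id of 0 is chosen only to keep the example readable.

```python
from typing import List


def split_on_eos(stream: List[int], eos_id: int) -> List[List[int]]:
    """Recover documents from a flat stream in which each document ends with EOS."""
    docs: List[List[int]] = []
    current: List[int] = []
    for tok in stream:
        if tok == eos_id:
            docs.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        # Tokens left over after the last EOS mean the final document is unterminated.
        raise ValueError("stream does not end with EOS")
    return docs


def check_eos_invariants(docs: List[List[int]], stream: List[int], eos_id: int) -> None:
    """Assert that `stream` is exactly `docs` concatenated with one EOS after each."""
    # Raw documents must not already contain the separator.
    assert all(eos_id not in doc for doc in docs)
    # One separator per document, and no tokens gained or lost.
    assert stream.count(eos_id) == len(docs)
    assert len(stream) == sum(len(doc) for doc in docs) + len(docs)
    # Splitting on EOS round-trips back to the original documents.
    assert split_on_eos(stream, eos_id) == docs


# Hypothetical toy data with EOS id 0.
example_docs = [[1, 2, 3], [4, 5], [6]]
example_stream = [1, 2, 3, 0, 4, 5, 0, 6, 0]
check_eos_invariants(example_docs, example_stream, eos_id=0)
```

The round-trip check is the strictest of the four: it fails if a separator is missing, duplicated, or misplaced, not just if the counts are off.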

@siddk siddk added enhancement New feature or request first issue Good first issue for familiarizing yourself with the codebase labels Aug 31, 2021
@dlwh dlwh added this to the Mistral V2 milestone Mar 10, 2022
@dlwh
Member

dlwh commented Jun 6, 2022

I missed this issue. I propose we punt for now, but it's easy to fix if we want. cc @J38

@dlwh dlwh removed this from the Mistral V2 milestone Jul 18, 2022
@dlwh dlwh self-assigned this Jul 18, 2022