Commit
Apply filters to a Hugging Face dataset to avoid repeating all variants. (#719)

The only issue for now is that `regex` is a regular expression while `includes` is a glob pattern, so I use heuristics to convert from one to the other. I don't think this is a problem for Hugging Face datasets, since we control the form they take, but a generic conversion would be challenging. Ideally, we would use either regular expressions or glob patterns everywhere.
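To illustrate why the conversion is only heuristic, here is a minimal Python sketch. It is not the code from this commit: the function names `glob_to_regex` and `simple_regex_to_glob` are hypothetical, and the regex subset handled (anchors, `.*`, `.`, and escaped punctuation) is an assumption about what "controlled" patterns might look like. Glob to regex is well defined and the standard library covers it; regex to glob only works for a small subset, so the sketch rejects anything else.

```python
import fnmatch
import re


def glob_to_regex(pattern: str) -> str:
    """Glob -> regex is the well-defined direction; fnmatch handles it."""
    return fnmatch.translate(pattern)


def simple_regex_to_glob(regex: str) -> str:
    """Heuristic regex -> glob for a small subset (hypothetical helper).

    Supported: leading '^', trailing '$', '.*', '.', escaped punctuation,
    and plain literal characters. Anything else raises ValueError.
    """
    anchored_start = regex.startswith("^")
    anchored_end = False
    out = []
    i = 1 if anchored_start else 0
    while i < len(regex):
        ch = regex[i]
        if ch == "\\":
            if i + 1 >= len(regex):
                raise ValueError("dangling backslash")
            lit = regex[i + 1]
            if lit.isalnum():
                # \d, \w, \b, ... are classes or assertions, not literals.
                raise ValueError(f"unsupported escape: \\{lit}")
            # Glob-special characters must be bracketed to stay literal.
            out.append(f"[{lit}]" if lit in "*?[" else lit)
            i += 2
        elif ch == ".":
            if i + 1 < len(regex) and regex[i + 1] == "*":
                out.append("*")  # '.*' -> any run of characters
                i += 2
            else:
                out.append("?")  # '.' -> exactly one character
                i += 1
        elif ch == "$" and i == len(regex) - 1:
            anchored_end = True
            i += 1
        elif ch in "^$*+?()[]{}|":
            raise ValueError(f"unsupported regex construct: {ch!r}")
        else:
            out.append(ch)
            i += 1
    # An unanchored regex can match anywhere, so pad the glob with '*'.
    prefix = "" if anchored_start else "*"
    suffix = "" if anchored_end else "*"
    return prefix + "".join(out) + suffix


if __name__ == "__main__":
    glob = simple_regex_to_glob(r"^data/.*\.parquet$")
    print(glob)
    print(fnmatch.fnmatchcase("data/train-0000.parquet", glob))
    print(re.match(glob_to_regex(glob), "data/train-0000.parquet") is not None)
```

Alternation, character classes, and quantifiers have no direct glob equivalent, which is why the sketch raises an error for them instead of guessing. Using a single pattern language everywhere would remove the need for this step.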