Enable stopwords #13

Open
andrew-morrison opened this issue Feb 26, 2018 · 3 comments

@andrew-morrison
Contributor

While working on fihristorg/fihrist-mss#28 I've realized that, although stopwords files exist (for English, Arabic, and other languages) in the standard Solr installation, they haven't been enabled in the full text search query parser. So, for example, searching for History of the Afghans (without quotes) returns 19,266 hits, whereas History Afghans returns 795. That is still a lot, but without quotes Solr does an implicit OR, so every record containing "of" or "the" counts as a match.
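To illustrate why the unquoted query matches so many records, here is a toy sketch in Python. It is not how Solr is implemented; the five documents, the two-word stopword list, and the function names are all made up for the example. It only models an implicit-OR keyword match, with and without stopword removal.

```python
# Toy model of an implicit-OR keyword search (hypothetical data, not Solr itself).
STOPWORDS = {"of", "the"}  # assumed tiny list; real stopwords files are much longer

DOCS = {
    1: "history of the afghans",
    2: "the book of kings",
    3: "a history of persia",
    4: "commentary on the quran",
    5: "poems",
}


def tokenize(text, stopwords=frozenset()):
    """Lowercase, split on whitespace, and drop any stopwords."""
    return [tok for tok in text.lower().split() if tok not in stopwords]


def search_or(query, stopwords=frozenset()):
    """Return ids of documents sharing at least one term with the query."""
    terms = set(tokenize(query, stopwords))
    return sorted(
        doc_id
        for doc_id, text in DOCS.items()
        if terms & set(tokenize(text, stopwords))
    )


# Without stopwords, "of" and "the" alone are enough to produce a hit.
without_stop = search_or("History of the Afghans")
# With stopwords, only "history" and "afghans" are searched for.
with_stop = search_or("History of the Afghans", STOPWORDS)

print(without_stop)
print(with_stop)
```

In this toy collection the unfiltered query hits every record containing "of" or "the", while the filtered one only hits records containing "history" or "afghans", which is the same effect as the 19,266 versus 795 difference above.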

Most people seriously looking for things would quickly figure out they need to use either quotes or AND, but I think this is worth implementing for more casual users.

@holfordm: This change would be across all catalogues. Do you have any objections/comments? Solr doesn't have a built-in stopwords list for Latin, but if you want we could import one, such as this.

@holfordm
Collaborator

Sounds like a good idea. One question - would this affect a search like "History of the Afghans" (with quotes), where you were trying to match an exact phrase?

@andrew-morrison
Contributor Author

I think, if it is configured correctly, a phrase-search like "History of the Afghans" should still find the same matches. It might find a few more than before, if there are any documents containing something like "History for the Afghans". But we'll make sure to test it.

@ahankinson
Contributor

@andrew-morrison we need to make sure we set 'enablePositionIncrements' on the stop filter for the field, so that removed stopwords still leave a gap in the token positions and phrase queries keep working.

See:
https://stackoverflow.com/a/2683286/8760061
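As a rough illustration of what position increments do for phrase searches, here is a small Python sketch. It is a simplified model under stated assumptions, not Solr's or Lucene's actual code: the stopword list (including "for") and the helper names `analyze` and `phrase_match` are hypothetical. Whether the attribute has to be set explicitly may depend on the Solr version in use, so that is worth checking against its documentation.

```python
# Simplified model of phrase matching after stopword removal (not Lucene's code).
STOPWORDS = {"of", "the", "for"}  # assumed list for this example


def analyze(text, keep_positions=True):
    """Return (term, position) pairs with stopwords removed.

    With keep_positions=True a removed stopword still advances the position
    counter, leaving a gap; with False the remaining terms close up.
    """
    tokens = []
    pos = 0
    for tok in text.lower().split():
        if tok in STOPWORDS:
            if keep_positions:
                pos += 1  # leave a hole where the stopword was
            continue
        tokens.append((tok, pos))
        pos += 1
    return tokens


def phrase_match(query, doc, keep_positions=True):
    """True if the query terms occur in doc at the same relative positions."""
    q = analyze(query, keep_positions)
    d = analyze(doc, keep_positions)
    if not q:
        return False
    doc_terms = set(d)
    first_term, first_pos = q[0]
    for term, pos in d:
        if term != first_term:
            continue
        offset = pos - first_pos
        if all((t, p + offset) in doc_terms for t, p in q):
            return True
    return False


query = "History of the Afghans"
print(phrase_match(query, "History of the Afghans"))
print(phrase_match(query, "History for the Afghans"))
print(phrase_match(query, "History Afghans"))
print(phrase_match(query, "History Afghans", keep_positions=False))
```

With gaps preserved, the phrase still matches the original title and also "History for the Afghans" (two stopwords in the same slots), but not "History Afghans". Without the gaps, "History Afghans" would match as well, which is the looser behaviour the setting is meant to avoid.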
