Research vespa.io as an alternative to Solr #122

Open
DiegoPino opened this issue Aug 13, 2021 · 3 comments

Comments

@DiegoPino
Member

What?

Vespa is a different fast search index with AI integration. It's OSS, it scales well, and it has a very formal and stable API.
I would like to take this kind of exploration on as a project. We are very comfortable with Solr, but it's good not to depend on a single piece of the stack and to look straight into the Future's eyes.

What is needed.

  • Read up on and test a standalone deployment
  • Initial load (no code yet) of documents, using Vespa as a secondary to a running Solr (e.g. a cron-driven update using our existing collections)
  • Build a small IIIF Search API wrapper app on top of Vespa (so we can integrate on that side of things)
  • Collaborate/help/code-ask-learn with/from the https://github.com/dbmdz team. We might as well (respectfully) have a chat with @jbaiter about his general impressions of vespa.io, so we can slowly do some porting/paralleling of their amazing (and core to us) Solr Highlight.
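
As a rough illustration of the "initial load" item above, here is a minimal Python sketch of how documents already returned by Solr could be reshaped into put operations for Vespa's /document/v1 HTTP API. The namespace `archipelago`, the document type `ado`, and the assumption that the Solr field names match fields declared in a Vespa schema are all hypothetical; nothing here has been tested against a real deployment, and the function only builds the request path and body without sending anything.

```python
import json
from urllib.parse import quote

# Solr bookkeeping fields that should not be copied into Vespa.
SOLR_INTERNAL_FIELDS = {"_version_", "_root_"}


def solr_doc_to_vespa_put(doc, namespace="archipelago", doctype="ado"):
    """Turn one Solr result document into a (path, body) pair for /document/v1.

    Assumptions (hypothetical): the Solr "id" becomes the Vespa document id,
    and every remaining Solr field exists under the same name in the Vespa schema.
    """
    doc_id = str(doc["id"])
    fields = {
        name: value
        for name, value in doc.items()
        if name != "id" and name not in SOLR_INTERNAL_FIELDS
    }
    path = "/document/v1/{}/{}/docid/{}".format(
        quote(namespace, safe=""), quote(doctype, safe=""), quote(doc_id, safe="")
    )
    # Vespa expects the document fields wrapped in a "fields" object.
    return path, json.dumps({"fields": fields})


# Example with a made-up Solr document.
path, body = solr_doc_to_vespa_put({"id": "node/12", "title": "Map", "_version_": 1})
```

A cron job could then page through the existing Solr collection and POST each body to its path on the Vespa container, which keeps Solr as the primary index while Vespa is being evaluated.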

If all works fine up to this point, and we as a group feel there is a benefit for the community,

  • Build a Drupal module for Vespa that, as with the Solr one, wraps the Drupal search_api. This may not even be needed (or, if needed, may be a less intense task), since we could simply override the actual Solarium interaction with our own. But this is still kinda HUGE
  • Find ways to use the AI capabilities of vespa.io inside Archipelago

Marking this as Future Tasks, but I will keep an eye on this after the next release. Thanks!

@giancarlobi @mbennett-uoe @alliomeria @dmer

@DiegoPino DiegoPino added enhancement New feature or request Docker Containers All about those tiny little critters Service Settings Docker settings, Service Settings. What allows us to run the thing Drupal9 Drupal9 is the new Drupal8 which was the new Drupal7 wich was the... Deployment Strategies What every vendor would love to Copy and pasta Future tigresses and bears Community work and Archipelago Travel Discovery Find what is in your soul AI - Machine Learning Ok, I have no description for this labels Aug 13, 2021
@DiegoPino DiegoPino added this to the 2.0.0 milestone Aug 13, 2021
@DiegoPino DiegoPino self-assigned this Aug 13, 2021
@jbaiter

jbaiter commented Aug 13, 2021

Collaborate/help/code-ask-learn with/from the @dbmdz team. We might as well (respectfully) have a chat with @jbaiter about his general impressions of vespa.io, so we can slowly do some porting/paralleling of their amazing (and core to us) Solr Highlight.

We've looked into using Vespa for image similarity search; I'll refer you to @stefan-it, who did the research. I've not looked into Vespa's highlighting implementation or its general implementation yet, so I can't tell you whether a port of the OCR highlighting stuff would be feasible.

@DiegoPino
Member Author

@jbaiter thanks. We will have to do some internal testing; integration in the ecosystem will probably start at the IIIF Search API level (wrapper) before going deeper into our code. The developer documentation looks promising (shame on me, I only spent 10 minutes reading it, but it looked clear), and the processors and query plugins are well documented. I appreciate your comments on this.
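
To make the wrapper idea a bit more concrete, here is a minimal Python sketch that maps a Vespa query result (the usual `root` / `children` / `fields` JSON shape) onto an IIIF Search API 1.0 annotation list. The hit fields `canvas`, `text`, `x`, `y`, `w` and `h` are hypothetical and would depend on whatever schema we end up defining, and the `#hit-N` annotation ids are just placeholders.

```python
IIIF_SEARCH_CONTEXT = "http://iiif.io/api/search/1/context.json"


def vespa_hits_to_iiif(vespa_response, search_uri):
    """Map Vespa search hits to an IIIF Search API 1.0 sc:AnnotationList.

    Assumes (hypothetically) that each hit carries a canvas URI, the matched
    text, and a bounding box in the fields canvas, text, x, y, w, h.
    """
    hits = vespa_response.get("root", {}).get("children", [])
    resources = []
    for index, hit in enumerate(hits):
        fields = hit["fields"]
        xywh = f'{fields["x"]},{fields["y"]},{fields["w"]},{fields["h"]}'
        resources.append({
            "@id": f"{search_uri}#hit-{index}",  # placeholder annotation id
            "@type": "oa:Annotation",
            "motivation": "sc:painting",
            "resource": {"@type": "cnt:ContentAsText", "chars": fields["text"]},
            "on": f'{fields["canvas"]}#xywh={xywh}',
        })
    return {
        "@context": IIIF_SEARCH_CONTEXT,
        "@id": search_uri,
        "@type": "sc:AnnotationList",
        "resources": resources,
    }


# Example with a made-up Vespa response containing a single hit.
sample = {"root": {"children": [{"fields": {
    "canvas": "https://example.org/canvas/1", "text": "vespa",
    "x": 10, "y": 20, "w": 30, "h": 40}}]}}
result = vespa_hits_to_iiif(sample, "https://example.org/search?q=vespa")
```

Keeping the mapping as a pure function means the same wrapper could sit in front of Solr or Vespa while we compare them.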

@DiegoPino
Member Author

Interestingly, Vespa has a special field type with an even more interesting API: annotations. Vespa also has structured data and maps as types (not indexable, but still usable in returned results, and I can see how a JSON snippet might fit there).

Specifically, annotations allow a "label": for a given structured piece of content, e.g. XML (the example is HTML, so pretty close), you can "tag" parts of it and its content. Now the fun part: each annotation can also carry variable values (who said x,y in an OCR? Or IIIF Annotations?)

https://docs.vespa.ai/en/annotations.html
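
Just to picture the idea, here is a tiny Python sketch of a labelled span over a piece of text that also carries extra values such as OCR coordinates. This is not Vespa's annotation API (see the link above for that); the class `SpanAnnotation` and its fields are hypothetical and only model the concept.

```python
from dataclasses import dataclass, field


@dataclass
class SpanAnnotation:
    """Hypothetical model: a label attached to a character span, plus values."""
    start: int            # offset of the first character of the span
    length: int           # number of characters covered
    label: str            # e.g. "word" or "line"
    values: dict = field(default_factory=dict)  # e.g. OCR x,y coordinates

    def text(self, content):
        """Return the part of content that this annotation covers."""
        return content[self.start:self.start + self.length]


# Example: tag the first word of an OCR line and keep its position on the page.
ocr_text = "Archipelago Commons"
word = SpanAnnotation(start=0, length=11, label="word", values={"x": 120, "y": 48})
```

Here `word.text(ocr_text)` gives back the tagged word, and `word.values` is where the x,y (or a IIIF region) would travel along with it.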


2 participants