Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: replace Elasticsearch by Meilisearch #1141

Open
wants to merge 1 commit into
base: sumac
Choose a base branch
from

Conversation

regisb
Copy link
Contributor

@regisb regisb commented Oct 17, 2024

With this change, we get rid of Elasticsearch across all of Tutor. Instead, we run Meilisearch, which is much more lightweight in terms of memory usage. Obviously, this is a (very) breaking change. Indexing commands will be run during init, such that search should work as before.

After the edx-search PR is merged and the dependency is upgraded in edx-platform, we should remove the manual RUN pip install ... command.

With this change, we get rid of Elasticsearch across all of Tutor.
Instead, we run Meilisearch, which is much more lightweight in terms of
memory usage. Obviously, this is a (very) breaking change. Indexing
commands will be run during init, such that search should work as
before.

After the edx-search PR is merged and the dependency is upgraded in
edx-platform, we should remove the manual `RUN pip install ...` command.
@@ -0,0 +1 @@
- 💥[Feature] Replace Elasticsearch by Meilisearch. Elasticsearch was both a source of complexity and high resource usage. With this change, we no longer run Elasticsearch to perform common search queries across Open edX. This includes: course discovery, courseware search and studio search. Instead, we index all these documents in a Meilisearch instance, which is much more lightweight in terms of memory consumption. (by @regisb)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this wont work out of box with discovery, right? course-discovery uses ES heavily in various places. Merging this can break certain things on tutor-discovery.

Copy link

@pomegranited pomegranited Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed -- trying to run this with tutor-discovery enabled throws an Elasticsearch connection error on ./manage.py update_index --disable-change-limit (after working around some missing variables).

We're hoping to deprecate Course Discovery someday, so maybe we can move the elasticsearch dependency and setup to tutor-discovery? Or to its own plugin?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created overhangio/tutor-discovery#88 to add ES to tutor-discovery and unblock this PR.

@@ -235,7 +235,6 @@ def upgrade_obsolete(config: Config) -> None:
for name in [
"ACTIVATE_LMS",
"ACTIVATE_CMS",
"ACTIVATE_ELASTICSEARCH",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious: Why is this not replaced with ACTIVATE_MEILISEARCH?

@@ -1,10 +1,10 @@
from __future__ import annotations

from copy import deepcopy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this moved above? It was correct before in terms of import order.

@@ -37,6 +37,10 @@
"LOCATION": "staticfiles_lms",
}

# Enable search features
FEATURES["ENABLE_COURSE_DISCOVERY"] = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is course-discovery enabled by default in here?

Copy link

@pomegranited pomegranited Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed -- I think we only need these feature flags, not ENABLE_COURSE_DISCOVERY (ref):

  • ENABLE_COURSEWARE_SEARCH (enabled below)
  • ENABLE_COURSEWARE_INDEX (enabled in CMS)

@@ -32,19 +32,18 @@ services:
ports:
- "8001:8000"

meilisearch:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not behind an IF condition?

Copy link

@pomegranited pomegranited left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@regisb This is looking great -- just a couple of minor issues when I used this in my tutor dev env, along with these open PRs:

Am deploying a sandbox to test the k8s setup, will report back with any issues there.

@@ -78,4 +78,4 @@ spec:
requests:
storage: 1Gi

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original meilisearch plugin requests 5Gi storage -- is it ok to use that here too?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need a caddy alias added, I think in tutor/templates/local/docker-compose.prod.yml? cf https://github.com/open-craft/tutor-contrib-meilisearch/blob/main/tutor_meili/patches/local-docker-compose-caddy-aliases#L1

Comment on lines +21 to +22
./manage.py cms reindex_studio --experimental
./manage.py cms reindex_course --active

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should gate these too (though not sure about which variable to use to do this?)

Suggested change
./manage.py cms reindex_studio --experimental
./manage.py cms reindex_course --active
{% if ACTIVATE_MEILISEARCH %}
./manage.py cms reindex_studio --experimental
./manage.py cms reindex_course --active
{% endif %}

# Meilisearch connection parameters
MEILISEARCH_ENABLED = True
MEILISEARCH_URL = "{{ MEILISEARCH_URL }}"
MEILISEARCH_PUBLIC_URL = "{{ MEILISEARCH_PUBLIC_URL }}"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable overrides the MEILISEARCH_PUBLIC_URL set above in development.py, so tutor dev meilisearch's public URL is missing the :7700 port.

So I think it should be removed from here, and explicitly set in the lms + cms production.py like it is in development.py.

Suggested change
MEILISEARCH_PUBLIC_URL = "{{ MEILISEARCH_PUBLIC_URL }}"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

3 participants