Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/ogc 508 replace elastic search by postgres v2 #1357

Draft
wants to merge 49 commits into
base: master
Choose a base branch
from

Conversation

Tschuppi81
Copy link
Contributor

Please fill in the commit message below and work through the checklist. You can delete parts that are not needed, e.g. the optional description, the link to a ticket or irrelevant options of the checklist.

Commit message

:

TYPE: <Feature|Bugfix>
LINK:
HINT:

Checklist

  • I have performed a self-review of my code
  • I considered adding a reviewer
  • I have added an upgrade hint such as data migration commands to be run
  • I have updated the PO files
  • I have added new modules to the docs
  • I made changes/features for both org and town6
  • I have updated the election_day API docs
  • I have tested my code thoroughly by hand
    • I have tested styling changes/features on different browsers
    • I have tested javascript changes/features on different browsers
    • I have tested database upgrades
    • I have tested sending mails
    • I have tested building the documentation
  • I have added tests for my changes/features
    • I am using freezegun for tests depending on date and times
    • I considered using browser tests für javascript changes/features
    • I have added/updated jest tests for d3rendering (election_day, swissvotes)

Copy link

linear bot commented Jun 5, 2024

Copy link

codecov bot commented Jun 6, 2024

Codecov Report

Attention: Patch coverage is 91.29412% with 37 lines in your changes missing coverage. Please review.

Project coverage is 89.46%. Comparing base (bdc57ed) to head (1b0e741).

Files with missing lines Patch % Lines
src/onegov/org/models/search.py 87.12% 17 Missing ⚠️
src/onegov/org/views/search.py 68.18% 7 Missing ⚠️
src/onegov/fsi/models/course_event.py 63.63% 4 Missing ⚠️
src/onegov/fsi/models/course_attendee.py 87.50% 2 Missing ⚠️
src/onegov/fsi/views/search.py 80.00% 2 Missing ⚠️
src/onegov/agency/models/person.py 92.85% 1 Missing ⚠️
src/onegov/event/models/occurrence.py 83.33% 1 Missing ⚠️
src/onegov/feriennet/models/activity.py 85.71% 1 Missing ⚠️
src/onegov/search/cli.py 0.00% 1 Missing ⚠️
src/onegov/ticket/model.py 92.30% 1 Missing ⚠️
Additional details and impacted files
Files with missing lines Coverage Δ
src/onegov/activity/models/attendee.py 98.66% <100.00%> (+0.03%) ⬆️
src/onegov/agency/models/agency.py 96.15% <100.00%> (+0.26%) ⬆️
src/onegov/agency/models/membership.py 97.14% <100.00%> (-2.86%) ⬇️
src/onegov/agency/views/search.py 100.00% <100.00%> (ø)
src/onegov/core/elements.py 97.27% <100.00%> (+0.02%) ⬆️
src/onegov/directory/models/directory.py 93.44% <100.00%> (+0.36%) ⬆️
src/onegov/directory/models/directory_entry.py 98.50% <100.00%> (+3.26%) ⬆️
src/onegov/event/models/event.py 95.37% <100.00%> (+0.02%) ⬆️
src/onegov/file/models/file.py 93.79% <100.00%> (ø)
src/onegov/form/models/submission.py 94.51% <100.00%> (ø)
... and 44 more

... and 4 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bdc57ed...1b0e741. Read the comment docs.

@Tschuppi81
Copy link
Contributor Author

@Daverball I am working on the tests in test_views_search.py and can't make them pass. I narrowed the issue down to the hybrid_property resp. expression of GeneralFile and News.
While debugging for News I see that es_public is True for all the three elements in the DB. Same for access and published. Once I apply the query.filter(model.es_public == True) no elements are left.
Screenshot from 2024-10-14 06-29-53
Any idea why my filter statement or my hybrid expression does not work?

@Daverball
Copy link
Member

@Daverball I am working on the tests in test_views_search.py and can't make them pass. I narrowed the issue down to the hybrid_property resp. expression of GeneralFile and News. While debugging for News I see that es_public is True for all the three elements in the DB. Same for access and published. Once I apply the query.filter(model.es_public == True) no elements are left. Screenshot from 2024-10-14 06-29-53 Any idea why my filter statement or my hybrid expression does not work?

You're relying a lot on implicit type conversions here and hoping that SQLAlchemy is doing the right thing for you. You could try .is_(True) rather than == True, which is the recommended way for boolean and none checks in SQLAlchemy anyways. Although you can probably just filter on the column itself, since a boolean column can be used as a filter so filter(model.es_public) should technically work as well.

Either way to properly debug this, you would have to look at the final SQL statement SQLAlchemy generates for you. Sometimes you will just have to manually define an @{property_name}.expression for hybrid properties to get something correct. There's already several hybrid properties that define a manual expression, like published.

@Daverball
Copy link
Member

Also as a side note query.count is quite inefficient for checking whether or not you have any results, since Postgres will have to find and count all the entries. It's usually better to do if session.query(query.exists()).scalar(): which will create an EXISTS subquery, which only has to find a single row that satisfies the predicate before returning a result.

@@ -323,7 +323,9 @@ def generic_search(self) -> list['Searchable']:
query = (query.filter(model.fts_idx.op('@@')(ts_query))
.add_columns(rank_expression))

results.extend(list(query.all()))
res = list(query.all())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

query.all() already returns a list so you don't need to convert to a list.

).json
'/search/suggest?q=89').json

# postgres
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same is valid for this test. Any ideas @Daverball?

@@ -152,7 +166,7 @@ def role(self) -> str:
assert self.user is not None
return self.user.role

@property
@hybrid_property
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I converted email to a hybrid_property the test test_add_edit_external_attendee fails with TypeError: 'email' is an invalid keyword argument for CourseAttendee.
If I revert back the test passed but the search-postgres?q=martin fails due to the missing hybrid property...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very possible that setter has to be defined before the expression. Although I'm still not sure if the auto-generated __init__ for SQLAlchemy will contain settable hybrid properties, it might not.

But this whole thing seems to be heading in very much the same direction as your first attempt at implementing the indexing. Maybe it would be more robust to store the computed es properties like es_public in the database at the same time we store the index.

This way we don't have to add all these new potentially subtly incorrect hybrid properties just to use them within a SQL query and are closer to the original elastic search implementation. If we want more of these things to happen online down the road, we can always improve it later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried the setter before the expression but it didn't help...

@Tschuppi81
Copy link
Contributor Author

I will try the approach of calculating and storing the properties in a separate column while indexing and use them for searching.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants