Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Matching algo performance + Exclude duplicate matchings #1142

Merged
merged 22 commits into from
Sep 22, 2024

Conversation

Jonasdoubleyou
Copy link
Member

@Jonasdoubleyou Jonasdoubleyou commented Sep 14, 2024

Splits the tests into small unit tests that are fast to run (npm run unit) and perf tests that might take longer (npm run perf) - npm run test runs all of them in the barrier.

Adds a performance test for the matching algorithm, which simulates a real world matching - pupils and students continously request matches, every N days the matching algorithm is run using the current pools, matched users are removed from the pool. At the end, reports how many matches are created, how many subjects and states match, and how long the pupils and students had to wait for the match.

Extends the Matching Algorithm to exclude duplicate matches (as the old implementation does). Currently implemented as a postfilter after the matching, dont think it happens often and is worth considering in the assignment itself.

https://github.com/corona-school/project-user/issues/1316

Jonas Wilms added 3 commits September 14, 2024 23:04
Run the matching algorithm every n days while simulating match requests of real users.
@realmayus realmayus temporarily deployed to backend-feat-matching-a-iu2xfb September 14, 2024 21:18 Inactive
@realmayus realmayus temporarily deployed to backend-feat-matching-a-i58zxe September 14, 2024 21:49 Inactive
@Jonasdoubleyou Jonasdoubleyou temporarily deployed to backend-feat-matching-a-i58zxe September 14, 2024 21:51 Inactive
@Jonasdoubleyou Jonasdoubleyou had a problem deploying to backend-feat-matching-a-i58zxe September 14, 2024 22:34 Failure
@Jonasdoubleyou Jonasdoubleyou temporarily deployed to backend-feat-matching-a-i58zxe September 14, 2024 22:40 Inactive
@Jonasdoubleyou Jonasdoubleyou had a problem deploying to backend-feat-matching-a-i58zxe September 14, 2024 22:45 Failure
@Jonasdoubleyou Jonasdoubleyou changed the title feat: Matching algo performance feat: Matching algo performance + Exclude duplicate matchings Sep 14, 2024
@Jonasdoubleyou Jonasdoubleyou temporarily deployed to backend-feat-matching-a-i58zxe September 14, 2024 22:54 Inactive
@Jonasdoubleyou Jonasdoubleyou temporarily deployed to backend-feat-matching-a-i58zxe September 14, 2024 23:11 Inactive
@Jonasdoubleyou Jonasdoubleyou temporarily deployed to backend-feat-matching-a-i58zxe September 14, 2024 23:43 Inactive
@Jonasdoubleyou Jonasdoubleyou temporarily deployed to backend-feat-matching-a-i58zxe September 14, 2024 23:46 Inactive
@Jonasdoubleyou Jonasdoubleyou temporarily deployed to backend-feat-matching-a-i58zxe September 14, 2024 23:52 Inactive
@Jonasdoubleyou Jonasdoubleyou temporarily deployed to backend-feat-matching-a-i58zxe September 15, 2024 00:00 Inactive
@Jonasdoubleyou Jonasdoubleyou temporarily deployed to backend-feat-matching-a-i58zxe September 15, 2024 00:06 Inactive
As can be seen from the diff, this only has a small impact on the matched subjects while improving the matching state rate from 10% to 30%.
realmayus
realmayus previously approved these changes Sep 15, 2024
Copy link
Contributor

@realmayus realmayus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

Keep pupils a bit longer in the match pool to find a better match. This increases wait time from 4 days to 17 days on average, but increases both the number of subjects and states matched. Not sure if the trade-off is worth it, but now we have a mechanism for it that can be tuned.
@Jonasdoubleyou Jonasdoubleyou temporarily deployed to backend-feat-matching-a-i58zxe September 15, 2024 13:18 Inactive
'new',
1,
{
matchCountSum: 1044,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the new algorithm is able to create more matches with the same "quality" (about the same number of matching subjects and states) I think it is worth to switch.

'old',
1000,
{
matchCountSum: 883,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's interesting that the old algorithm prioritizes the number of matching states over the number of matching subjects and the number of matches created ...

[
'new',
1000,
{
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using such a fixed time series as a benchmark is not so optimal - it can easily lead to overfitting onto this time series. Some sort of shuffling of the input would be good, but on the other hand that might loose some time effects that are actually there (e.g. pupils / students signing up in bulk after marketing campaigns)

@Jonasdoubleyou Jonasdoubleyou merged commit 8bd29b0 into master Sep 22, 2024
2 checks passed
@Jonasdoubleyou Jonasdoubleyou deleted the feat/matching-algo-perf branch September 22, 2024 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants