PR #663 is going to bring support for batch suggest operations.
The STWFSA backend could benefit from implementing `_suggest_batch` instead of `_suggest`, processing a whole batch of texts with parallel and/or vectorized operations.
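As a rough illustration, here is a minimal sketch of what such a method could look like. It assumes the backend holds a trained stwfsapy `StwfsapyPredictor` in `self._model` and that its `suggest_proba` method accepts a list of texts, returning (concept URI, score) pairs per text; the exact Annif backend signature may differ between versions:

```python
from annif.suggestion import SubjectSuggestion

def _suggest_batch(self, texts, params):
    # Sketch only: self._model is assumed to be a trained stwfsapy
    # StwfsapyPredictor whose suggest_proba() takes a list of texts
    # and returns, for each text, a list of (concept URI, score) pairs.
    limit = int(params["limit"])
    batch = self._model.suggest_proba(texts)
    results = []
    for scored in batch:
        # keep only the top-scoring suggestions for each text
        scored = sorted(scored, key=lambda pair: pair[1], reverse=True)
        results.append([
            SubjectSuggestion(uri=uri, label=None, notation=None, score=score)
            for uri, score in scored[:limit]
        ])
    return results
```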
Unfortunately, the results were not very encouraging. Batched suggest done this way seems to be slower than the original. Maybe switching to a new representation for suggestion results (see #678) could help.
I also tried using the `predict_proba` method of stwfsapy, which returns the results as a sparse matrix. The problem is that stwfsapy internally uses different numeric IDs for concepts than Annif, so an ID mapping mechanism would be needed to convert the results into something Annif can use.
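A rough sketch of what such a mapping could look like, assuming the predictor's internal concept ordering is available as a list of URIs (the `predictor_concepts` name here is hypothetical) and that every concept is known to Annif's `SubjectIndex`:

```python
import numpy as np

def build_column_mapping(predictor_concepts, subject_index):
    # Map column i of stwfsapy's predict_proba output (which corresponds
    # to predictor_concepts[i]) to the position of the same URI in
    # Annif's SubjectIndex. Assumes every URI is known to Annif.
    return np.array([subject_index.by_uri(uri) for uri in predictor_concepts])

def to_annif_scores(sparse_scores, mapping, n_subjects):
    # Reorder the columns of the (n_texts, n_concepts) sparse score
    # matrix into a dense (n_texts, n_subjects) array that follows
    # Annif's subject numbering.
    dense = np.zeros((sparse_scores.shape[0], n_subjects))
    dense[:, mapping] = sparse_scores.toarray()
    return dense
```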
Here are the main test results, evaluating a YSO STWFSA English model on jyu-theses/eng-test on my 4-core laptop.
Before (master):

| Jobs | User time (s) | Wall clock time (m:ss) |
|-----:|--------------:|-----------------------:|
| 1    | 201.96        | 3:23.56                |
| 4    | 288.02        | 2:19.72                |