Similarity Search Limited to First 10 Chunks Only #1091
zelhaddioui
started this conversation in
General
Replies: 1 comment
-
I think the cause is that the similarity search method limits the number of candidates considered (numCandidates) in the Elasticsearch kNN query, so it does not evaluate every document in the index.
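To make the suggestion above concrete, here is a minimal sketch of the request body for an Elasticsearch kNN search, showing where `k` and `num_candidates` sit. It assumes a hypothetical dense_vector field named `embedding` and builds the JSON by hand. It is not the request that ElasticsearchVectorStore actually sends, which I have not checked.

```java
public class KnnQuerySketch {

    // Builds the JSON body of an Elasticsearch kNN search request.
    // fieldName is a hypothetical dense_vector field; adjust it to your mapping.
    static String buildKnnBody(String fieldName, float[] queryVector, int k, int numCandidates) {
        // Elasticsearch rejects requests where num_candidates is smaller than k.
        if (numCandidates < k) {
            throw new IllegalArgumentException("num_candidates must be >= k");
        }
        StringBuilder vector = new StringBuilder("[");
        for (int i = 0; i < queryVector.length; i++) {
            if (i > 0) {
                vector.append(", ");
            }
            vector.append(queryVector[i]);
        }
        vector.append("]");
        return "{\"knn\": {\"field\": \"" + fieldName + "\""
                + ", \"query_vector\": " + vector
                + ", \"k\": " + k
                + ", \"num_candidates\": " + numCandidates + "}}";
    }

    public static void main(String[] args) {
        // k is the number of hits returned; num_candidates is how many
        // approximate neighbours each shard considers before picking the top k.
        float[] queryVector = {0.1f, 0.5f}; // toy vector, assumption for illustration
        System.out.println(buildKnnBody("embedding", queryVector, 2, 100));
    }
}
```

If the store derives `num_candidates` from topK, a small topK would also mean a small candidate pool per shard, which could explain relevant chunks being missed. That is a hypothesis to verify against the library source, not a confirmed diagnosis.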
-
Description:
I am encountering an issue with the ElasticsearchVectorStore class when performing a similarity search. When I run a search with topK set to 2, it appears to search only the first 10 chunks stored in Elasticsearch rather than all of them.
Details:
Library Version: 1.0.0
Elasticsearch Version: 8.13.3
Code Example:
List<Document> similarDocuments = vectorStore.similaritySearch(
    SearchRequest.query(message).withTopK(2)
);
Issue Observed:
I expect the code above to return the 2 most similar documents across all chunks in Elasticsearch. Instead, the search appears to cover only the first 10 chunks.
Additional Information:
I suspect the issue is related to how Elasticsearch pagination is handled, or to a limitation in the current implementation of the similarity search method. I would appreciate any guidance or a fix so that the search covers all chunks stored in Elasticsearch.
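One way to confirm that chunks beyond the first 10 are really being skipped is to compare the store's results with an exhaustive ranking computed locally. The sketch below is plain Java with no Elasticsearch or vector store calls. It assumes the chunk embeddings are available as `double[]` arrays and that cosine similarity is the metric in use; both are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ExhaustiveTopK {

    // Cosine similarity between two vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Scores every chunk (no candidate limit) and returns the indices
    // of the topK most similar chunks, best first.
    static List<Integer> topK(double[] query, List<double[]> chunks, int topK) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < chunks.size(); i++) {
            indices.add(i);
        }
        indices.sort(Comparator.comparingDouble((Integer i) -> -cosine(query, chunks.get(i))));
        return new ArrayList<>(indices.subList(0, Math.min(topK, indices.size())));
    }

    public static void main(String[] args) {
        // Toy data: 12 chunks whose similarity to the query grows with the index,
        // so the two best matches sit beyond the first 10 chunks.
        double[] query = {0.0, 1.0};
        List<double[]> chunks = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            chunks.add(new double[] {1.0, i * 0.01});
        }
        System.out.println(topK(query, chunks, 2));
    }
}
```

If the exhaustive ranking returns chunks that similaritySearch never does, the limit is being applied on the search side rather than in the data.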