From 932dde1abdb7ce38076d85ab90803a7456882d38 Mon Sep 17 00:00:00 2001
From: Granville Barnett
Date: Mon, 21 Oct 2024 16:55:15 +0100
Subject: [PATCH 1/3] Vector Collection Search: Tuning hz:query

---
 docs/modules/data-structures/pages/vector-search-overview.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/modules/data-structures/pages/vector-search-overview.adoc b/docs/modules/data-structures/pages/vector-search-overview.adoc
index 1ebeb9a72..6c71bfacc 100644
--- a/docs/modules/data-structures/pages/vector-search-overview.adoc
+++ b/docs/modules/data-structures/pages/vector-search-overview.adoc
@@ -236,7 +236,7 @@ To decrease pressure on heap memory, you can decrease the number of parallel mig
 
 1. For searches with small `topK` (for example, 10) it may be beneficial to artificially increase `topK`, adjust `partitionLimit` accordingly, and discard extra results. If you need 10 results, a good starting point for tuning could be `topK=100` and a `partitionLimit` between 50 and 100. While this will make the search slower, it will also improve quality, sometimes significantly. Overall, this setup can be more efficient than increasing index build parameters (`max-degree`, `ef-construction`) which results in slower index builds and searches. With a very small `topK` or `paritionLimit`, the search algorithm is less able to escape local minima and find the best results.
 2. Vector deduplication does not incur significant overhead for uploads (usually less than 1%) and searches. You may consider disabling it to get slightly better performance and smaller memory usage if your dataset does not contain duplicated vectors. However, be aware that in the presence of many duplicated vectors with deduplication disabled, a similarity search may return poor quality results.
-3. For a given query, each vector index partition is searched by 1 thread. The number of concurrent partition searches is configured by specifying a pool size for `hz:query` executor, which by default has 16 threads per member. In configurations with many cores, it can be increased to fully utilize all available cores as follows:
+3. For a given query, each vector index partition is searched by 1 thread. The number of concurrent partition searches is configured by specifying a pool size for `hz:query` executor, which by default has 16 threads per member. If optimising for search we recommend setting the `hz:query` pool size to be that of the physical core count of your host machines: this will result in a good balance between search throughput and CPU utilisation. Setting `hz:query` to have a pool size greater than that of the physical core count will not deliver a significant increase in throughput but it will increase broader CPU utilisation. The `hz:query` pool size can be changed as follows:
 +
 [tabs]
 ====

From 2a2244ce9f0f18bdd1194d506825d7c80c0e39ff Mon Sep 17 00:00:00 2001
From: Granville Barnett <140408555+gbarnett-hz@users.noreply.github.com>
Date: Tue, 22 Oct 2024 09:13:05 +0100
Subject: [PATCH 2/3] Update docs/modules/data-structures/pages/vector-search-overview.adoc
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

broader -> total.

Co-authored-by: Krzysztof Jamróz <79092062+k-jamroz@users.noreply.github.com>
---
 docs/modules/data-structures/pages/vector-search-overview.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/modules/data-structures/pages/vector-search-overview.adoc b/docs/modules/data-structures/pages/vector-search-overview.adoc
index 6c71bfacc..2d3f104d1 100644
--- a/docs/modules/data-structures/pages/vector-search-overview.adoc
+++ b/docs/modules/data-structures/pages/vector-search-overview.adoc
@@ -236,7 +236,7 @@ To decrease pressure on heap memory, you can decrease the number of parallel mig
 
 1. For searches with small `topK` (for example, 10) it may be beneficial to artificially increase `topK`, adjust `partitionLimit` accordingly, and discard extra results. If you need 10 results, a good starting point for tuning could be `topK=100` and a `partitionLimit` between 50 and 100. While this will make the search slower, it will also improve quality, sometimes significantly. Overall, this setup can be more efficient than increasing index build parameters (`max-degree`, `ef-construction`) which results in slower index builds and searches. With a very small `topK` or `paritionLimit`, the search algorithm is less able to escape local minima and find the best results.
 2. Vector deduplication does not incur significant overhead for uploads (usually less than 1%) and searches. You may consider disabling it to get slightly better performance and smaller memory usage if your dataset does not contain duplicated vectors. However, be aware that in the presence of many duplicated vectors with deduplication disabled, a similarity search may return poor quality results.
-3. For a given query, each vector index partition is searched by 1 thread. The number of concurrent partition searches is configured by specifying a pool size for `hz:query` executor, which by default has 16 threads per member. If optimising for search we recommend setting the `hz:query` pool size to be that of the physical core count of your host machines: this will result in a good balance between search throughput and CPU utilisation. Setting `hz:query` to have a pool size greater than that of the physical core count will not deliver a significant increase in throughput but it will increase broader CPU utilisation. The `hz:query` pool size can be changed as follows:
+3. For a given query, each vector index partition is searched by 1 thread. The number of concurrent partition searches is configured by specifying a pool size for `hz:query` executor, which by default has 16 threads per member. If optimising for search we recommend setting the `hz:query` pool size to be that of the physical core count of your host machines: this will result in a good balance between search throughput and CPU utilisation. Setting `hz:query` to have a pool size greater than that of the physical core count will not deliver a significant increase in throughput but it will increase total CPU utilisation. The `hz:query` pool size can be changed as follows:
 +
 [tabs]
 ====

From c6ba5f3673d55a3744c4c6028ddfa3ec623ceaa4 Mon Sep 17 00:00:00 2001
From: Granville Barnett <140408555+gbarnett-hz@users.noreply.github.com>
Date: Tue, 22 Oct 2024 10:57:00 +0100
Subject: [PATCH 3/3] Update docs/modules/data-structures/pages/vector-search-overview.adoc

USA spelling

Co-authored-by: Amanda Lindsay
---
 docs/modules/data-structures/pages/vector-search-overview.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/modules/data-structures/pages/vector-search-overview.adoc b/docs/modules/data-structures/pages/vector-search-overview.adoc
index 2d3f104d1..113896fa2 100644
--- a/docs/modules/data-structures/pages/vector-search-overview.adoc
+++ b/docs/modules/data-structures/pages/vector-search-overview.adoc
@@ -236,7 +236,7 @@ To decrease pressure on heap memory, you can decrease the number of parallel mig
 
 1. For searches with small `topK` (for example, 10) it may be beneficial to artificially increase `topK`, adjust `partitionLimit` accordingly, and discard extra results. If you need 10 results, a good starting point for tuning could be `topK=100` and a `partitionLimit` between 50 and 100. While this will make the search slower, it will also improve quality, sometimes significantly. Overall, this setup can be more efficient than increasing index build parameters (`max-degree`, `ef-construction`) which results in slower index builds and searches. With a very small `topK` or `paritionLimit`, the search algorithm is less able to escape local minima and find the best results.
 2. Vector deduplication does not incur significant overhead for uploads (usually less than 1%) and searches. You may consider disabling it to get slightly better performance and smaller memory usage if your dataset does not contain duplicated vectors. However, be aware that in the presence of many duplicated vectors with deduplication disabled, a similarity search may return poor quality results.
-3. For a given query, each vector index partition is searched by 1 thread. The number of concurrent partition searches is configured by specifying a pool size for `hz:query` executor, which by default has 16 threads per member. If optimising for search we recommend setting the `hz:query` pool size to be that of the physical core count of your host machines: this will result in a good balance between search throughput and CPU utilisation. Setting `hz:query` to have a pool size greater than that of the physical core count will not deliver a significant increase in throughput but it will increase total CPU utilisation. The `hz:query` pool size can be changed as follows:
+3. For a given query, each vector index partition is searched by 1 thread. The number of concurrent partition searches is configured by specifying a pool size for `hz:query` executor, which by default has 16 threads per member. If optimizing for search, we recommend setting the `hz:query` pool size to be that of the physical core count of your host machines: this will result in a good balance between search throughput and CPU utilization. Setting `hz:query` to have a pool size greater than that of the physical core count will not deliver a significant increase in throughput but it will increase total CPU utilization. The `hz:query` pool size can be changed as follows:
 +
 [tabs]
 ====