diff --git a/docs/pages/algorithms/clustering.mdx b/docs/pages/algorithms/clustering.mdx
index 9f1c1866..3f4777fe 100644
--- a/docs/pages/algorithms/clustering.mdx
+++ b/docs/pages/algorithms/clustering.mdx
@@ -1,6 +1,8 @@
+import { Callout } from 'nextra-theme-docs'
+
 # Clustering Algorithms Overview
 
-Kōji offers two base clustering algorithms, the fastest, being a Rust translation of the `FastCover-PP` algorithm from [ghoshanirban](https://github.com/ghoshanirban/UnitDiskCoverAlgorithms), the other being a greedy algorithm that leverages the r-tree implentation from the [rstar](https://docs.rs/rstar/0.11.0/rstar/struct.RTree.html) crate. The latter algorithm offers different levels of complexity that vary in time, resource usage, and final performance.
+Kōji offers two clustering algorithms: the fastest is a Rust translation of the `FastCover-PP` algorithm by [ghoshanirban](https://github.com/ghoshanirban/UnitDiskCoverAlgorithms); the other is a Greedy algorithm that leverages the r-tree implementation from the [rstar](https://docs.rs/rstar/0.11.0/rstar/struct.RTree.html) crate. The latter algorithm offers different levels of complexity that vary in time, resource usage, and final result quality.
 
 ## Goal
 
@@ -19,7 +21,7 @@ The points to be clustered.
 
 The algorithm to use for clustering.
 
-- **Fastest**: Runs UDC, fastest and scales the best, but produces the worst result
+- **Fastest**: UDC; the fastest option and the one that scales best, but it produces the worst result
 - **Fast**: Basic implementation of the Greedy algorithm, produces a significantly better result than `Fastest`
 - **Balanced**: A great balance of speed, resource usage, and result quality
 - **Better**: This algorithm can use a large number of resources but produces the best result
@@ -44,13 +46,13 @@ The maximum number of clusters that should be generated
 
 ### Cluster Split Level
 
-- This groups points based on their S2 cell level before clustering them
-- The groups are then ran on separate threads in order to help with parallelizing workloads
-- This was more relevant with the legacy clustering algorithms that were single threaded but can still be useful in some cases
+This groups points based on their S2 cell level before clustering them. The groups are then run on separate threads to help parallelize the workload.
+
+_This was more relevant with the legacy clustering algorithms, which were single threaded, but it can still be useful in some cases._
 
 ## Metrics
 
-In order to compare the performance of multiple algorithms we calculate the `mygod_score`, aptly named after a colleague who came up with it. This score is calculated by multiplying the number of clusters by the `min_points` input added to the total `input_points` - how many points were covered by the algorithm . The lower the score, the better the result.
+In order to compare the performance of multiple algorithms we calculate the `mygod_score`, aptly named after a colleague who came up with it. The score is the number of clusters multiplied by the `min_points` input, plus the difference between the total `input_points` and the number of points covered by the algorithm. This incentivizes the algorithms to cover the maximum number of points possible, but only when each cluster covers a number of unique points that is greater than or equal to the set `min_points`. The lower the score, the better the result.
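+
+As a hypothetical worked example: 10 clusters with a `min_points` of 3 that cover 480 of 500 `input_points` would score `10 * 3 + (500 - 480) = 50`.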
 
 ```rust
 pub fn get_score(&self, min_points: usize) -> usize {
@@ -64,17 +66,21 @@ Runtime is not a factor in the `mygod_score` because it is not the primary goal
 
 ## Unit Disc Cover (UDC)
 
-The UDC algorithm is much less specialized compared to Kōji's Greedy algorithm. While it is much faster, it does not produce a desirable result. It often leads to a lot of overlap between clusters and does not take into account the density of points in a given area. This algorithm is best used when the user is looking for a quick result and does not care about the quality of the result. Since it is less specialized for the task at hand, the inputs are mostly taken into account during pre and post processing functions. On its own, this algorithm does not take into account the spherical nature of the Earth, so as part of the pre-process function, the input points must be projected onto a flat plane that takes the input radius into account. The post-process function then takes the projected points and converts them back to their original spherical coordinates. You can read more about UDC [here](https://en.wikipedia.org/wiki/Disk_covering_problem).
+[Source](https://github.com/TurtIeSocks/Koji/blob/main/server/algorithms/src/clustering/fastest.rs)
+
+The UDC algorithm is mostly a classic implementation of the UDC problem and was not written specifically for Kōji. While it is much faster, it does not produce a desirable result. It often leads to a lot of overlap between clusters and does not take into account the density of points in a given area. It is best used when the user wants a quick result and does not care about its quality. Since it is less specialized for the task at hand, the inputs are mostly taken into account during pre- and post-processing functions. On its own, this algorithm does not account for the spherical nature of the Earth, so as part of the pre-process function, the input points must be projected onto a flat plane that takes the input radius into account. The post-process function then converts the projected points back to their original spherical coordinates. You can read more about UDC [here](https://en.wikipedia.org/wiki/Disk_covering_problem).
 
 ## Greedy
 
-The Greedy algorithm on the other hand is much more specialized for the task at hand. Compared to its predecessors, it leverages an r-tree algorithm that allows this algorithm to run faster than the O(n^2) time complexity that the legacy algorithms ran at. This algorithm has three different levels of complexity that user's can select from. `Fast`, `Balanced`, and `Better`. The only difference between these options is the number of potential clusters that are generated based on the input points.
+[Source](https://github.com/TurtIeSocks/Koji/blob/main/server/algorithms/src/clustering/greedy.rs)
+
+The Greedy algorithm, on the other hand, is much more specialized for the task at hand. Compared to its predecessors, it leverages an r-tree that allows it to beat the O(n^2) time complexity the legacy algorithms ran at. This algorithm has three different levels of complexity that users can select from: `Fast`, `Balanced`, and `Better`. The only difference between these options is the number of potential clusters that are generated based on the input points.
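+
+To make the r-tree's role concrete, here is a minimal, self-contained sketch of the kind of radius query it enables (illustrative only: it uses plain `[f64; 2]` points rather than Kōji's actual point type, and is not code from the repo):
+
+```rust
+use rstar::RTree;
+
+fn main() {
+    // Hypothetical, already-projected points
+    let points = vec![[0.0_f64, 0.0], [1.0, 1.0], [5.0, 5.0]];
+    let tree = RTree::bulk_load(points);
+
+    // rstar's `locate_within_distance` takes the *squared* radius
+    let radius: f64 = 2.0;
+    let within: Vec<_> = tree
+        .locate_within_distance([0.0, 0.0], radius * radius)
+        .collect();
+
+    // [0, 0] itself and [1, 1] fall within the radius; [5, 5] does not
+    assert_eq!(within.len(), 2);
+}
+```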
 
 ## Step 1 - Generate Potential Clusters
 
 [Source](https://github.com/TurtIeSocks/Koji/blob/c69cc3c59481fe0f259ec7079a735af71c886c4c/server/algorithms/src/clustering/greedy.rs#L108-L175)
 
-This step is where the algorithm complexity input is utilized.
+Here is where the algorithm complexity input is utilized.
 
 ```rust
 let potential_clusters: Vec = match cluster_mode {
@@ -87,15 +93,15 @@ let potential_clusters: Vec = match cluster_mode {
 }
 ```
 
-- `Fast`: Generates possible clusters at the provided points and at eight equal segments between a given point and its neighbors that are within the given radius.
-- `Balanced`: Similar logic to `Fast` but generates 8 equal segments between the given point and its neighbors that are within the radius \* 2. Additionally, it adds some "wiggle" points at each of the segments.
+- `Fast`: Generates possible clusters at the provided points and at 8 equal segments between a given point and its neighbors that are within the given radius.
+- `Balanced`: Similar logic to `Fast`, but it generates the same 8 equal segments between the given point and its neighbors that are within the radius \* 2. Additionally, it adds some "wiggle" points at each of the segments.
-- `Better`: This option leverages the [Rust S2 library](https://github.com/yjh0502/rust-s2) and generates a potential cluster at every level 22 S2 cell that is within the BBOX of the input points. This is the most resource intensive option but produces the best result. This process has been optimized by starting at level 16 and continueing to split cells down to level 22 in parallel using [Rayon](https://github.com/rayon-rs/rayon).
+- `Better`: This option leverages the [Rust S2 library](https://github.com/yjh0502/rust-s2) and generates a potential cluster at every level 22 S2 cell that is within the BBOX of the input points. This is the most resource intensive option but produces the best result. This process has been optimized by starting at level 16 and continuing to split cells down to level 22 in parallel using [Rayon](https://github.com/rayon-rs/rayon).
 
 ## Step 2 - Associate Potential Clusters
 
 [Source](https://github.com/TurtIeSocks/Koji/blob/c69cc3c59481fe0f259ec7079a735af71c886c4c/server/algorithms/src/clustering/greedy.rs#L177-L257)
 
-Now that we have our potential clusters, we want to associate them with the input points. This is where the primary use of the r-tree is implemented and has saved us the most time compared to the legacy algorithms. Even though this step utilizes Rayon to parallelize the workload, it is still the most expensive part of the algorithm and has the most room for improvement. The r-tree is used to find all points that are within the radius of a given potential cluster. This is done in parallel for all potential clusters. We also had to check to see if the cluster exists in the r-tree, in the case that the best cluster is indeed a point itself. Lastly, if the cluster has less points than the given `min_points` input, we remove this cluster from the list of potential clusters, allowing us to release as much memory as possible and save some time in the clustering step.
+Now that we have our potential clusters, we want to associate them with the input points. This is where the primary use of the r-tree is implemented, and it has saved us the most time compared to the legacy algorithms. Even though this step utilizes Rayon to parallelize the workload, it is still the most expensive part of the algorithm and has the most room for improvement. The r-tree is used to find all points that are within the radius of a given potential cluster. This is done in parallel for all potential clusters. We also have to check whether the cluster exists in the r-tree, in case the best cluster is indeed a point itself.
+
+If the cluster has fewer points than the given `min_points` input, we remove it from the list of potential clusters, allowing us to release as much memory as possible and save some time in the clustering step.
 
 ```rust
 let clusters_with_data: Vec = potential_clusters
@@ -134,7 +140,8 @@ let max = clusters_with_data
     // For the users of Kōji there will likely never be anything higher than 100 points in a cluster
     .unwrap_or(100);
-// Creates a new Vec, filled with Vecs with the size set to the max number of points that a cluster covers, plus one, because we want to skip the 0 index for easier indexing
+// Creates a new Vec, filled with Vecs, sized to the max number of points that a cluster covers,
+// plus one, because we want to skip the 0 index for convenience
 let mut clustered_clusters = vec![vec![]; max + 1];
 
 for cluster in clusters_with_data.into_iter() {
     clustered_clusters[cluster.all.len()].push(cluster);
@@ -143,9 +150,9 @@ for cluster in clusters_with_data.into_iter() {
 
 ## Step 4 - Clustering
 
-[source code](https://github.com/TurtIeSocks/Koji/blob/c69cc3c59481fe0f259ec7079a735af71c886c4c/server/algorithms/src/clustering/greedy.rs#L277-L406)
+[Source](https://github.com/TurtIeSocks/Koji/blob/c69cc3c59481fe0f259ec7079a735af71c886c4c/server/algorithms/src/clustering/greedy.rs#L277-L406)
 
-This section is broken down into sub steps and will only contain summarized code snippets to avoid copying and pasting the entire function but if you're interested I would encourage you to check out the source above.
+This section is broken down into sub-steps and will only contain summarized code snippets to avoid copying and pasting the entire function, but if you're interested I would encourage you to check out the source above.
 
 ### Step 4a - Setup
 
@@ -154,9 +161,9 @@ This section is broken down into sub steps and will only contain summarized code
 We start by setting up our mutable variables that help us keep track of where things are during the `while` loop runs.
 
-- `new_clusters`: A HashSet that will hold our picked clusters. The `Cluster` struct implement the `Hash` trait and are hashed based on the S2 Cell ID of the center point. Despite the `Cluster` struct being a complex type, it was unnecessary to use a HashMap because we aren't accessing the data by key, only making sure we aren't inserting duplicates.
+- `new_clusters`: A HashSet that will hold our picked clusters. The `Cluster` struct implements the `Hash` trait and is hashed based on the S2 Cell ID of its center point. Despite the `Cluster` struct being a complex type, it was unnecessary to use a HashMap because we aren't accessing the data by key, only making sure we aren't inserting duplicates.
-- `blocked_points`: an additional HashSet that is keeping track of points that have already been clustered. This is key because we do not want points to count towards our `min_points` input if they have already been clustered.
+- `blocked_points`: an additional HashSet that keeps track of points that have already been clustered. This is key because we do not want points associated with each cluster to count towards our `min_points` comparisons if they have already been clustered.
 - `current`: a var that starts at the max number of points that a cluster covers and is decremented until it reaches our `min_points` input.
-- The rest of the variables are for tracking how much time is spent on each step and for calculating our % complete in the logs.
+- The rest of the variables are for tracking how much time is spent on each step and for estimating our % complete in the logs.
 
 The rest of the steps occur in the `while` loop.
 
@@ -164,7 +171,7 @@
 
 [Source](https://github.com/TurtIeSocks/Koji/blob/c69cc3c59481fe0f259ec7079a735af71c886c4c/server/algorithms/src/clustering/greedy.rs#L295-L304)
 
-This step is where we get the clusters that have the number of points that we are currently interested in. Since this algorithm is greedy, we start at the highest and work our way down to the `min_points` input. This is why we grouped our clusters in step 3. This is a very simple loop that checks whether the index of each item is greater than or equal to our `current` var. If it is, we add it to our `clusters_of_interest` Vec. The reason we want greater than or equal to is because we will be further filtering the clusters later on and just because a cluster didn't end up being viable when `current` is 42, is still may be the best when `current` is 30.
+This step is where we get the clusters that have the number of points that we are currently interested in. Since this algorithm is greedy, we start at the highest count and work our way down to the `min_points` input. This is why we grouped our clusters in step 3. This is a very simple loop that checks whether the index of each item is greater than or equal to our `current` var. If it is, we add it to our `clusters_of_interest` Vec. The reason we want greater than or equal to is because we will be further filtering the clusters later on. Just because a cluster didn't end up being viable when `current` was equal to 42, it still may be the best when `current` is equal to 30.
 
 ```rust
 for (index, clusters) in clusters_with_data.iter().enumerate() {
@@ -179,7 +186,7 @@ for (index, clusters) in clusters_with_data.iter().enumerate() {
 
 [Source](https://github.com/TurtIeSocks/Koji/blob/c69cc3c59481fe0f259ec7079a735af71c886c4c/server/algorithms/src/clustering/greedy.rs#L306-L349)
 
-This step is where we filter out clusters that are not viable. Iterating through `clusters_of_interest` in parallel, we check how many points for each one have already been clustered, if the total is less than `current`, we discard it. During this step we are ensuring two important Vecs for each cluster, all of the points that it covers and the unique number of points that it could potentially be responsible for.
+Now we filter out clusters that are not viable. Iterating through `clusters_of_interest` in parallel, we check how many points for each one have already been clustered; if the total is less than `current`, we discard it. During this step we also build two important Vecs for each cluster: all of the points that it covers, and the unique points that it could potentially be responsible for.
 
 If the determined `local_clusters` is empty, we subtract one from `current` and start the next iteration. If not, we then sort the clusters in parallel by the number of points that they cover. If the total points between any two given clusters is equal, then we compare the unique number of points that they cover. This is important for the next step because we will be iterating through the clusters in serial and we want to make sure we get the best before we start discarding other clusters.
 
@@ -187,10 +194,10 @@ If the determined `local_clusters` is empty, we subtract one from `current` and
 
 [Source](https://github.com/TurtIeSocks/Koji/blob/c69cc3c59481fe0f259ec7079a735af71c886c4c/server/algorithms/src/clustering/greedy.rs#L351-L368)
 
-This step is where we iterate through our `local_clusters` Vec in serial and push the best cluster to our `new_clusters` HashSet.
+We iterate through our `local_clusters` in serial and push the best cluster to our `new_clusters` HashSet.
 
-- At the start of the loop check to see if the number of clusters we have already saved is greater than or equal to our `max_clusters` input and immediately break the entire `while` loop if so.
+- At the start of the loop, we check whether the number of clusters we have already saved is greater than or equal to our `max_clusters` input and immediately break the entire `while` loop if so.
-- Next we check every unique point that the cluster is responsible for to see if it has already been clustered, if so, we skip it.
+- Next we check every unique point that the cluster is responsible for to see if it has already been clustered; if so, we skip the cluster. This is why sorting the clusters before this step is important.
 - If the cluster passes, then we insert all of the unique points into the `blocked_points` HashSet and the cluster into our `new_clusters` HashSet.
 
 ```rust
@@ -244,27 +251,25 @@ pub fn update_unique(&mut self, tree: &RTree) {
 
 [Source](https://github.com/TurtIeSocks/Koji/blob/c69cc3c59481fe0f259ec7079a735af71c886c4c/server/algorithms/src/clustering/greedy.rs#L432-L460)
 
-If the `min_points` input is equal to one, we want to make absolutely sure that no point has been missed as this is most often used in requests that require 100% accuracy. This step is often unnecessary, rarely needs to additional clusters, and is generally considered to be a hack that bandaids an issue in the clustering process.
+If the `min_points` input is equal to one, we want to make absolutely sure that no point has been missed, as this mode is most often used in requests that require 100% accuracy. This step is often unnecessary, rarely adds additional clusters, and is generally considered to be a hack that band-aids an unknown issue in the clustering function.
 
 1. First we reduce all points covered by our clusters into a single HashSet
 1. Next if the length of the HashSet does not equal the length of the input points, we know that at least one point has been missed.
 1. We then iterate through the input points, checking to see which ones exist in the HashSet, saving the ones that do not.
 1. We then extend the current clusters with the missing ones.
 
-## Step 7 - Future work
-
-The two biggest improvements that can be made to this algorithm are:
-
-- Optimize the potential cluster generating, both in terms of speed and memory usage
-- Additional post processing that takes a [Genetic Algorithm](https://en.wikipedia.org/wiki/Genetic_algorithm) approach and further optimizes the clusters.
-  This could be accomplished by first associating clusters with each other by utilizing the r-tree nearest neighbor methods and all of their respective points.
-  Once the subgroups have been created, try different combination of clusters and see if the `mygod_score` can be improved at all by trying different combinations of clusters.
-  e.g. if you had a group of 14 points with a min_points of 3, but you have one cluster covering the middle 10, then 2 on either side of that cluster. In this situation, replacing the single centric cluster with two clusters to cover all 14 points would result in an improved `mygod_score`.
-
 ### Balanced (Legacy)
 
 [Source](https://github.com/TurtIeSocks/Koji/tree/6802c1fabaac2942393467ea855c91f3b40ea9a8/server/algorithms/src/clustering/balanced)
 
-This algorithm was not the first one written for Kōji but it was the first that was committed to the repo. It operated in at least O(n^2) time, was single threaded, and was total spaghetti. The core of it is based on what Kōji calls "Bootstrapping", which generates circles of a given radius inside of a given Polygon or MultiPolygon to cover the entire area, allowing for routes that pick up all Spawnpoints and Forts. Concept wise, it was very similar to how Greedy operates now and it ran decently well for creating Fort routes but did not have the logic required to work well with Spawnpoints. I attempted to take what I learned from translating the UDC algorithm and apply it to this algorithm, particularly when it came to combining clusters since, a honeycomb base allowed me to know which neighboring clusters existed, similar to UDC. However, merging clusters tends to be very tedious work and was prone to errors.
+This algorithm was not the first one written for Kōji, but it was the first that was committed to the repo. It operated in at least O(n^2) time, was single threaded, and was total spaghetti. The core of it is based on what Kōji calls "Bootstrapping", which generates circles of a given radius inside a given Polygon or MultiPolygon to cover the entire area, allowing for routes that pick up all Spawnpoints and Forts.
+
+Concept-wise, it was very similar to how Greedy operates now and it ran decently well for creating Fort routes, but it did not have the logic required to work well with Spawnpoints. I attempted to take what I learned from translating the UDC algorithm and apply it to this algorithm, particularly when it came to combining clusters, since a honeycomb base allowed me to predict which neighboring clusters existed, similar to UDC. However, merging clusters tends to be very tedious work and was prone to errors.
 
-_The code has now been removed from the repo as it does not provide any benefit but the source is still viewable in the link above._
+<Callout>
+  _The code has now been removed from the repo as it does not provide any
+  benefit but the source is still viewable in the link above._
+</Callout>
 
 ### Brute Force (Legacy)
 
@@ -272,7 +277,10 @@
 Shortly before starting work on this algorithm, I had completed the integration with OR-Tools, which utilizes a distance matrix in the TSP wrapper I wrote. I attempted to apply that same logic here as a sort of lookup table for checking which points are within the given `radius` of neighboring points, and since the values weren't reliant on each other, this calculation could be parallelized with Rayon. The core clustering algorithm is very recognizable as it was the base of the Greedy algorithm. However, it was still slower than what I had hoped for and my attempt to write another merge function wasn't exactly successful.
 
-_The code has now been removed from the repo as it does not provide any benefit but the source is still viewable in the link above._
+<Callout>
+  _The code has now been removed from the repo as it does not provide any
+  benefit but the source is still viewable in the link above._
+</Callout>
 
 # Result Comparisons
 
@@ -280,9 +288,12 @@ Now the info you're actually looking for, the results!
 
 ### Notes:
 
-- I have provided the `area` for each polygon because this does impact the `Better` results, due to how many potential clusters are generated.
-- Distance stats have been excluded from each result as the points were unsorted and they are not relevant when comparing the clustering algorithms.
+- Distance stats have been excluded from each result as the points were unsorted and they are not relevant for directly comparing the clustering algorithms.
 - All algorithms were run on a MacBook Pro M1 with 16GB of RAM and 8 cores.
+- The following inputs were used:
+  - `min_points`: 3
+  - `radius`: 70
+  - `cluster_split_level`: 1
 
 ## Small Fence
 
@@ -465,3 +476,54 @@ Now the info you're actually looking for, the results!
 || [MYGOD_SCORE] 43885 ||
 =========================================================================
 ```
+
+# Conclusion
+
+| Fence  | Algorithm       | Time   | Score |
+| ------ | --------------- | ------ | ----- |
+| Small  | Fastest         | 0.02   | 4286  |
+| Small  | Balanced (L)    | 4.11   | 3657  |
+| Small  | Brute Force (L) | 8.23   | 3415  |
+| Small  | Fast            | 0.41   | 3003  |
+| Small  | Balanced        | 1.13   | 2914  |
+| Small  | Better          | 8.18   | 2833  |
+| Medium | Fastest         | 0.10   | 29123 |
+| Medium | Balanced (L)    | 206.22 | 25058 |
+| Medium | Brute Force (L) | 489.92 | 23527 |
+| Medium | Fast            | 2.42   | 20320 |
+| Medium | Balanced        | 8.20   | 19604 |
+| Medium | Better          | 41.46  | 19121 |
+| Large  | Fastest         | 0.20   | 67043 |
+| Large  | Fast            | 6.48   | 46953 |
+| Large  | Balanced        | 21.16  | 45084 |
+| Large  | Better          | 125.90 | 43885 |
+
+While the different variations of the Greedy algorithm don't scale as well as the UDC algorithm, the results are definitely worth it, especially compared to both of the legacy algorithms, which start to become unwieldy on anything but smaller sized fences.
+
+- The `Fastest` algorithm is great for quick and dirty calculations when you want to cover everything and need the results immediately.
+- The `Fast` algorithm is for when you want to take advantage of the Greedy algorithm but still need the results very quickly and can't wait for `Balanced` or `Better`.
+- The `Balanced` algorithm offers a great balance of speed and result quality. It is the default algorithm that Kōji uses and is what I would recommend for most users.
+- The `Better` algorithm is for when you want the best result possible and are willing to wait for it. It is the most resource intensive and can take a long time to run on larger fences. I would not recommend running it if your system does not have much free memory.
+
+# Future Work
+
+The two biggest improvements that can be made to the Greedy algorithm are:
+
+## Optimize Potential Cluster Generation
+
+Currently this process takes up the bulk of the algorithm's runtime, which feels unnecessary. It is also a huge resource hog, which makes it impossible to run `Better` on areas that are too big if the machine can't handle the memory usage.
+
+## Additional Post Processing
+
+Implementing a [Genetic Algorithm](https://en.wikipedia.org/wiki/Genetic_algorithm) approach that further optimizes the initial solution.
+
+- This could be accomplished by first associating clusters with each other by utilizing the r-tree nearest neighbor methods and all of their respective points.
+- Once the subgroups have been created, try different combinations of clusters and see if the `mygod_score` can be improved at all.
+
+### Example
+
+If you had a group of 14 points with a `min_points` of 3, and one cluster covering the middle 10 points with 2 points on either side of it, replacing the single centric cluster with two clusters that cover all 14 points would result in an improved `mygod_score`.
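+
+As a quick sanity check of that example, here is the arithmetic using the score definition from the Metrics section (a minimal sketch; `mygod_score` is a hypothetical standalone helper, not the actual `get_score` method):
+
+```rust
+// mygod_score = clusters * min_points + (input_points - covered_points)
+fn mygod_score(clusters: usize, min_points: usize, input: usize, covered: usize) -> usize {
+    clusters * min_points + (input - covered)
+}
+
+fn main() {
+    // One cluster covering only the middle 10 of the 14 points
+    assert_eq!(mygod_score(1, 3, 14, 10), 7);
+    // Two clusters covering all 14 points score lower, i.e. better
+    assert_eq!(mygod_score(2, 3, 14, 14), 6);
+}
+```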