
Algorithm taking way too long when precision is disproportionately small #101

Open
NMC92 opened this issue Feb 16, 2023 · 12 comments


NMC92 commented Feb 16, 2023

The algorithm either takes far too long to calculate the label of rectangular shapes or runs out of memory while doing so.
I am using polylabel to calculate the POI (pole of inaccessibility) of GeoJSON features from OpenStreetMap.
Examples of such shapes include:

https://www.openstreetmap.org/relation/2730170
https://www.openstreetmap.org/relation/153548
https://www.openstreetmap.org/relation/12863380
https://www.openstreetmap.org/relation/7491995
https://www.openstreetmap.org/relation/7585771
https://www.openstreetmap.org/relation/7717839

The precision I'm using is 0.000001, since the shapes are often minuscule, which requires high precision.
I also changed my code to adjust the precision according to the size of the shape (parseInt(inputFeature.properties.admin_level) <= 10 ? 0.0000475 : precision; see the sketch below), but the fact that there is an OOM error at all suggests a memory leak that should be fixed.
I understand it takes longer due to the high precision, but running out of memory while calculating the center of a square shape doesn't make sense.
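For reference, a minimal sketch of that adaptive-precision idea, assuming polylabel-style input (an array of coordinate rings); the function name and the 1/1000 factor are made up for illustration:

```js
// Hypothetical sketch: derive the precision from the polygon's extent
// instead of using a single fixed value. `polygon` is an array of
// coordinate rings, as polylabel expects; the 0.001 factor is arbitrary.
function adaptivePrecision(polygon, factor = 0.001) {
    let minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
    for (const [x, y] of polygon[0]) { // outer ring
        minX = Math.min(minX, x); maxX = Math.max(maxX, x);
        minY = Math.min(minY, y); maxY = Math.max(maxY, y);
    }
    // Scale the tolerance with the smaller bounding-box dimension, so tiny
    // shapes get a fine grid while large shapes don't explode the queue.
    return Math.min(maxX - minX, maxY - minY) * factor;
}
```

Calling polylabel(polygon, adaptivePrecision(polygon)) would then give each shape a tolerance proportional to its own size.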

The command I'm using is:

```
geojson-polygon-labels --precision=0.000001 --label=polylabel --style=largest example.geojson > labels.geojson
```

[screenshot of the out-of-memory error]


mourner commented Feb 16, 2023

Can you make a more isolated example to debug this, e.g. a small JS snippet including the data that produces an OOM in Node?

mourner added the bug label Feb 16, 2023

NMC92 commented Feb 16, 2023

Of course! Let me just prepare the example for you...


NMC92 commented Feb 16, 2023

The link below contains a video demonstrating the OOM error, a README.md, the .js file used, and the .geojson file used as the data in the video.
I also tried getting isolated labels for just the problematic polygons, but I'm afraid it either works on the entire file or not at all, since for indexing reasons the .geojson file can't be split.

https://mega.nz/file/QYETkSCQ#COvB7fj5XzjyQFB9MQIk4YlvjgvLWrYej4kEcoTqFxw


NMC92 commented Feb 16, 2023

Should you be curious as to why I'm calculating the POI of OpenStreetMap elements: I believe the POI calculation would be a reliable way to figure out which elements contain which, as I've noticed many elements in OpenStreetMap lack a "@relation" tag. One of my goals is to mend this data inconsistency by using the POI list to build a tree-like list of hierarchical relationships, and then show the resulting "relationship tree" to the relevant community.


mourner commented Feb 17, 2023

> I also tried getting isolated labels for just the problematic polygons, but I'm afraid it either works on the entire file or not at all, since for indexing reasons the .geojson file can't be split.

Can you elaborate on this? Why can't the data be split into isolated polygons to use as the input for polylabel? I don't have much bandwidth to debug complex apps, so it would be nice to have a minimal test case (e.g. just a single polylabel(polygon, 0.000001) call that causes an OOM).


NMC92 commented Feb 17, 2023

I was messing something up on my side; I can isolate the polygon after all.
I just tried debugging polylabel on the isolated polygon in Chrome and in the CLI version, and I got an OOM in both ("Error code: Out of Memory" and "Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory", respectively).

[screenshot: Chrome "Error code: Out of Memory"]
[screenshot: Node "JavaScript heap out of memory"]

For debugging you can try this:
polylabel(polygon, 0.0005); (so, twice the default precision)

The polygon is here, since it's far too much text to paste on the page.
For illustrative purposes, the polygon looks like this:
[screenshot of the polygon's shape]
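Spelled out as a self-contained Node script, the repro would look roughly like this, assuming the linked polygon is saved locally as polygon.json (the filename is illustrative):

```js
// Minimal repro sketch: load the polygon (an array of rings, as polylabel
// expects) and run polylabel with the tolerance mentioned above.
const fs = require('fs');
const polylabel = require('polylabel');

const polygon = JSON.parse(fs.readFileSync('polygon.json', 'utf8'));

// On this data the call below exhausts the JavaScript heap.
console.log(polylabel(polygon, 0.0005));
```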


mourner commented Feb 17, 2023

@NMC92 great, thanks for the clear test case! I'll investigate. What's the coordinate system you're using in the input data? 0.0005 seems way too small for this kind of input scale.


NMC92 commented Feb 17, 2023

@mourner Sure! I'm not sure this answers your question, but I'm using the "Web Mercator" projection, so I believe the coordinate system should be WGS 84/Pseudo-Mercator.


mourner commented Feb 17, 2023

@NMC92 the projection is in Mercator meters, so at that latitude (~54 degrees), 0.0005 is a precision of 3 millimeters. 😅

Keeping this open because we probably need to put some safeguards in place so that when a disproportionately small precision value is provided, the algorithm doesn't run out of memory.
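One possible shape for such a safeguard, as a hypothetical sketch (the function name and cell limit are invented for illustration, not the library's API):

```js
// Hypothetical safeguard sketch: refuse precision values that imply an
// absurd number of grid cells for the polygon's extent, instead of letting
// the cell queue grow until the heap is exhausted.
function assertSanePrecision(polygon, precision, maxCellsPerAxis = 1e6) {
    let minX = Infinity, maxX = -Infinity;
    for (const [x] of polygon[0]) { // outer ring
        minX = Math.min(minX, x);
        maxX = Math.max(maxX, x);
    }
    const cellsPerAxis = (maxX - minX) / precision;
    if (cellsPerAxis > maxCellsPerAxis) {
        throw new Error('precision ' + precision + ' implies ~' +
            Math.round(cellsPerAxis) + ' cells per axis; refusing to run');
    }
}
```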


NMC92 commented Feb 17, 2023

I see, that is indeed extremely high precision, and seemingly unnecessarily so; the only thing justifying it is that some places are very close to each other. It would be great to adjust the precision according to the size of the actual polygon, to avoid these complications. Then again, it's merely a suggestion for the future. Thanks a lot @mourner!

mourner changed the title from "Algorithm taking way too long when calculating the POI of rectangular shapes (or which contain a rectangular section)" to "Algorithm taking way too long when precision is disproportionately small" on Feb 25, 2023

pbeeson commented Mar 4, 2024

As pointed out in #82 (comment), the size of the queue can be reduced by only pushing cells whose maximum possible distance is greater than the current best distance; a sketch of that check follows. For a lot of these degenerate cases, applying this check to the initial tiling cells as well avoids the error. If the centroid isn't inside the polygon you can fall back to this degenerate case too, but with a trivial check/change to the starting best_cell this is no longer an issue (#97 (comment)).
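For illustration, a minimal sketch of that pruning check, reusing the Cell fields from polylabel's source (cell.d, the distance from the cell center to the polygon, and cell.max = cell.d + cell.h * Math.SQRT2); the helper name is invented and this is not the actual patch:

```js
// Only enqueue cells whose best possible distance could still beat the
// current best candidate; applied to the initial tiling as well, this
// keeps the queue from exploding on long, thin rectangular polygons.
function pushIfPromising(cellQueue, cell, bestCell, precision) {
    if (cell.max - bestCell.d > precision) {
        cellQueue.push(cell);
    }
}
```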


syonfox commented Jun 25, 2024

Hmmm, interesting stuff. Just adding my 2 cents: the problem may be that the desired real-world precision, say one meter or so, depends on the parallel you are at. Because the Web Mercator projection is stretched at extreme latitudes, you might need less numerical precision up north than at the equator.

What is the relationship between latitude and the numerical precision needed for one-meter accuracy? Does the required precision increase or decrease as we move towards higher latitudes? A sketch of the conversion is below.
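To make that concrete: Web Mercator stretches lengths by 1 / cos(latitude), so the projected tolerance that corresponds to one meter on the ground grows toward the poles, meaning a numerically coarser precision suffices at high latitudes. A small sketch (the function name is illustrative):

```js
// Projected (Mercator-meter) tolerance corresponding to a given ground
// distance: Web Mercator exaggerates lengths by 1 / cos(latitude).
function mercatorToleranceForMeters(meters, latitudeDeg) {
    const lat = latitudeDeg * Math.PI / 180;
    return meters / Math.cos(lat);
}

console.log(mercatorToleranceForMeters(1, 0));  // 1.0 at the equator
console.log(mercatorToleranceForMeters(1, 54)); // ≈ 1.70 at 54° N
console.log(mercatorToleranceForMeters(1, 80)); // ≈ 5.76 at 80° N
```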
