pre-queries for geojson geoms in bounding box before cluster query re… #10663
@chiatt we can do that, but I would strongly advise adding in some logic that embeds the state of the db into the cache key so that we avoid a case where a 404 was cached, then a new geom was created but it doesn't get included in a tile because the cache hasn't timed out yet. For example:
still testing a couple alternative approaches here
…ustering unclusterable tile re #10452
It really seems like this should offer a nice performance improvement, but when I look at request times in Silk, I'm not seeing any improvement. It might be that I'm not testing it with the right data because I'm sure the scale and distribution of points makes a difference. Also, the additional query to the edit log is probably increasing the request time by a wee bit as well.
Also, it doesn't seem consequential, but I'm getting this 500 error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/django/core/handlers/exception.py", line 55, in inner
    response = get_response(request)
  File "/usr/local/lib/python3.10/dist-packages/django/core/handlers/base.py", line 197, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/django/views/generic/base.py", line 104, in view
    return self.dispatch(request, *args, **kwargs)
  File "/web_root/arches/arches/app/views/api.py", line 107, in dispatch
    return super(APIBase, self).dispatch(request, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/django/views/generic/base.py", line 143, in dispatch
    return handler(request, *args, **kwargs)
  File "/web_root/arches/arches/app/views/api.py", line 396, in get
    tile = bytes(cursor.fetchone()[0])
TypeError: 'NoneType' object is not subscriptable
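For reference, the TypeError above is what happens when cursor.fetchone() returns None (no row) and the result is indexed anyway. A minimal sketch of a guard follows. It uses sqlite3 purely as a stand-in for the real database connection, and fetch_tile_bytes and the tiles table are hypothetical names, not anything in Arches:

```python
import sqlite3


def fetch_tile_bytes(cursor):
    """Return the first column of the next row as bytes, or None if there is no row."""
    row = cursor.fetchone()  # DB-API: None when the result set has no (more) rows
    if row is None or row[0] is None:
        return None
    return bytes(row[0])


# Illustration only: an in-memory SQLite table stands in for the MVT query.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tiles (mvt BLOB)")

cur.execute("SELECT mvt FROM tiles")  # no rows yet, so fetchone() gives None
empty_result = fetch_tile_bytes(cur)

cur.execute("INSERT INTO tiles VALUES (?)", (b"\x1a\x02",))
cur.execute("SELECT mvt FROM tiles")
tile_result = fetch_tile_bytes(cur)
```

With a guard like this the view could fall through to its existing empty-tile handling instead of returning a 500.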
        [nodeid, zoom, x, y, nodeid, resource_ids],
    )
else:
    tile = ""
It seems like this alone will prevent a requery for the MVT because the value saved to the cache isn't None which is what is being checked for prior to the query.
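To illustrate the point: if the cache lookup treats None as "nothing cached", then storing an empty string for an empty tile is enough to skip the requery on the next request. A rough sketch, with a plain dict standing in for the Django cache; get_tile and run_query are hypothetical names used only for this example:

```python
def get_tile(cache, key, run_query):
    """Return (tile, queried): the cached value if present, else query and cache it."""
    tile = cache.get(key)      # None means "nothing cached for this key"
    if tile is not None:
        return tile, False     # "" (an empty tile) also counts as a cache hit
    tile = run_query() or ""   # store "" rather than None for an empty result
    cache[key] = tile
    return tile, True


cache = {}
calls = []


def run_query():
    calls.append(1)            # record each run of the expensive query
    return None                # pretend the tile contains no geometries


first = get_tile(cache, "mvt_key", run_query)
second = get_tile(cache, "mvt_key", run_query)
```

The second call returns the cached empty value without running the query again; the view would still need to turn an empty tile into a 404.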
arches/app/views/api.py
@@ -365,6 +400,10 @@ def get(self, request, nodeid, zoom, x, y):
            raise Http404()
        return HttpResponse(tile, content_type="application/x-protobuf")


def create_mvt_cache_key(node, zoom, x, y, user, mvt_snapshot=None):
    if not mvt_snapshot:
        mvt_snapshot = models.EditLog.objects.filter(nodegroupid=str(node.nodegroup_id)).count()
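The rest of the function is not shown in this hunk. As a rough sketch of the idea only, a key that embeds a snapshot value (such as the edit-log count above) changes as soon as that value changes, so older cached entries are simply no longer looked up. The function name make_mvt_cache_key and the key layout below are assumptions for illustration, not the PR's actual format:

```python
def make_mvt_cache_key(nodeid, zoom, x, y, user_id, snapshot):
    """Build a cache key that changes whenever the snapshot value changes."""
    # Hypothetical layout: the real key format in the PR may differ.
    return f"mvt_{nodeid}_{zoom}_{x}_{y}_{user_id}_{snapshot}"


# Same tile and user, before and after one more edit-log row is written.
key_before = make_mvt_cache_key("node-1", 12, 654, 1583, 7, snapshot=41)
key_after = make_mvt_cache_key("node-1", 12, 654, 1583, 7, snapshot=42)
```

Because the two keys differ, a 404 cached under key_before would not be served once a new edit has been logged.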
If the value is different between what is saved to the cache when a user gets a 404 vs an MVT tile, then it seems ok to return no tile even if an editor has recently created a new geometry within those bounds. That's the price of relying on a cache, and the cache timeout can always be adjusted to an admin's preference for performance over accuracy.
@chiatt the
How many geometries does the resource layer you're testing with have? I would expect little to no improvement for lower-density resource layers, but once you get into the tens of thousands I would expect noticeably faster responses for MVT requests.
… #10452
Types of changes
Description of Change
This PR aims to do a much simpler count_query that, if the count is 0, skips a more complex query, caches an empty tile, and raises a 404 for the MVT.
The count_query first counts how many geojson geoms for the requested node actually lie within the tile bounding box. If count >= the min_points parameter for that node config, it proceeds to the clustering query, which further narrows its query with the same subquery as the count_query to speed things up. If 0 < count < min_points, it proceeds to render the unclustered MVT geometry tile.
This PR also creates a method to create an MVT cache key incorporating...
viewable_nodegroups)
...so that stale MVT/data won't override more recent MVT/data for that node/zoom/x/y/user.
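The branching described above can be sketched as a small decision function. This is an illustration of the three cases only; choose_mvt_strategy and the returned labels are hypothetical and do not appear in the PR:

```python
def choose_mvt_strategy(count, min_points):
    """Map the bounding-box geometry count to one of the three branches."""
    if count == 0:
        return "empty"        # cache an empty tile and respond with a 404
    if count >= min_points:
        return "cluster"      # run the clustering query
    return "unclustered"      # 0 < count < min_points: plain geometry tile


# Example with an assumed min_points of 50 for the node config.
decisions = [choose_mvt_strategy(n, min_points=50) for n in (0, 10, 50, 20000)]
```

Only the "cluster" branch pays for the expensive query, which is where the count pre-query is expected to help.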
Anecdotal benchmarks of this branch vs the target branch show up to a 30% speed improvement for rendering the largest clustered MVTs.
Testing This PR
Add tile = None between lines 285 and 286 to temporarily disable MVT caching.
Issues Solved
#10452
Checklist
Ticket Background
Further comments