[RBAC] Optimize `compute_object_role_permissions` as iterable with `prefetch_related` #300

AlanCoding · 2024-04-10T18:04:52Z

If you use Django Debug Toolbar, you can run the following to see how well DAB RBAC performs when rebuilding the entire RoleEvaluation table:

from ansible_base.rbac.caching import compute_object_role_permissions

compute_object_role_permissions()

You will find that it is limited by one particular constraint: it needs some simple-looking prefetches added.

diff --git a/ansible_base/rbac/caching.py b/ansible_base/rbac/caching.py
index a90bf00..24584ab 100644
--- a/ansible_base/rbac/caching.py
+++ b/ansible_base/rbac/caching.py
@@ -168,7 +168,7 @@ def compute_object_role_permissions(object_roles=None, types_prefetch=None):
     if types_prefetch is None:
         types_prefetch = TypesPrefetch.from_database(RoleDefinition)
     if object_roles is None:
-        object_roles = ObjectRole.objects.iterator()
+        object_roles = ObjectRole.objects.prefetch_related('permission_partials', 'permission_partials_uuid', 'provides_teams__has_roles')
 
     for object_role in object_roles:
         role_to_delete, role_to_add = object_role.needed_cache_updates(types_prefetch=types_prefetch)

However, this drops .iterator(), so the whole queryset is loaded into memory at once. That is probably unacceptable, because this is the one big memory-intensive table involved in querysets.
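
One possible way to keep streaming while still prefetching is to pull rows from the iterator in fixed-size batches and load the related rows once per batch. The sketch below is plain Python and only illustrates the batching; `in_batches`, `iterate_with_batch_hook` and the `load_related` hook are hypothetical names, not part of DAB or Django.

```python
from itertools import islice


def in_batches(rows, batch_size=1000):
    """Lazily yield lists of up to batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def iterate_with_batch_hook(rows, load_related, batch_size=1000):
    """Stream rows one at a time, calling load_related once per batch.

    load_related is a hypothetical hook that receives the list of rows in
    the batch, so related data can be fetched with one query per relation
    instead of one query per row.
    """
    for batch in in_batches(rows, batch_size):
        load_related(batch)
        yield from batch
```

In Django, `rows` could be `ObjectRole.objects.iterator()` and the hook could call `django.db.models.prefetch_related_objects(batch, ...)` with the same lookups as in the diff above; that wiring is an assumption and has not been run against this code base. Newer Django releases (4.1 and later, as far as I know) also honor `prefetch_related()` when `iterator()` is given a `chunk_size`, which may make a custom helper unnecessary.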

There is a very good proposed solution at https://djangosnippets.org/snippets/1949/ but it is not a standard Django util.

import gc

def queryset_iterator(queryset, chunksize=1000):
    '''
    Iterate over a Django Queryset ordered by the primary key

    This method loads a maximum of chunksize (default: 1000) rows in its
    memory at the same time, while Django normally would load all rows in its
    memory. Using the iterator() method only causes it to not preload all the
    classes.

    Note that the implementation of the iterator does not support ordered query sets.
    '''
    pk = 0
    last_row = queryset.order_by('-pk').first()
    if last_row is None:
        return  # empty queryset: nothing to iterate over
    last_pk = last_row.pk
    queryset = queryset.order_by('pk')
    while pk < last_pk:
        for row in queryset.filter(pk__gt=pk)[:chunksize]:
            pk = row.pk
            yield row
        gc.collect()
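
The snippet above starts from `pk = 0` and compares against `last_pk`, which assumes positive integer primary keys. A variant of the same keyset idea that stops on the first short page, and therefore needs neither assumption nor the extra `last_pk` query, might look like the sketch below. `keyset_iterator` and the `fetch_page(after_pk, limit)` callable are hypothetical names used only for illustration.

```python
def keyset_iterator(fetch_page, chunksize=1000):
    """Yield rows in primary-key order, holding one page in memory at a time.

    fetch_page(after_pk, limit) is a hypothetical callable that returns up
    to `limit` rows ordered by pk, restricted to pk > after_pk, or starting
    from the first row when after_pk is None.
    """
    if chunksize < 1:
        raise ValueError("chunksize must be at least 1")
    after_pk = None
    while True:
        page = list(fetch_page(after_pk, chunksize))
        yield from page
        if len(page) < chunksize:
            return  # a short (or empty) page means no rows are left
        after_pk = page[-1].pk
```

With a Django queryset, `fetch_page` could be built from `queryset.order_by('pk')`, adding `.filter(pk__gt=after_pk)` when `after_pk` is not None and slicing with `[:limit]`. Like the original snippet, this ignores any ordering already set on the queryset.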

This will give about one order of magnitude of improvement in the above scenario.

The site's ToS grants a license for the code snippets: https://djangosnippets.org/about/tos/

That you grant any third party who sees the code you post a royalty-free, non-exclusive license to copy and distribute that code and to make and distribute derivative works based on that code.

AlanCoding added the performance, app:rbac, Ready To Merge, and ready to work labels and removed the Ready To Merge label on Apr 10, 2024
