Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aggregate functions are not allowed in JOIN conditions. ArrayAgg function in CTE join ON clause #94

Open
a4tarasiuk opened this issue Sep 20, 2024 · 2 comments

Comments

@a4tarasiuk
Copy link

a4tarasiuk commented Sep 20, 2024

Django 4.2
django-cte 1.3.3

postgres 12

There is an SQL query that is needed to be performed via CTE in Django:

WITH RECURSIVE "cte" AS (SELECT "a"."id",
                                ARRAY_AGG(DISTINCT "b"."id") AS "b_ids"
                         FROM "a"
                                  INNER JOIN "b" ON ("b"."id" = "a"."id")
                         GROUP BY "a".id)

SELECT "a.id",
       ARRAY_AGG(DISTINCT "b"."id") AS "b_ids",
       ARRAY_AGG(DISTINCT "cte"."id") AS "cte_ids"
FROM a
         INNER JOIN "b" ON ("b"."id" = "a"."id")
         LEFT OUTER JOIN cte 
                 ON "a".b_ids && cte.b_ids
GROUP BY "a".id

django-cte results to this query:

WITH RECURSIVE "cte" AS (SELECT "a"."id",
                                ARRAY_AGG(DISTINCT "b"."id") AS "b_ids"
                         FROM "a"
                                  INNER JOIN "b" ON ("b"."id" = "a"."id")
                         GROUP BY "a".id)

SELECT "a.id",
       ARRAY_AGG(DISTINCT "b"."id") AS "b_ids",
       ARRAY_AGG(DISTINCT "cte"."id") AS "cte_ids"
FROM a
         INNER JOIN "b" ON ("b"."id" = "a"."id")
         LEFT OUTER JOIN cte 
                 ON ARRAY_AGG(DISTINCT "b"."id") && cte.b_ids
GROUP BY "a".id

It does not use an alias b_ids for the array_agg column.

Django code:

cte = (
    With(A)
    .annotate(b_ids=ArrayAgg("b__id", distinct=True))
    .values("id")
)

a_query = A.objects.annotate(b_ids=ArrayAgg("b__id", distinct=True))

q = (
    cte.join(
        a_query, 
        b_ids__overlap=cte.col.b_ids, 
        _join_type=LOUTER,
    )
    .with_cte(cte)
    .annotate(cte_ids=ArrayAgg(cte.col.id, distinct=True))
    .values("id")
)
@a4tarasiuk a4tarasiuk changed the title Aggregate functions are not allowed in JOIN conditions. ArrayAgg function in CTE Aggregate functions are not allowed in JOIN conditions. ArrayAgg function in CTE join ON clause Sep 20, 2024
@millerdev
Copy link
Contributor

The code snippet appears to redefine q without referencing it. Was that a typo? Does it work if you reference the earlier-defined q instead of A? Aside: consider using a different name for the first query rather than re-defining q.

a_query = A.objects.annotate(b_ids=ArrayAgg("b__id", distinct=True))

q = (
    cte.join(
        a_query, 
        b_ids__overlap=...

@a4tarasiuk
Copy link
Author

@millerdev Yeah... my bad. It was a typo.

A small update.
The issue is still reproduced, but to solve my task I had to use multiple nested CTEs so I don't have this problem anymore.
When we manipulate with array_agg field on CTE level it works ok. The issue is only with the most outer query (its mapping to raw SQL).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants