`TypeError: data type 'boolean' not understood` when updating to dask-sql 2024.5.0 and installing dask-expr 1.1.14 #1346

Open
teresama opened this issue Sep 27, 2024 · 0 comments

Updating dask-sql to version 2024.5.0 requires dask-expr to be installed.
On my machine I have installed:

pandas 2.2.3
dask 2024.9.0
dask-expr 1.1.14
dask-sql 2024.5.0
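
For anyone trying to reproduce this, a list like the one above can be gathered from the active environment with the standard library. This is only a sketch: the `PACKAGES` list and the `installed_versions` helper are hypothetical names introduced here, not part of dask or dask-sql.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as they appear on PyPI (assumed to match the list above).
PACKAGES = ["pandas", "dask", "dask-expr", "dask-sql"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions(PACKAGES)
for name, ver in versions.items():
    print(f"{name} {ver if ver is not None else 'not installed'}")
```

Running it inside the same virtual environment as the failing script avoids reporting versions from a different interpreter.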

I am getting the error:
.venv/lib/python3.10/site-packages/dask/utils.py", line 1241, in __call__
    return getattr(__obj, self.method)(*args, **kwargs)
TypeError: data type 'boolean' not understood
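
For context, the wording of this message matches what NumPy raises when it is handed the string "boolean", which names pandas' nullable boolean extension dtype and not a NumPy dtype. The sketch below only reproduces the message in isolation. Whether this is the code path that actually fails inside dask or dask-sql is an assumption and is not confirmed by the traceback above.

```python
import numpy as np
import pandas as pd

# "boolean" is the string alias of pandas' nullable BooleanDtype,
# so pandas can resolve it to an extension dtype.
pandas_dtype = pd.api.types.pandas_dtype("boolean")
print(pandas_dtype)

# NumPy has no dtype with that name and rejects the string.
message = None
try:
    np.dtype("boolean")
except TypeError as exc:
    message = str(exc)
    print(message)
```

If the failure does come from a conversion like this, it would suggest that a pandas extension dtype name is being passed to a NumPy-only call somewhere in the query plan, possibly for the `IS NOT NULL` filter, but that remains a guess.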

when running:

import pandas as pd
import dask.dataframe as dd
from dask_sql import Context

data = {
    "column8": []
}
df = pd.DataFrame(data)

ddf = dd.from_pandas(df, npartitions=1)

c = Context()
c.create_table("tablename", ddf)
query = """
WITH
    sampled_table AS (
        SELECT "column8" AS "NEW_NAME"
        FROM tablename t
    ),
    table2 AS (
        SELECT "NEW_NAME" AS output1, COUNT(*) AS output2
        FROM sampled_table t
        GROUP BY "NEW_NAME"
    ),
    outputtable AS (
        SELECT *
        FROM table2 t
        WHERE output1 IS NOT NULL
    )
SELECT *
FROM outputtable"""

result = c.sql(query)
print(result.compute()) 

The same code worked before the update, with dask==2024.1.1 and dask-sql==2024.3.0.
