Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for timezone aware datetime objects to SQLAlchemy dialect #92

Open
chaudum opened this issue Jun 23, 2020 · 6 comments
Open

Comments

@chaudum
Copy link
Contributor

chaudum commented Jun 23, 2020

Inserting a timezone aware datetime object using SQLAlchemy fails, even though, the object could be serialized into an ISO-format string that contains a timezone offset.
https://github.com/crate/crate-python/blob/master/src/crate/client/sqlalchemy/dialect.py#L123-L125

from datetime import datetime, timezone, timedelta

dt = datetime(2020, 6, 23, 10, 0, 0, tzinfo=timezone.utc)
dt.strftime("%Y-%m-%dT%H:%M:%S.%f%z")
'2020-06-23T10:00:00.000000+0000'
# or
dt = datetime(2020, 6, 23, 12, 0, 0, tzinfo=timezone(timedelta(hours=2), name="CEST"))
dt.strftime("%Y-%m-%dT%H:%M:%S.%f%z")
'2020-06-23T12:00:00.000000+0200'
$ cr8 run-crate 4.1.5 -e CRATE_HEAP_SIZE=2G --disable-java-magic -- @crash  --hosts '{node.http_url}' -c "SELECT ('2020-06-23T10:00:00.000000+0000'::TIMESTAMP WITH TIME ZONE = '2020-06-23T12:00:00.000000+0200'::TIMESTAMP WITH TIME ZONE) AS equal"
# run-crate
===========

Skipping download, tarball alrady extracted at /home/christian/.cache/cr8/crates/crate-4.1.5
Starting Crate process
CrateDB launching:
    PID: 1812751
    Logs: /home/christian/.cache/cr8/crates/crate-4.1.5/logs/cr8-crate-run57244782.log
    Data: /tmp/tmpbpru7ibr (removed on stop)

Psql      : 127.0.0.1:5432
Http      : 127.0.0.1:4200
Transport : 127.0.0.1:4300
Cluster ready to process requests


# @crash
========

CONNECT OK
+-------+
| equal |
+-------+
| TRUE  |
+-------+
SELECT 1 row in set (0.017 sec)

Same applies for de-serializing.

It may even be nice to add support for TIMESTAMP WITH(OUT) TIME ZONE AT TIME ZONE fields.


This issue came up when I tried to insert a datetime object from the boto client response that is timezone aware, raising TimezoneUnawareException: Timezone aware datetime objects are not supported

@chaudum
Copy link
Contributor Author

chaudum commented Jun 23, 2020

With this small diff:

diff --git a/src/crate/client/sqlalchemy/dialect.py b/src/crate/client/sqlalchemy/dialect.py
index 1f51acf..31cc650 100644
--- a/src/crate/client/sqlalchemy/dialect.py
+++ b/src/crate/client/sqlalchemy/dialect.py
@@ -20,7 +20,7 @@
 # software solely pursuant to the terms of the relevant commercial agreement.
 
 import logging
-from datetime import datetime, date
+from datetime import datetime, date, timezone
 
 from sqlalchemy import types as sqltypes
 from sqlalchemy.engine import default, reflection
@@ -121,7 +121,7 @@ class DateTime(sqltypes.DateTime):
             if value is not None:
                 assert isinstance(value, datetime)
                 if value.tzinfo is not None:
-                    raise TimezoneUnawareException(DateTime.TZ_ERROR_MSG)
+                    value = value.astimezone(timezone.utc).replace(tzinfo=None)
                 return value.strftime('%Y-%m-%dT%H:%M:%S.%fZ')
             return value
         return process

The following example application would work:

from datetime import datetime, timezone

from sqlalchemy import create_engine, Column
from sqlalchemy.schema import PrimaryKeyConstraint
from sqlalchemy.ext.declarative import DeclarativeMeta, declarative_base
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy.types import DateTime, Float

Session = scoped_session(sessionmaker())
Base = declarative_base(metaclass=DeclarativeMeta)


class TestModel(Base):

    __tablename__ = "t1"
    __table_args__ = (PrimaryKeyConstraint("ts"),  {"schema": "doc"})

    ts = Column("ts", DateTime)


def main():
    engine = create_engine(
        "crate://",
        connect_args={
            "username": "crate",
            "password": None,
            "servers": "localhost:4200",
        },
        strategy="threadlocal",
    )
    Session.configure(bind=engine)
    Base.metadata.bind = engine

    session = Session()
    session.execute("""
        CREATE TABLE IF NOT EXISTS doc.t1 (
          ts TIMESTAMP WITH TIME ZONE
        )
    """)
    session.add(
        TestModel(ts=datetime(2020, 6, 23, 12, 0, 0, tzinfo=timezone.utc))
    )
    session.commit()


if __name__ == "__main__":
    main()

@mfussenegger
Copy link
Member

I'm not sure if it is a good idea to discard information implicitly doing the serialization.

Converting to UTC would keep the time information itself intact, but it would lose the timezone. Depending on the users use-case this may be required information and a user might assume that TIMESTAMP WITH TIME ZONE keeps the time zone info as well, but it does not. (Which is even more confusing given that TIME WITH TIME ZONE does)

Seems to me that being strict about this is the better default.

Maybe some opt-in flag cloud be provided.

@chaudum
Copy link
Contributor Author

chaudum commented Jun 24, 2020

I'm not sure if it is a good idea to discard information implicitly doing the serialization.

dt.strftime("%Y-%m-%dT%H:%M:%S.%f%z")

This would not discard any information, but would add the TZ offset to the ISO string.

Converting to UTC would keep the time information itself intact, but it would lose the timezone. Depending on the users use-case this may be required information and a user might assume that TIMESTAMP WITH TIME ZONE keeps the time zone info as well, but it does not. (Which is even more confusing given that TIME WITH TIME ZONE does)

Agree, the comment above wasn't meant as a solution :)
The behaviour could/should be different for different column types (e.g. TIMESTAMP WITH TIME ZONE vs. TIMESTAMP WITHOUT TIME ZONE).

Seems to me that being strict about this is the better default.

Maybe some opt-in flag cloud be provided.

@mfussenegger
Copy link
Member

This would not discard any information, but would add the TZ offset to the ISO string.

You'd still loose part of the information eventually, even with TIMESTAMP WITH TIME ZONE. The "which offset was there originally" part is not persisted and you cannot get it back out unless you save it into a separate column.

@chaudum
Copy link
Contributor Author

chaudum commented Jun 25, 2020

This would not discard any information, but would add the TZ offset to the ISO string.

You'd still loose part of the information eventually, even with TIMESTAMP WITH TIME ZONE. The "which offset was there originally" part is not persisted and you cannot get it back out unless you save it into a separate column.

Got you!

But isn't it also somehow clear, that if the database stores the timestamp in UTC, that it will be converted and the offset is "lost"? Usually client applications take care that the timestamp from the server is converted into a "client local" timezone.

@amotl
Copy link
Member

amotl commented Dec 21, 2020

Hi.

Without reading about the specific details here, I just wanted to share a chance find. SQLAlchemy-Utc is a drop-in replacement of SQLAlchemy’s built-in DateTime type with timezone=True option enabled.

With kind regards,
Andreas.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants