Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented custom SQL DB registration #917

Merged
merged 9 commits into from
Nov 6, 2024
Merged

Conversation

zaleslaw
Copy link
Collaborator

@zaleslaw zaleslaw commented Oct 10, 2024

This pull request introduces several enhancements to the dataframe-jdbc module, primarily focusing on adding support for custom database types (DbType) in various methods. Additionally, it includes a minor change to make the H2 class extensible.

Also, this PR in the client application is based on 0.15-dev built on this branch

Enhancements to database type support:

  • dataframe-jdbc/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/readJdbc.kt:
    • Added optional dbType parameter to multiple methods, allowing custom database type objects to be passed. This parameter defaults to null and, if not provided, the database type is determined from the connection.

Minor changes:

  • dataframe-jdbc/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/db/H2.kt:
    • Changed H2 class to be open, allowing it to be extended.

For finalization, it requires

This commit introduces an optional `dbType` parameter to various JDBC read functions. It allows for specifying the database type directly, enhancing flexibility and control over query execution. This change ensures backward compatibility by defaulting to type inference when `dbType` is not provided.
This update introduces the ability to specify custom database types in the data schema generation process. The changes include the addition of a `dbTypeClassName` field and modifications to pass `DbType` to relevant functions. This enhances flexibility in handling various database types beyond the default configurations.
This update refactors the DataSchemaGenerator to use ServiceLoader for loading DbTypeProvider and removes hard-coded class loading. It also changes the annotation parameter from dbTypeClassName to dbTypeKClass for better type safety.
Copy link
Contributor

Generated sources will be updated after merging this PR.
Please inspect the changes in here.

@Jolanrensen Jolanrensen marked this pull request as ready for review October 14, 2024 11:02
@@ -142,12 +145,13 @@ public fun DataFrame.Companion.readSqlTable(
tableName: String,
limit: Int = DEFAULT_LIMIT,
inferNullability: Boolean = true,
dbType: DbType? = null,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While having a dbType argument here is great for building and debugging support for new DbTypes, I do feel like it's not "complete custom support" of new datatypes.

To be fully extensible, like with DataFrame.read(), I think we should support having service-registered DbTypes. Then a user should be able to do:

implementation("org.jetbrains.kotlinx.dataframe...")
implementation("com.custom.dbtype-for-df")

and DataFrame.readSqlTable(MyCustomDbConnection) should work, recognizing the custom DBType that's on the classpath.

Still, it's just an idea and having this dbType argument here is still a must for when the automatic dbtype detection fails.

@Jolanrensen Jolanrensen marked this pull request as draft October 14, 2024 11:02
@zaleslaw zaleslaw marked this pull request as ready for review November 6, 2024 15:54
@zaleslaw zaleslaw merged commit 65ce2ea into master Nov 6, 2024
3 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants