Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixup/oracle dates #810

Merged
merged 7 commits into from
Jun 17, 2024
Merged

Fixup/oracle dates #810

merged 7 commits into from
Jun 17, 2024

Conversation

detule
Copy link
Collaborator

@detule detule commented May 28, 2024

Fixes long-standing ORACLE issues with using batch sizes greater than one when writing to tables with DATE and TIMESTAMP fields.

See, for example #349, #350, #391

@detule detule requested review from simonpcouch and hadley May 28, 2024 01:30
Copy link
Collaborator

@simonpcouch simonpcouch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unable to push this around locally before the upcoming release, but these changes feel like a step forward for me!

R/driver-oracle.R Outdated Show resolved Hide resolved
tests/testthat/test-driver-oracle.R Outdated Show resolved Hide resolved
dbGetQuery(conn, query)
res <- dbGetQuery(conn, query)

res$data_type <- as.numeric(res$data_type)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to revert this column back to another type after processing it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Simon:

Good question - I think the query can/does return all NULLs for that field, which in turn I think gives a column full of NA_character_ in the data-frame.

Comment on lines 136 to 137
res[res$field.type == "DATE", c("data_type", "column_size")] <- c(91, 6)
res[grepl("TIMESTAMP", res$field.type), c("data_type", "column_size")] <- c(93, 16)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very minor quibble but I don't trust this sort of cross-column subset-assignment, so I'd prefer to see something more like:

res$data_type[res$field.type == "DATE"] <- 91
res$column_size[res$field.type == "DATE"] <- 6

I suspect that might also eliminate the need for as.numeric(res$data_type) above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Done - let me know if that's not what you had in mind.

I think the as.numeric may still be needed, but happy to update if you guys think there's a better way.

 ┃  Browse[1]> res <- dbGetQuery(conn, query)
 ┃  Browse[1]> str(res)
 ┃  'data.frame':   5 obs. of  17 variables:
 ┃   $ name                   : chr  "datetime" "date" "integer" "double" ...
 ┃   $ field.type             : chr  "TIMESTAMP(6)" "DATE" "NUMBER" "BINARY_DOUBLE" ...
 ┃   $ table_name             : chr  "test_table" "test_table" "test_table" "test_table" ...
 ┃   $ schema_name            : chr  "SA" "SA" "SA" "SA" ...
 ┃   $ catalog_name           : chr  NA NA NA NA ...
 ┃   $ data_type              : chr  NA NA NA NA ...
 ┃   $ column_size            : num  NA NA NA NA 255
 ┃   $ buffer_length          : num  11 16 40 8 255
 ┃   $ decimal_digits         : num  6 NA 0 NA NA
 ┃   $ numeric_precision_radix: chr  NA NA NA NA ...
 ┃   $ remarks                : chr  NA NA NA NA ...
 ┃   $ column_default         : chr  NA NA NA NA ...
 ┃   $ sql_data_type          : chr  NA NA NA NA ...
 ┃   $ sql_datetime_subtype   : chr  NA NA NA NA ...
 ┃   $ char_octet_length      : num  0 0 0 0 255
 ┃   $ ordinal_position       : num  1 2 3 4 5
 ┃   $ nullable               : num  1 1 1 1 1
 ┃  Browse[1]>     isDate <- res$field.type == "DATE"
 ┃      res$data_type[isDate] <- 91
 ┃      res$column_size[isDate] <- 6
 ┃      isTimestamp <- grepl("TIMESTAMP", res$field.type)
 ┃      res$data_type[isTimestamp] <- 93
 ┃      res$column_size[isTimestamp] <- 16
 ┃  Browse[1]> str(res)
 ┃  'data.frame':   5 obs. of  17 variables:
 ┃   $ name                   : chr  "datetime" "date" "integer" "double" ...
 ┃   $ field.type             : chr  "TIMESTAMP(6)" "DATE" "NUMBER" "BINARY_DOUBLE" ...
 ┃   $ table_name             : chr  "test_table" "test_table" "test_table" "test_table" ...
 ┃   $ schema_name            : chr  "SA" "SA" "SA" "SA" ...
 ┃   $ catalog_name           : chr  NA NA NA NA ...
 ┃   $ data_type              : chr  "93" "91" NA NA ...
 ┃   $ column_size            : num  16 6 NA NA 255
 ┃   $ buffer_length          : num  11 16 40 8 255
 ┃   $ decimal_digits         : num  6 NA 0 NA NA
 ┃   $ numeric_precision_radix: chr  NA NA NA NA ...
 ┃   $ remarks                : chr  NA NA NA NA ...
 ┃   $ column_default         : chr  NA NA NA NA ...
 ┃   $ sql_data_type          : chr  NA NA NA NA ...
 ┃   $ sql_datetime_subtype   : chr  NA NA NA NA ...
 ┃   $ char_octet_length      : num  0 0 0 0 255
 ┃   $ ordinal_position       : num  1 2 3 4 5
 ┃   $ nullable               : num  1 1 1 1 1

This is a bit of a function of the silly way the data_type column is formulated in the query ( always NULL ) - we can probably do something better to make sure that the column comes back as numeric, but there is also probably some value in keeping the query as close as the original SQLColumns implementation for the OEM oracle driver.

@detule detule merged commit 3b65fb2 into r-dbi:main Jun 17, 2024
16 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants