Implement "filetype" argument #11

Closed
Aariq opened this issue Mar 12, 2024 · 5 comments · Fixed by #21
@Aariq
Collaborator

Aariq commented Mar 12, 2024

If tar_* functions are specific to packages and data types, then a filetype argument would let users override the default file type that targets are stored as (e.g. GeoTIFF vs netCDF). I could imagine filetype being an argument to tar_terra_rast(), or an argument to a function supplied to the format argument of tar_terra_rast().

For example:
Option 1

tar_terra_rast <-
  function(name, command, pattern = NULL, filetype = c("GeoTIFF", "netCDF"), ...)

Option 2

tar_terra_rast <-
  function(name, command, pattern = NULL, format = format_terra_rast(filetype = c("GeoTIFF", "netCDF")), ...)

where format_terra_rast() returns the result of a call to tar_format()

The first option is probably preferable, unless users need to customize the format in other ways.
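Both options above write the default as a vector of choices, which only works if the function body collapses it to a single value. A minimal base R sketch of that idiom follows; choose_filetype() is a hypothetical stand-in for illustration, not part of the package, and the choice names are copied from the signatures above rather than from GDAL's driver list.

```r
# Hypothetical helper: resolve a vector-of-choices default to one value.
choose_filetype <- function(filetype = c("GeoTIFF", "netCDF")) {
  # With no argument supplied, match.arg() returns the first choice;
  # otherwise it validates the supplied value against the choices.
  match.arg(filetype)
}

default_choice  <- choose_filetype()          # first element of the default
explicit_choice <- choose_filetype("netCDF")  # a valid explicit choice
# An unknown value raises an error, caught here so the script keeps running.
bad_choice <- tryCatch(choose_filetype("wat"), error = function(e) "rejected")
```

Without a call like match.arg() (or rlang::arg_match(), with its result assigned), the full two-element vector would be passed on to whatever writes the file.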

@Aariq
Collaborator Author

Aariq commented Mar 12, 2024

Related comment from PR #7:

I had a bit of a play and couldn't quite come up with one. I thought you could pass the argument in and R's lexical scoping would handle it, but not quite.

I think this might be another way to implement this. It is initially more code, but I think the core of the function becomes easier to extend. Thank you to @maelle for showing me https://rlang.r-lib.org/reference/arg_match.html

# Each factory returns a write function with the file type hard-coded.
write_raster_gtiff <- function() {
    function(object, path) {
        terra::writeRaster(
            x = object,
            filename = path,
            overwrite = TRUE,
            filetype = "GTiff"
        )
    }
}

write_raster_netcdf <- function() {
    function(object, path) {
        terra::writeRaster(
            x = object,
            filename = path,
            overwrite = TRUE,
            filetype = "netCDF"
        )
    }
}

create_write_fun <- function(filetype = c("GTiff", "netCDF")) {
    # arg_match() validates the value and returns it; assign the result so
    # the default vector collapses to its first element
    filetype <- rlang::arg_match(filetype)
    switch(filetype,
           "GTiff" = write_raster_gtiff(),
           "netCDF" = write_raster_netcdf()
    )
}

create_write_fun("GTiff")
#> function(object, path) {
#>         terra::writeRaster(
#>             x = object,
#>             filename = path,
#>             overwrite = TRUE,
#>             filetype = "GTiff"
#>         )
#>     }
#> <environment: 0x14f561368>
create_write_fun("netCDF")
#> function(object, path) {
#>         terra::writeRaster(
#>             x = object,
#>             filename = path,
#>             overwrite = TRUE,
#>             filetype = "netCDF"
#>         )
#>     }
#> <environment: 0x139751520>
create_write_fun("wat")
#> Error in `create_write_fun()`:
#> ! `filetype` must be one of "GTiff" or "netCDF", not "wat".

Created on 2024-03-11 with reprex v2.1.0

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.3 (2024-02-29)
#>  os       macOS Sonoma 14.3.1
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Australia/Hobart
#>  date     2024-03-11
#>  pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.2   2023-12-11 [1] CRAN (R 4.3.1)
#>  digest        0.6.34  2024-01-11 [1] CRAN (R 4.3.1)
#>  evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.1)
#>  fansi         1.0.6   2023-12-08 [1] CRAN (R 4.3.1)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.0)
#>  fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.0)
#>  glue          1.7.0   2024-01-09 [1] CRAN (R 4.3.1)
#>  htmltools     0.5.7   2023-11-03 [1] CRAN (R 4.3.1)
#>  knitr         1.45    2023-10-30 [1] CRAN (R 4.3.1)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.1)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
#>  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.0)
#>  purrr         1.0.2   2023-08-10 [1] CRAN (R 4.3.0)
#>  R.cache       0.16.0  2022-07-21 [2] CRAN (R 4.3.0)
#>  R.methodsS3   1.8.2   2022-06-13 [2] CRAN (R 4.3.0)
#>  R.oo          1.26.0  2024-01-24 [2] CRAN (R 4.3.1)
#>  R.utils       2.12.3  2023-11-18 [2] CRAN (R 4.3.1)
#>  reprex        2.1.0   2024-01-11 [2] CRAN (R 4.3.1)
#>  rlang         1.1.3   2024-01-10 [1] CRAN (R 4.3.1)
#>  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.3.1)
#>  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.0)
#>  sessioninfo   1.2.2   2021-12-06 [2] CRAN (R 4.3.0)
#>  styler        1.10.2  2023-08-29 [2] CRAN (R 4.3.0)
#>  utf8          1.2.4   2023-10-22 [1] CRAN (R 4.3.1)
#>  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.1)
#>  withr         3.0.0   2024-01-16 [1] CRAN (R 4.3.1)
#>  xfun          0.42    2024-02-08 [1] CRAN (R 4.3.1)
#>  yaml          2.3.8   2023-12-11 [1] CRAN (R 4.3.1)
#> 
#>  [1] /Users/nick/Library/R/arm64/4.3/library
#>  [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

Originally posted by @njtierney in #7 (comment)

@njtierney
Owner

We don't actually need to write it out each time!

Here's a demo that shows that filetype is just not evaluated until you use the function:

write_raster_filetype <- function(filetype) {
  function(object, path) {
    cat(filetype)
    # terra::writeRaster(
    #   x = object,
    #   filename = path,
    #   overwrite = TRUE,
    #   filetype = filetype
    # )
  }
}

create_write_fun <- function(filetype) {
  rlang::arg_match0(filetype, c("GTiff", "netCDF"))
  write_raster_filetype(filetype)
}

thingy <- create_write_fun("GTiff")
thingy
#> function(object, path) {
#>     cat(filetype)
#>     # terra::writeRaster(
#>     #   x = object,
#>     #   filename = path,
#>     #   overwrite = TRUE,
#>     #   filetype = filetype
#>     # )
#>   }
#> <environment: 0x1306915c0>
# but then it prints the filetype!
thingy()
#> GTiff

Created on 2024-03-12 with reprex v2.1.0

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.3 (2024-02-29)
#>  os       macOS Sonoma 14.3.1
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Australia/Hobart
#>  date     2024-03-12
#>  pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.2   2023-12-11 [1] CRAN (R 4.3.1)
#>  digest        0.6.34  2024-01-11 [1] CRAN (R 4.3.1)
#>  evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.1)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.0)
#>  fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.0)
#>  glue          1.7.0   2024-01-09 [1] CRAN (R 4.3.1)
#>  htmltools     0.5.7   2023-11-03 [1] CRAN (R 4.3.1)
#>  knitr         1.45    2023-10-30 [1] CRAN (R 4.3.1)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.1)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
#>  purrr         1.0.2   2023-08-10 [1] CRAN (R 4.3.0)
#>  R.cache       0.16.0  2022-07-21 [2] CRAN (R 4.3.0)
#>  R.methodsS3   1.8.2   2022-06-13 [2] CRAN (R 4.3.0)
#>  R.oo          1.26.0  2024-01-24 [2] CRAN (R 4.3.1)
#>  R.utils       2.12.3  2023-11-18 [2] CRAN (R 4.3.1)
#>  reprex        2.1.0   2024-01-11 [2] CRAN (R 4.3.1)
#>  rlang         1.1.3   2024-01-10 [1] CRAN (R 4.3.1)
#>  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.3.1)
#>  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.0)
#>  sessioninfo   1.2.2   2021-12-06 [2] CRAN (R 4.3.0)
#>  styler        1.10.2  2023-08-29 [2] CRAN (R 4.3.0)
#>  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.1)
#>  withr         3.0.0   2024-01-16 [1] CRAN (R 4.3.1)
#>  xfun          0.42    2024-02-08 [1] CRAN (R 4.3.1)
#>  yaml          2.3.8   2023-12-11 [1] CRAN (R 4.3.1)
#> 
#>  [1] /Users/nick/Library/R/arm64/4.3/library
#>  [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────
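The demo above relies on the returned function finding filetype in the environment of the factory that created it, and on R not evaluating the argument until it is first used. A small base R sketch of the same pattern follows, with force() added so the value is pinned when the factory is called. make_writer() is a hypothetical name, and the writer only builds a message instead of calling terra::writeRaster().

```r
# Hypothetical factory: the returned function keeps `filetype` in its
# enclosing environment.
make_writer <- function(filetype) {
  force(filetype)  # evaluate the argument now rather than at first use
  function(object, path) {
    paste0("would write ", path, " as ", filetype)
  }
}

ft <- "GTiff"
writer <- make_writer(ft)
ft <- "netCDF"  # reassigning the outer variable does not affect `writer`

message_out <- writer(NULL, "out.tif")
message_out
#> [1] "would write out.tif as GTiff"
```

This only holds while the function keeps its enclosing environment; if the function is stored or rebuilt without it, filetype can no longer be found.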

@brownag
Contributor

brownag commented Mar 12, 2024

I think we can generalize the above approaches a bit further so we do not have to pre-define functions for all the combinations of filetype, nor do we have to independently define the choices for filetype. I define a function create_format_terra_raster() (could just be format_terra_raster()) to do the work.

As above, when targets evaluates the supplied write function, filetype (or any other object that is not an argument of the constructed function) will not be defined.

To work around this, we can create a function of the right form, then modify the body of that function to inject constant values for filetype or other parameters. While we are at it, we can also let users specify custom GDAL creation options via the gdal argument to writeRaster().

Something like:

tar_terra_rast <- function(name,
                           command,
                           pattern = NULL,
                           filetype = NULL,
                           gdal = NULL,
                           ...,
                           tidy_eval = targets::tar_option_get("tidy_eval"),
                           packages = targets::tar_option_get("packages"),
                           library = targets::tar_option_get("library"),
                           repository = targets::tar_option_get("repository"),
                           iteration = targets::tar_option_get("iteration"),
                           error = targets::tar_option_get("error"),
                           memory = targets::tar_option_get("memory"),
                           garbage_collection = targets::tar_option_get("garbage_collection"),
                           deployment = targets::tar_option_get("deployment"),
                           priority = targets::tar_option_get("priority"),
                           resources = targets::tar_option_get("resources"),
                           storage = targets::tar_option_get("storage"),
                           retrieval = targets::tar_option_get("retrieval"),
                           cue = targets::tar_option_get("cue")) {

    name <- targets::tar_deparse_language(substitute(name))

    envir <- targets::tar_option_get("envir")

    command <- targets::tar_tidy_eval(
        expr = as.expression(substitute(command)),
        envir = envir,
        tidy_eval = tidy_eval
    )

    pattern <- targets::tar_tidy_eval(
        expr = as.expression(substitute(pattern)),
        envir = envir,
        tidy_eval = tidy_eval
    )

    # could pull defaults from geotargets package options
    if (is.null(filetype)) {
        filetype <- "GTiff"
    }

    targets::tar_target_raw(
        name = name,
        command = command,
        pattern = pattern,
        packages = packages,
        library = library,
        format = create_format_terra_raster(filetype = filetype, gdal = gdal, ...),
        # ...
    )
}

#' @param filetype File format expressed as GDAL driver names passed to `terra::writeRaster()`
#' @param gdal GDAL driver specific datasource creation options passed to `terra::writeRaster()`
#' @param ... Additional arguments not yet used
#' @noRd
create_format_terra_raster <- function(filetype, gdal, ...) {

    if (!requireNamespace("terra")) {
        stop("package 'terra' is required", call. = FALSE)
    }

    # get list of drivers available for writing depending on what the user's GDAL supports
    drv <- terra::gdal(drivers = TRUE)
    drv <- drv[drv$type == "raster" & grepl("write", drv$can), ]

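    # apply the default first: match.arg() returns the first choice when
    # `arg` is NULL, so a NULL check placed after it would never trigger
    if (is.null(filetype)) {
        filetype <- "GTiff"
    }

    filetype <- match.arg(filetype, drv$name)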

    .write_terra_raster <- function(object, path) {
        terra::writeRaster(
            object,
            path,
            filetype = NULL,
            overwrite = TRUE,
            gdal = NULL
        )
    }
    body(.write_terra_raster)[[2]][["filetype"]] <- filetype
    body(.write_terra_raster)[[2]][["gdal"]] <- gdal

    targets::tar_format(
        read = function(path) terra::rast(path),
        write = .write_terra_raster,
        marshal = function(object) terra::wrap(object),
        unmarshal = function(object) terra::unwrap(object)
    )
}
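The body-injection step in create_format_terra_raster() can be seen in isolation with base R alone. The sketch below assumes, as the comments above suggest, that the write function ends up being evaluated without the environment it was defined in; that is mimicked here by resetting the function's environment. write_template(), inject_filetype(), and make_closure() are hypothetical names used only for illustration.

```r
# Template with a placeholder argument value that is overwritten below.
write_template <- function(object, path) {
  list(path = path, filetype = NULL)
}

inject_filetype <- function(fun, filetype) {
  # body(fun) is the `{` call; element 2 is the list(...) call inside it.
  body(fun)[[2]][["filetype"]] <- filetype
  fun
}

injected <- inject_filetype(write_template, "GTiff")
environment(injected) <- baseenv()  # drop the defining environment
injected_result <- injected(NULL, "out.tif")$filetype  # constant lives in the body

# Contrast: a closure that depends on its enclosing environment.
make_closure <- function(filetype) {
  force(filetype)
  function(object, path) filetype
}
stripped <- make_closure("GTiff")
environment(stripped) <- baseenv()  # `filetype` is no longer reachable
closure_result <- tryCatch(
  stripped(NULL, "out.tif"),
  error = function(e) "lookup failed"
)
```

Here injected_result is "GTiff" because the value is part of the function's code, while the closure fails its lookup once its environment is gone. Note that assigning NULL to a named element of a call removes that element, so injecting gdal = NULL as in the code above drops the gdal argument rather than passing NULL.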

@Aariq
Collaborator Author

Aariq commented Mar 12, 2024

That's cool. Never seen body() before, but that's what I was looking for.

@Aariq
Collaborator Author

Aariq commented Mar 13, 2024

Implemented for tar_terra_rast() in #15 but still needs implementation in tar_terra_vect()
