Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: Example of converting 2D coordinates to regular grid #209

Open
snowman2 opened this issue Jan 19, 2021 · 11 comments
Open

DOC: Example of converting 2D coordinates to regular grid #209

snowman2 opened this issue Jan 19, 2021 · 11 comments
Labels
documentation Documentation related issue proposal Idea for a new feature.

Comments

@snowman2
Copy link
Member

Your data has unevenly spaced 2D lat/lon values.

That means that either:

  1. Your data is not geographic and was re-projected to lat/lon in the 2D space to preserve the coordinate locations.
  2. Your data is not represented in an evenly spaced grid.

If (1) is your problem, you will need to re-project your data from 2D latlon back to the projected coordinates in 1D, add the 1D x&y coordinates to your dataset. Then it should work.

If (2) is your problem, then you will need to decide on the resolution of your grid, the bounds of your area of interest, and resample your data to the new regularly spaced grid. Then it will work.

From: #47 (comment)

@snowman2
Copy link
Member Author

There are several options for the case where the data is unevently spaced and you want to convert it to a regular grid;

@snowman2
Copy link
Member Author

snowman2 commented Jan 19, 2021

geocube:
Step 1: https://gis.stackexchange.com/questions/384581/raster-to-geopandas/384691#384691

rds = xarray.open_dataset("path_to_file.nc")
df = rds.squeeze().to_dataframe().reset_index()
geometry = gpd.points_from_xy(df.x, df.y)
gdf = gpd.GeoDataFrame(df, crs=rds.rio.crs, geometry=geometry)

Step 2: https://corteva.github.io/geocube/stable/examples/rasterize_point_data.html

from geocube.api.core import make_geocube
from geocube.rasterize import rasterize_points_griddata

geo_grid = make_geocube(
    vector_data=gdf,
    resolution=(-0.1, 0.1),
    rasterize_function=rasterize_points_griddata,
)

@snowman2 snowman2 added proposal Idea for a new feature. documentation Documentation related issue labels Jan 19, 2021
@Zepy1
Copy link

Zepy1 commented Aug 26, 2021

There are several options for the case where the data is unevently spaced and you want to convert it to a regular grid;

* `pyresample`: https://pyresample.readthedocs.io/en/latest/swath.html (maybe @djhoese could assist here?)

* The `rasterize_points_` functions powered by scipy: https://github.com/corteva/geocube/blob/master/geocube/rasterize.py

* `gdal.Grid`: https://gdal.org/python/osgeo.gdal-module.html#Grid

Also the Xoak library seems promising to perform point-wise selection of irregularly spaced data : https://xoak.readthedocs.io/en/latest/examples/introduction.html

@snowman2
Copy link
Member Author

See: #202

@snowman2
Copy link
Member Author

Related #724

@j08lue
Copy link

j08lue commented Sep 4, 2024

For case 2 of the above - if you are dealing with a curvilinear lat/lon grid, you may want to check out the regridding pro tool, xESMF:

import xarray as xr
import xesmf as xe
import numpy as np
import rioxarray

ds = xr.open_dataset("curvilinear.nc")

dlon = dlat = 0.01
target_grid = {
    "lon": np.arange(ds["longitude"].min(), ds["longitude"].max() + dlon, dlon, dtype="float32"), 
    "lat": np.arange(ds["latitude"].min(), ds["latitude"].max() + dlat, dlat, "float32")
}

regridder = xe.Regridder(ds, target_grid, "bilinear", unmapped_to_nan=True)
ds_regridded = regridder(ds)

da_regridded = ds_regridded[VARIABLE_NAME]

da_regridded.rio.write_crs("epsg:4326", inplace=True)
da_regridded.rio.set_spatial_dims("lon", "lat", inplace=True)
da_regridded.rio.to_raster("rectilinear.tif")

@j08lue
Copy link

j08lue commented Sep 5, 2024

Here is an example reprojecting from 2D to 1D (regular) lat/lon grid using Rasterio 1.4 with src_geoloc_array and reconstructing a nice Xarray DataArray again:

NB: This requires Rasterio 1.4 (currently in beta: pip install rasterio==1.4b1)

import xarray as xr
import numpy as np
import rasterio
import rasterio.warp
import rioxarray
from rioxarray.rioxarray import affine_to_coords

PATH = "tas_rcp85_land-cpm_uk_2.2km_01_day_19801201-19811130.nc"
PARAMETER_NAME = "tas"
WGS84 = rasterio.crs.CRS.from_epsg(4326)

ds = xr.open_dataset(PATH)
lon2d = ds["longitude"].values
lat2d = ds["latitude"].values
source = ds[PARAMETER_NAME].isel(ensemble_member=0).values

src_height, src_width = lon2d.shape

dst_transform, dst_width, dst_height = rasterio.warp.calculate_default_transform(
    src_crs=WGS84,
    dst_crs=WGS84,
    width=src_width,
    height=src_height,
    src_geoloc_array=(lon2d, lat2d),
)

destination = np.full((len(source), dst_height, dst_width), np.nan)

data, transform = rasterio.warp.reproject(
    source,
    destination=destination,
    src_crs=WGS84,
    dst_crs=WGS84,
    dst_transform=dst_transform,
    dst_nodata=np.nan,
    src_geoloc_array=np.stack((lon2d, lat2d))
)

coords = affine_to_coords(transform, width=dst_width, height=dst_height, x_dim="x", y_dim="y")
coords.update(time=ds["time"])

filtered_attrs = ds[PARAMETER_NAME].attrs.copy()
filtered_attrs.pop("grid_mapping", None)

da = xr.DataArray(data, coords=coords, dims=("time", "y", "x"), name=PARAMETER_NAME, attrs=filtered_attrs)

As notebook here: https://gist.github.com/j08lue/e792b3c912c33e9191734af7e795b75c

This obviously makes a ton of assumptions - that your data is in float dtype, what the original 2D lat/lon fields are called.

@snowman2, you were probably looking for a solution that uses rioxarray in the first place, not only for affine_to_coords, right?

@snowman2
Copy link
Member Author

snowman2 commented Sep 5, 2024

you were probably looking for a solution that uses rioxarray in the first place

I would like to add geolocation support to rio.reproject at some point when I get time. Though, I wouldn't mind if someone beat me to it.

@j08lue
Copy link

j08lue commented Sep 5, 2024

What kind of interface do you have in mind?

How about:

rio.reproject(..., src_geoloc_array=("lon_2d", "lat_2d"))  # names of 2d coordinate variables in the Dataset
rio.reproject(..., src_geoloc_array=(lon_2d, lat_2d))  # 2d coordinate arrays
rio.reproject(..., src_geoloc_array=True)  # apply some heuristics to guess the arrays - not sure that kind of "magic" is appropriate for rioxarray

?

@snowman2
Copy link
Member Author

snowman2 commented Sep 6, 2024

names of 2d coordinate variables

I would rather pass on this idea presently.

2d coordinate arrays

rio.reproject(..., src_geoloc_array=(lon_2d, lat_2d))  # 2d coordinate arrays

Yes, this should be the simplest to implement and I would definitely like to see this work.

apply some heuristics to guess the arrays - not sure that kind of "magic" is appropriate for rioxarray

It would be great to have it automatically identify if only 2D coordinate arrays are present and use them. To do this, there would need to be code to search for 2D coordinate arrays.

@snowman2
Copy link
Member Author

snowman2 commented Nov 1, 2024

See #822

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Documentation related issue proposal Idea for a new feature.
Projects
None yet
Development

No branches or pull requests

3 participants