Expose sample on the rioxarray accessor #800

Open
martinfleis opened this issue Aug 29, 2024 · 3 comments
Labels
proposal Idea for a new feature.

Comments

@martinfleis

I was recently comparing the performance of point sampling from a raster in xvec (xarray-contrib/xvec#81) and found that sampling with xarray's sel is about 40x slower than with rasterio's sample method (see the notebook in the linked issue).

I could call sample on the rasterio dataset held by rio._manager, as in list(dtm_da.rio._manager.acquire().sample(list(zip(x, y)))), but that relies on a private API of rioxarray.

Would it be in scope to expose sample directly on the rio accessor? We could then consume it in xvec when dealing with rioxarray-backed DataArrays.
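For readers unfamiliar with point sampling, the following is a minimal pure-Python sketch of the idea: each (x, y) coordinate is converted to a row/column index and the cell at that index is read. It is not rasterio's implementation. It assumes a north-up grid with no rotation, and the function name `sample_points` and all of its parameters are hypothetical, chosen only for illustration.

```python
import math


def sample_points(grid, origin_x, origin_y, pixel_width, pixel_height, points, nodata=None):
    """Return the cell value under each (x, y) point of a north-up grid.

    origin_x / origin_y are the coordinates of the top-left corner of the grid;
    pixel_width / pixel_height are positive cell sizes (assumed, hypothetical names).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    values = []
    for x, y in points:
        # Columns grow with x; rows grow as y decreases (north-up raster).
        col = math.floor((x - origin_x) / pixel_width)
        row = math.floor((origin_y - y) / pixel_height)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            values.append(grid[row][col])
        else:
            # Points outside the grid get the nodata placeholder.
            values.append(nodata)
    return values


grid = [
    [1, 2, 3],
    [4, 5, 6],
]
# Top-left corner at (100, 50) with 10-unit square pixels.
result = sample_points(
    grid, 100.0, 50.0, 10.0, 10.0,
    [(105.0, 45.0), (125.0, 35.0), (500.0, 0.0)],
)
print(result)  # [1, 6, None]
```

The last point lies outside the grid, so it maps to the `nodata` placeholder rather than raising an error.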

@martinfleis martinfleis added the proposal Idea for a new feature. label Aug 29, 2024
@snowman2
Member

rio._manager is something that is easily lost after performing operations on the DataArray|Dataset and is only available when the DataArray|Dataset is opened with rioxarray.open_rasterio or when using engine="rasterio". Unfortunately, due to these limitations, exposing sample on the rio accessor would likely cause confusion as it would not always work as expected.

I would be interested to know if this method is any faster.

If it is, that may enable us to add rio.sample:

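from rasterio.io import MemoryFile

def sample(self, *args, **kwargs):
    # Fast path: sample from the file handle rioxarray opened, if it still exists.
    try:
        return list(self._manager.acquire().sample(*args, **kwargs))
    except AttributeError:
        pass
    # Fallback: write the data to an in-memory raster and sample from that.
    with MemoryFile() as memfile:
        self._xds.rio.to_raster(memfile.name)
        with memfile.open() as dst:
            # sample() is lazy, so consume it before the dataset is closed.
            return list(dst.sample(*args, **kwargs))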
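The limitation described in this comment, that rio._manager exists only on file-backed objects, could be made explicit with a small check before sampling. The sketch below uses stand-in objects rather than real rioxarray accessors, and the helper name `get_file_handle` is hypothetical. It assumes a lost manager shows up either as a missing attribute or as None, which is what the AttributeError handling above suggests but which is not confirmed here.

```python
from types import SimpleNamespace


def get_file_handle(rio_accessor):
    """Return the open dataset behind an accessor, or None if there is no manager.

    Hypothetical helper: assumes a lost manager appears as a missing
    attribute or as None on the accessor.
    """
    manager = getattr(rio_accessor, "_manager", None)
    if manager is None:
        return None
    return manager.acquire()


# Stand-ins for illustration only; these are not rioxarray objects.
file_backed = SimpleNamespace(_manager=SimpleNamespace(acquire=lambda: "open dataset"))
derived = SimpleNamespace()  # e.g. the result of arithmetic on a DataArray

handle_a = get_file_handle(file_backed)
handle_b = get_file_handle(derived)
print(handle_a, handle_b)
```

A caller could use the None result to choose a different sampling path or to raise a clear error, instead of failing in a way that depends on how the object was created.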

@martinfleis
Author

Using the DTM from the notebook above, the round trip via MemoryFile takes 36s, about 4x longer than xarray's built-in sel method, so there's no point in doing that.

It seems that exposing rasterio's sample is not a feasible option. Thanks anyway!

@snowman2
Member

These observations are also useful to be aware of: xarray-contrib/xvec#81 (comment)

Thanks for the proposal!
