Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

grdimage does not correctly plot xarray DataArrays when they don't contain the redundant longitude at 360 E #375

Closed
MarkWieczorek opened this issue Nov 8, 2019 · 7 comments · Fixed by #476
Labels
bug Something isn't working
Milestone

Comments

@MarkWieczorek
Copy link
Contributor

MarkWieczorek commented Nov 8, 2019

grdimage of pygmt does not correctly plot xarray DataArrays when they do not contain the redundant column of data at 360 E.

Using pygmt directly with a netcdf file provides the correct output, such as with a command like:

fig.grdimage('test.grd', region='g', projection='A0/0/6i')

test

However, here is the same output generated by pygmt when using an xarray in memory:

fig.grdimage(grid, region='g', projection='A0/0/6i')

longitude

@MarkWieczorek
Copy link
Contributor Author

And here is a simple script that demonstrates this problem:

# create a grid that spans 0 359 E, and -89 to 90 N.
longitude = np.arange(0, 360, 1)
latitude = np.arange(-89, 91, 1)
x = np.sin(np.deg2rad(longitude))
y = np.linspace(start=0, stop=1, num=180)
data = y[:, np.newaxis] * x

# create a DataArray, and then export this as a netcdf file
dataarray = xr.DataArray(data, coords=[('latitude', latitude,
                                       {'units': 'degrees_north'}),
                                       ('longitude', longitude,
                                       {'units': 'degrees_east'})], 
                         attrs = {'actual_range': [-1, 1]})
dataset = dataarray.to_dataset(name='dataarray')
dataset.to_netcdf('test.grd')

fig = pygmt.Figure()

# create a projected image using the DataArray in memory and the netcdf file
fig.grdimage(dataset.dataarray, region='g', projection='A0/0/6i')
fig.grdimage('test.grd', region='g', projection='A0/0/6i', X='6.2i')

test

@MarkWieczorek MarkWieczorek changed the title grdimage does not correctly plot grids when they don't contain the redundant 360 E. grdimage does not correctly plot xarray DataArrays when they don't contain the redundant longitude at 360 E Nov 9, 2019
@weiji14 weiji14 added the bug Something isn't working label Nov 10, 2019
@leouieda
Copy link
Member

This might be related to pixel registered vs grid node registered grids. GMT makes a lot of internal assumptions when reading the netCDF file that we don't when converting the xarray.DataArray.

@MarkWieczorek
Copy link
Contributor Author

MarkWieczorek commented Nov 12, 2019

If I use the -V option with the file test.grd, gmt warns me that it assumes it is gridline registered. If I define the longitude attribute actual_range = [0, 359], then I don't get this warning (in this case, the gmt code confirms that it is gridline and doesn't need to make any guesses). In either case, defining actual_range or not produces the same output.

As for pygmt, grdimage creates a virtual file using

file_context = lib.virtualfile_from_grid(grid)

And here is the code for this function

        matrix, region, inc = dataarray_to_matrix(grid)
        family = "GMT_IS_GRID|GMT_VIA_MATRIX"
        geometry = "GMT_IS_SURFACE"
        gmt_grid = self.create_data(
            family, geometry, mode="GMT_CONTAINER_ONLY", ranges=region, inc=inc
        )
        self.put_matrix(gmt_grid, matrix)
        args = (family, geometry, "GMT_IN|GMT_IS_REFERENCE", gmt_grid)
        with self.open_virtual_file(*args) as vfile:
            yield vfile

If you look at create_data(), you will see that there is an optional keyword registration. However, this keyword is NOT used in the routine. There is a variable registration_int, but this appears to be hardcoded to be GMT_GRID_NODE_REG

So, to the best of my knowledge, pygmt is assuming that all xarrays are gridline registered. If this is the case, this is a bug.

There are probably one of three ways to resolve this problem:

  • pygmt: pygmt should first explicitly allow the user to define if the data are gridline or pixel registered. If they are gridline registered, it should then check if the redundant boundary condition is provided, and if not to add an extra row to the internal data. I am not sure where the code should be modified for this, but perhaps dataarray_to_matrix would be the best.
  • pygmt: The virtualfile_from_grid should return a virtual netcdf file (which gmt can parse just fine). My assumption is that the virtual file is just an arbitrary binary datafile that gmt is having a hard time understanding how to decode.
  1. gmt: gmt's grdimage appears to have a bug when parsing pygmt's virtual_file input. If this is the case, we should open an issue there. I don't know enough about how gmt reads such generic files to debug this myself. Perhaps it is possible to specify in the vitrual file that the redundant longitude is missing.

Edit: by the way, I think that pygmt routines should have a keyword verbose=True which corresponds to -V

@MarkWieczorek
Copy link
Contributor Author

Could you let me know how gmt reads the pygmt virtual files? I looked through the gmt code a little, and I am going to guess that the file format is id=bf: GMT native, C-binary format (32-bit float). If I had to guess, this file format probably gets parsed by the function gmtio_bin_input that is in the file gmt_io.c. After confirmation, I will open an issue at gmt to see if this is something that they should fix.

Also, it appears that this doesn't affect all map projections. Here the same image as above, but in a mollweide projection:
moll

@weiji14
Copy link
Member

weiji14 commented Nov 14, 2019

Ah, the ol' gridline vs pixel based registration problem. I think you hit the nail on the head there @MarkWieczorek. I'm pretty sure xarray uses centre-based coordinates (i.e. pixel-based registration) and if pygmt defaults to gridline registration, then this is definitely a serious issue. I've hit this time and time again before (e.g. at weiji14/deepbedmap#150) and it just keeps coming back to haunt me...

Not sure if setting the default registration to pixel-based would solve the issue (since we're using xarray, it might be necessary?), but I'll try and look into it tomorrow if I have some mental bandwidth to do so (running a "FOSS4G Community Day" for PyGMT) Edit: Putting the Windows installation issue on a higher priority. If you've found a fix though, feel free to submit a Pull Request and I can review it.

@MarkWieczorek
Copy link
Contributor Author

I don't think that this is a pixel related issue. pygmt uses the routine dataarray_to_matrix to extract the data maxtrix from an xarray DataArray and then returns this unmodified as a numpy array. As far as I can tell, pygmt assumes that all DataArrays are gridline registered, and the example I provide above is for gridline registered data.

As you say, xarray used center-based coordinates, but this corresponds to gridline registration in gmt. (The coordinates are at the grid line intersections that occur at the center of the pixel, not the edges).

It is possible that gmt is misinterpreting the virtual file as pixel registered, and if this is the case, this is a bug with gmt. It is conceivable that gmt uses the ration of nlat and nlon to guess the registration, and since my grid doesn't include the redundant longitude at 360E, I could perhaps see this being a problem if gmt has to guess the registration.

@weiji14
Copy link
Member

weiji14 commented Jun 23, 2020

Hi @MarkWieczorek, sorry for not following up on this soon enough. I think we've found the issue, and it's probably because the virtualfile mechanism assumes all xarray grids as Cartesian instead of Geographic (see #476 (comment)).

We're working on allowing users to tell PyGMT whether their grid is Cartesian/Geographic, and Pixel/Gridline registered. This might take some time to implement as there's a lot of issues to iron out, and the team will be busy with GMT 6.1.0. But you can provide some feedback on #487 which decides what the recommended default will be going forward.

weiji14 added a commit that referenced this issue Jun 25, 2020
Keep the default as Cartesian, but allow for either Cartesian (c) or Geographic (g) coordinate system grid types. Test case based on #375 (comment).
@seisman seisman added this to the 0.2.x milestone Jul 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants