common Lagrangian data structure #78
Comments
This is a great idea. I would start by reviewing the CF conventions on Trajectory Data. Probably just having data that all conforms to that would be a great start. Tagging @selipot, who has been thinking about this for GDP data.
Thank you for tagging me here. I have been thinking about this, meaning that I wrote and submitted a proposal to the NSF EarthCube program to do just that: define a common Lagrangian data structure for the GDP and others. I am hoping to hear back in the fall. You can see the extent of the metadata available for the GDP at its ERDDAP server.
Shane, do you think the CF trajectory data / metadata conventions are enough? Or is something more needed?
That is what (or close to what) is used right now by the GDP and returned by the OSM ERDDAP server. I am not using these files because I like to have markers for "data gaps", that is, interruption markers for what are otherwise regular-interval time series.
Glad you both brought me this information. I hope @selipot gets the funding so that we can start the Python implementation. I had not noticed the CF conventions, but they do address many of my concerns. Also, I have some experience using both GDP data and tropical cyclone data. I have tried to abstract the Lagrangian data model. I hope that I am on the right path and that all these concerns can be merged to shape the Lagrangian data model. A further thought: one may want to analyze the 3D structure of a mesoscale eddy (or tropical cyclone) in a translating cylindrical coordinate system. I hope the Lagrangian model could simplify this kind of analysis. Specifically, given an eddy's information, I could get the quasi-Lagrangian view of its 3D structure. I am not sure whether this is the right place to discuss this. I hope to see a repo for this, or maybe we could start a session in Pangeo so that I can be updated regularly.
I have been wondering whether we need a common Lagrangian data structure, like xarray for coordinate-labeled n-dimensional datasets, to describe large numbers of Lagrangian particles. Such data generally consist of a time series of positions and associated data along the Lagrangian tracks. Examples are the simulated Lagrangian trajectories here, the GDP drifter dataset, the Argo float dataset, as well as the quasi-Lagrangian tropical cyclone best-track dataset and mesoscale eddy datasets.
As far as I know, pandas.DataFrame is used to represent such data, with at least three columns: time, x_pos and y_pos. This is indeed efficient and clear. However, sometimes we need to attach extra information to the DataFrame, such as ID, name, type, status, etc. So I think we can design a common Lagrangian data structure in which all these (quasi-)Lagrangian data and associated datasets can be described, accessed, stored, and manipulated efficiently.
A rough sketch is to define a class Particle, with ID, name, and records as its fields. Its records field is a pandas.DataFrame that stores the Lagrangian data. By overloading some of the operators of Particle, we can make Particle as simple to use as a pandas.DataFrame. Through extends (subclassing), we can further define Drifter, Float, and TropicalCyclone subclasses that are more appropriate for each case.
Do you have any comments on this?
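To make the proposal concrete, here is a minimal sketch of what such a Particle class might look like in Python. Only the names Particle, Drifter, TropicalCyclone, records, time, x_pos and y_pos come from the description above; everything else (the pid argument, the drogued flag, the wind column, the peak_intensity method, the sample values) is a hypothetical choice made for illustration, not an agreed design.

```python
import pandas as pd


class Particle:
    """One Lagrangian particle: metadata plus a DataFrame of track records."""

    REQUIRED = ("time", "x_pos", "y_pos")

    def __init__(self, pid, name, records):
        missing = [c for c in self.REQUIRED if c not in records.columns]
        if missing:
            raise ValueError(f"records is missing columns: {missing}")
        self.pid = pid    # hypothetical attribute name for the ID field
        self.name = name
        self.records = records.reset_index(drop=True)

    def __len__(self):
        # Number of records along the track.
        return len(self.records)

    def __getitem__(self, key):
        # Delegate indexing, so particle["x_pos"] returns a pandas Series.
        return self.records[key]

    def __getattr__(self, attr):
        # Called only when normal lookup fails: forward to the DataFrame.
        # The guard avoids infinite recursion before records is set.
        if attr == "records" or attr.startswith("_"):
            raise AttributeError(attr)
        return getattr(self.records, attr)


class Drifter(Particle):
    def __init__(self, pid, name, records, drogued=True):
        super().__init__(pid, name, records)
        self.drogued = drogued  # hypothetical drifter-specific metadata


class TropicalCyclone(Particle):
    def peak_intensity(self, column="wind"):
        # Hypothetical helper; assumes a wind column exists in records.
        return self.records[column].max()


# Made-up sample track, for illustration only.
track = pd.DataFrame({
    "time": pd.to_datetime(["2020-01-01 00:00", "2020-01-01 06:00",
                            "2020-01-01 12:00"]),
    "x_pos": [120.0, 120.5, 121.0],
    "y_pos": [15.0, 15.2, 15.5],
    "wind": [30.0, 45.0, 40.0],
})
tc = TropicalCyclone(pid=1, name="Example", records=track)
print(len(tc), tc["x_pos"].mean(), tc.peak_intensity())
```

Composition (holding a DataFrame in records) is used here instead of subclassing pandas.DataFrame, since it keeps the metadata separate from the tabular data; whether that is the better trade-off for this project is an open question.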