Description
I have been wondering whether we need a common Lagrangian data structure for describing large numbers of Lagrangian particles, analogous to what xarray provides for labeled n-dimensional datasets. Such data generally consist of a time series of positions, plus associated variables, along each Lagrangian track. Examples include the simulated Lagrangian trajectories here, the GDP drifter dataset, and the Argo float dataset, as well as quasi-Lagrangian datasets such as tropical cyclone best tracks and mesoscale eddy tracks.
As far as I know, pandas.DataFrame is commonly used to hold such data, with at least three columns: time, x_pos, and y_pos. This is efficient and clear. However, we sometimes need to attach extra information to the DataFrame, such as an ID, name, type, or status. So I think we could design a common Lagrangian data structure with which all of these (quasi-)Lagrangian data and their associated metadata can be described, accessed, stored, and manipulated efficiently.
A rough sketch would be to define a Particle class with ID, name, and records as its fields, where records is a pandas.DataFrame that stores the Lagrangian data. By overloading some of Particle's operators, we can make a Particle as simple to use as a pandas.DataFrame. Through inheritance, we can then define Drifter, Float, and TropicalCyclone subclasses that are better suited to each case.
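To make the idea concrete, here is a minimal sketch of what I have in mind. It is only an illustration, not a proposed final API: the `basin` field, the sample values, and the choice of which operators to forward (`len()` and `[]`) are my own assumptions.

```python
from dataclasses import dataclass, field

import pandas as pd


# eq=False: the auto-generated __eq__ would compare DataFrames directly,
# which raises an ambiguous-truth-value error.
@dataclass(eq=False)
class Particle:
    """Metadata (ID, name) plus a DataFrame of records along the track."""
    ID: str
    name: str
    records: pd.DataFrame = field(repr=False)

    def __len__(self):
        # Number of records (time steps) along the track.
        return len(self.records)

    def __getitem__(self, key):
        # Forward column and boolean-mask selection to the DataFrame.
        return self.records[key]


@dataclass(eq=False)
class TropicalCyclone(Particle):
    """Hypothetical subclass carrying one extra, case-specific field."""
    basin: str = "unknown"


# Made-up sample track with the three minimal columns.
records = pd.DataFrame({
    "time": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
    "x_pos": [120.0, 121.5, 123.0],
    "y_pos": [15.0, 16.0, 17.5],
})

tc = TropicalCyclone(ID="TC001", name="Example", records=records, basin="WNP")
north = tc[tc["y_pos"] > 15.5]  # plain DataFrame of the matching records
```

Here `len(tc)` and `tc["x_pos"]` behave as they would on the DataFrame itself, while `tc.ID`, `tc.name`, and `tc.basin` carry the extra information. Forwarding more of the DataFrame interface (for example via `__getattr__`) would be possible but needs more care.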
Do you have any comments on this?