
Solve Memory Issues of QGField #50

Open · csyhuang opened this issue on Aug 21, 2022 · 1 comment
Labels: contributions welcome, enhancement (Proposed additional functionality to the package)

Comments

@csyhuang (Owner)

A new QGField object is created at every timestamp. Figure out a more memory-efficient way to handle time-series data.

See pull request #45 from Chris for details.

csyhuang added the enhancement and contributions welcome labels on Aug 21, 2022
csyhuang mentioned this issue on Aug 29, 2022
@chpolste (Collaborator)

Maybe to clarify what this issue is about:

Currently, when creating a QGDataset, all the QGField objects managed by the Dataset are immediately instantiated and kept for later use. This generally means there are (at least) two copies of any input/output data in memory when using the xarray interface: one in the QGField instances and one in the xarray datasets.

I would like to keep the data in xarray datasets/dataarrays as much (and as exclusively) as possible, and only temporarily make copies for the QGFields when computing. More data could then be handled with the same amount of memory, I think. An issue here is that the three processing steps (interpolation, reference state, LWA/fluxes) depend on each other and use the QGField to store the intermediate results, so in most applications the objects must be kept around at least for some time.
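To illustrate the idea, here is a minimal sketch of per-timestep lazy instantiation. `MockField` and `iter_fields` are hypothetical stand-ins (not the package's API); the point is that only the field for the timestep currently being processed holds a copy of the data, instead of every QGField being materialized up front:

```python
# Hypothetical sketch: build one field per timestep on demand instead of
# keeping a field object for every timestep alive at once.

class MockField:
    """Stand-in for QGField: copies its input data on construction."""
    def __init__(self, snapshot):
        self.data = list(snapshot)  # the temporary per-timestep copy

    def interpolate(self):
        # Placeholder for the real interpolation step.
        return [x * 2.0 for x in self.data]


def iter_fields(timeseries, field_cls):
    """Yield a freshly built field for each timestep, one at a time."""
    for snapshot in timeseries:
        yield field_cls(snapshot)


# Usage: at most one MockField copy of the input exists at any moment;
# each field can be garbage-collected once its result is extracted.
timeseries = [[1.0, 2.0], [3.0, 4.0]]
results = [field.interpolate() for field in iter_fields(timeseries, MockField)]
```

The trade-off noted above still applies: because the three processing steps share intermediate results stored on the field, each field must live until its last step has run.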

Integrating further with dask would allow the computations to be distributed across multiple processors, enabling parallelism and allowing even larger datasets to be processed. For this to work, the computation steps must be converted into a dask graph. I think this could be done with dask.delayed, or maybe even better with xarray.apply_ufunc. I'm not sure, though, how well the OO interface can be wrapped with these; I think they are geared more towards a functional interface.
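A minimal dask.delayed sketch of the three dependent steps might look like this. The function bodies are placeholders, not the package's actual computations; the real versions would wrap the corresponding QGField methods behind a functional interface:

```python
import dask

# Placeholder implementations of the three dependent processing steps.
@dask.delayed
def interpolate(snapshot):
    return [float(x) for x in snapshot]

@dask.delayed
def reference_state(interpolated):
    return sum(interpolated) / len(interpolated)

@dask.delayed
def lwa_fluxes(interpolated, ref):
    return [x - ref for x in interpolated]

# Build the task graph for each timestep; nothing runs until compute().
# The dependency structure (interpolation -> reference state -> fluxes)
# is expressed by passing delayed results into later steps.
snapshots = [[1.0, 3.0], [2.0, 6.0]]
tasks = []
for snap in snapshots:
    interp = interpolate(snap)
    ref = reference_state(interp)
    tasks.append(lwa_fluxes(interp, ref))

# dask can now schedule independent timesteps in parallel across workers.
results = dask.compute(*tasks)
```

With the graph expressed this way, intermediate results live only as long as the scheduler needs them, rather than being pinned to long-lived QGField objects.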


2 participants