Utilize xarray to reorder dimensions as needed #550

Open
monocongo opened this issue May 30, 2024 · 0 comments

The main processing script expects data variables to have the dimension order (lat, lon, time) and currently raises an error otherwise. It may be possible to use xarray to reorder the dimensions within the script itself, so that users no longer have to do this data wrangling beforehand. We used to do this with NCO, but abandoned that approach because it caused too many problems for Windows users. @jamaa has suggested an elegant solution using xarray's transpose function:

Is it not possible to reorder dimensions as needed using xarray? The following to me looks like it produces what is needed (using the example dataset from above):
import xarray as xr

ds = xr.open_dataset("alina_precipitation_data.nc")
print(ds)
<xarray.Dataset> Size: 134MB
Dimensions:  (time: 360, lat: 241, lon: 193)
Coordinates:
  * time     (time) datetime64[ns] 3kB 1994-01-01 1994-02-01 ... 2023-12-01
  * lat      (lat) float64 2kB 13.98 13.94 13.9 13.85 ... 4.063 4.021 3.979
  * lon      (lon) float64 2kB -5.979 -5.938 -5.896 -5.854 ... 1.938 1.979 2.021
Data variables:
    ppt      (time, lat, lon) float64 134MB ...
# maybe only one of the below lines is even enough?
ds2 = ds.transpose("lat", "lon", "time") # transpose the data variable's dimensions
ds2 = ds2[["lat", "lon", "time", "ppt"]] # select data in desired dimension order
print(ds2)
<xarray.Dataset> Size: 134MB
Dimensions:  (lat: 241, lon: 193, time: 360)
Coordinates:
  * lat      (lat) float64 2kB 13.98 13.94 13.9 13.85 ... 4.063 4.021 3.979
  * lon      (lon) float64 2kB -5.979 -5.938 -5.896 -5.854 ... 1.938 1.979 2.021
  * time     (time) datetime64[ns] 3kB 1994-01-01 1994-02-01 ... 2023-12-01
Data variables:
    ppt      (lat, lon, time) float64 134MB ...

If I'm not overlooking something, this could easily be built into climate_indices to allow for different dimension orders in the input?

Originally posted by @jamaa in #548 (comment)
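
A minimal sketch of how the suggestion above might be wrapped for use in the processing script. The helper name `reorder_dims`, the constant `EXPECTED_DIMS`, and the synthetic dataset are hypothetical and not part of climate_indices; only `Dataset.transpose` comes from xarray. On the open question in the quoted snippet: `transpose` alone changes the data variable's dimension order, whereas the variable selection line does not transpose anything.

```python
import numpy as np
import xarray as xr

# Dimension order the main processing script expects, per the issue text.
EXPECTED_DIMS = ("lat", "lon", "time")


def reorder_dims(ds: xr.Dataset, var_name: str) -> xr.Dataset:
    """Return a dataset whose variable `var_name` is ordered as EXPECTED_DIMS.

    Hypothetical helper for illustration; not an existing climate_indices function.
    """
    actual = ds[var_name].dims
    if set(actual) != set(EXPECTED_DIMS):
        raise ValueError(
            f"{var_name} has dimensions {actual}; "
            f"expected some ordering of {EXPECTED_DIMS}"
        )
    if actual == EXPECTED_DIMS:
        return ds  # already in the expected order, nothing to do
    # transpose returns a new object and leaves the input dataset unchanged;
    # the trailing ellipsis keeps any other dimensions in the dataset after these.
    return ds.transpose(*EXPECTED_DIMS, ...)


# Small synthetic dataset standing in for a real input file (assumption).
ds = xr.Dataset(
    {"ppt": (("time", "lat", "lon"), np.zeros((4, 3, 2)))},
    coords={
        "time": np.arange(4),
        "lat": [13.9, 13.8, 13.7],
        "lon": [-5.9, -5.8],
    },
)

fixed = reorder_dims(ds, "ppt")
print(fixed["ppt"].dims)
```

Raising an error only when the dimension names themselves differ keeps the existing failure mode for inputs that are genuinely unusable, while inputs that merely use a different ordering are reordered rather than rejected.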
