fix is_time to avoid memory overload #397
Open
cehbrecht wants to merge 19 commits into master from fix-is-time
Changes from 7 commits
Commits (19):
fdfcb66 fix is_time to avoid memory overload (cehbrecht)
ca6704c [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
32936e3 cleanup (cehbrecht)
bef8ae4 Merge branch 'fix-is-time' of github.com:roocs/clisops into fix-is-time (cehbrecht)
4e685f7 update is_time5 (cehbrecht)
d2f7d9a [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
c5727bd update is_time and get_coord_by_type (cehbrecht)
301b6d0 fix get_main_variable issue (cehbrecht)
1fd993c update fix get_main_variable (cehbrecht)
756f79f update fix get_main_variable (cehbrecht)
be6adfa Merge branch 'master' into fix-is-time (Zeitsperre)
9e3ca97 Merge branch 'master' into fix-is-time (Zeitsperre)
7821b9d using latest xarray (cehbrecht)
f968e08 xfail regrid tests ... not working with latest xarray (cehbrecht)
609ec3a added "test-x" target: run tests in parallel (cehbrecht)
6d4f682 Merge branch 'master' into fix-is-time (Zeitsperre)
c229898 fix get_main_variable (cehbrecht)
a178b14 unblock connection (Zeitsperre)
3e11832 Merge branch 'master' into fix-is-time (cehbrecht)
@@ -5,6 +5,7 @@

 import cf_xarray as cfxr  # noqa
 import cftime
+import dask.array as da
 import fsspec
 import numpy as np
 import xarray as xr
@@ -61,6 +62,13 @@ def get_coord_by_type(
     elif isinstance(ds, xr.Dataset):
         # Not all coordinate variables are always classified as such
         coord_vars = list(ds.coords) + list(ds.data_vars)
+        # make sure we skip the main variable!
+        try:
+            var = get_main_variable(ds)
+        except ValueError:
+            warnings.warn(f"No variable found for dataset '{ds}'.")
+        else:
+            coord_vars.remove(var)
     else:
         raise TypeError("Not an xarray.Dataset or xarray.DataArray.")
     for coord_id in coord_vars:
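The hunk above drops the main variable from the candidate list with a `try`/`except`/`else` block. A minimal plain-Python sketch of that pattern follows. It is not the library's code: `candidate_names` and `pick_main_variable` are hypothetical names (the latter stands in for `get_main_variable`), and plain dicts mapping variable names to dimension tuples stand in for `ds.coords` and `ds.data_vars`. Unlike the diff, the sketch checks membership before calling `remove`, on the assumption that a missing name should not abort the lookup; `list.remove` raises `ValueError` for an absent item, and the `else` branch is not covered by the `except` clause.

```python
import warnings


def pick_main_variable(data_vars):
    """Hypothetical stand-in for get_main_variable: the variable with most dims."""
    if not data_vars:
        raise ValueError("no data variables")
    return max(data_vars, key=lambda name: len(data_vars[name]))


def candidate_names(coords, data_vars):
    """Collect candidate coordinate names, leaving out the main variable."""
    names = list(coords) + list(data_vars)
    try:
        main_var = pick_main_variable(data_vars)
    except ValueError:
        # No main variable could be determined: keep every candidate.
        warnings.warn("No main variable found; keeping all candidates.")
    else:
        # Runs only when the try block raised nothing.
        if main_var in names:
            names.remove(main_var)
    return names


coords = {"time": ("time",), "lat": ("lat",)}
data_vars = {"tas": ("time", "lat", "lon"), "lat_bnds": ("lat", "bnds")}
print(candidate_names(coords, data_vars))  # ['time', 'lat', 'lat_bnds']
```

Keeping the `remove` call in the `else` branch means a `ValueError` from the removal itself is never mistaken for "no main variable found".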
@@ -95,11 +103,12 @@ def get_coord_by_type(

     # Select coordinate with most dims (matching with main variable dims)
     for coord_id in coords:
-        if all([dim in main_var_dims for dim in ds.coords[coord_id].dims]):
-            if return_further_matches:
-                return coord_id, [x for x in coords if x != coord_id]
-            else:
-                return coord_id
+        if coord_id in ds.coords:
+            if all([dim in main_var_dims for dim in ds.coords[coord_id].dims]):
+                if return_further_matches:
+                    return coord_id, [x for x in coords if x != coord_id]
+                else:
+                    return coord_id
     # If the decision making fails, pass the first match
     if return_further_matches:
         return coords[0], coords[1:]
@@ -207,13 +216,38 @@ def is_level(coord):
     return False


+def _is_time(coord):
+    """
+    Check if a coordinate uses cftime datetime objects.
+    Handles Dask-backed arrays for lazy evaluation.
[Review comment on lines +225 to +226: suggested change]
+    """
+    if coord.size == 0:
+        return False  # Empty array
+
+    if isinstance(coord.dtype.type(), cftime.datetime):
+        return True
+
+    # Safely get first element without loading entire array
+    first_value = coord.isel({dim: 0 for dim in coord.dims}).values
+
+    # Compute only if it's a Dask array
+    if isinstance(first_value, da.Array):
+        first_value = first_value.compute()
+
+    return isinstance(first_value.item(0), cftime.datetime)
+
+
 def is_time(coord):
     """
     Determines if a coordinate is time.

     :param coord: coordinate of xarray dataset e.g. coord = ds.coords[coord_id]
     :return: (bool) True if the coordinate is time.
     """
+    if False and coord.ndim >= 2:
+        # skip variables with more than two dimensions: lat_bnds, lon_bnds, time_bnds, t, ...
+        return False
+
     if "time" in coord.cf.coordinates and coord.name in coord.cf.coordinates["time"]:
         return True

@@ -226,14 +260,11 @@ def is_time(coord):
     if np.issubdtype(coord.dtype, np.datetime64):
         return True

-    if isinstance(np.atleast_1d(coord.values)[0], cftime.datetime):
-        return True
-
     if hasattr(coord, "axis"):
         if coord.axis == "T":
             return True

-    return False
+    return _is_time(coord)


 def is_realization(coord):
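The idea behind this change is to inspect a single element instead of materialising the whole coordinate: the removed check read `coord.values` in full before taking element 0, while `_is_time` selects index 0 along every dimension first. The sketch below shows that principle in plain Python only. `LazyColumn` and `first_is_datetime` are hypothetical names, the standard-library `datetime` stands in for `cftime.datetime`, and the read counter is just a way to make the cost visible; none of this is xarray or dask API.

```python
from datetime import datetime, timedelta


class LazyColumn:
    """Hypothetical stand-in for a lazily loaded 1-D coordinate."""

    def __init__(self, loader, size):
        self._loader = loader  # called once per element that is actually read
        self.size = size
        self.reads = 0  # number of elements materialised so far

    def __getitem__(self, index):
        self.reads += 1
        return self._loader(index)

    def load_all(self):
        """Materialise every element (the expensive path)."""
        return [self[i] for i in range(self.size)]


def first_is_datetime(column):
    """Return True if the first element is a datetime, reading only that element."""
    if column.size == 0:
        return False  # nothing to inspect
    return isinstance(column[0], datetime)


start = datetime(2000, 1, 1)
times = LazyColumn(lambda i: start + timedelta(days=i), size=1_000_000)
print(first_is_datetime(times), times.reads)  # True 1
```

Only one element is read regardless of the column length, which is the property the pull request title ("avoid memory overload") is after.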
make sure coord_id is in ds.coords (lat_bnds is not)
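To illustrate the comment: candidates are collected from both `ds.coords` and `ds.data_vars`, so a name such as `lat_bnds` can appear in the candidate list without being a key of `ds.coords`, and indexing it there fails. The following is a plain-Python sketch under stated assumptions: a dict of dimension tuples stands in for `ds.coords`, `pick_coord` is a hypothetical helper, and the candidate list is assumed to be non-empty.

```python
def pick_coord(candidates, coords, main_var_dims):
    """Return the first candidate that is a real coordinate whose dims all
    belong to the main variable; otherwise fall back to the first candidate."""
    for name in candidates:
        if name not in coords:
            continue  # e.g. lat_bnds: a data variable, not a coordinate
        if all(dim in main_var_dims for dim in coords[name]):
            return name
    return candidates[0]  # assumes at least one candidate


coords = {"lat": ("lat",), "lon": ("lon",)}
candidates = ["lat_bnds", "lat"]
print(pick_coord(candidates, coords, ("time", "lat", "lon")))  # lat
```

Without the membership test, the lookup `coords["lat_bnds"]` would raise `KeyError` before `lat` is ever considered.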