
Conversation

@VeckoTheGecko (Contributor) commented Mar 3, 2025

Deferred loading will be handled natively by xarray
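
As a rough illustration of the lazy loading that xarray provides out of the box (a sketch, not code from this PR; the file pattern is a hypothetical local path to the GlobCurrent example data):

import xarray as xr

# open_mfdataset builds a lazy, dask-backed dataset across multiple NetCDF
# files; no field values are read from disk at this point.
ds = xr.open_mfdataset(
    "GlobCurrent_example_data/2002*.nc",  # hypothetical path/pattern
    combine="by_coords",
)

# Selections stay lazy; only .values (or .compute()) triggers the actual read.
subset = ds["eastward_eulerian_current_velocity"].isel(time=0)
values = subset.values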

Changes/commits:

  • Remove deferred_load docstrings
  • Remove deferred_load param from FieldSet
  • Remove deferred_load param from Field
  • Remove deferred_load from test suite
  • Remove Grid.defer_load
  • Remove unreachable code
  • Read in files using xr.open_mfdataset
  • Fix test data dim order
  • Refactor tests to use different nc files for multifile loading
  • Create multifile_fieldset fixture
  • Remove unused Grid.computeTimeChunk()
  • Remove unused Grid._update_status
  • Remove unused DeferredArray
  • Remove unused Field.computeTimeChunk and Field._data_concatenate

I'm still having difficulty with a couple of failing tests, so this is still a WIP.

FAILED docs/examples/example_globcurrent.py::test_globcurrent_startparticles_between_time_arrays[True--300] - RuntimeError: P sampled outside time domain at time 2002-12-31T00:00:00.000000000. Try setting allow_time_extrapolation to True.. Error could not be handled because particle was not part of the Field Sampling.
FAILED docs/examples/example_globcurrent.py::test_globcurrent_startparticles_between_time_arrays[True-300] - RuntimeError: P sampled outside time domain at time 2002-01-01T00:00:00.000000000. Try setting allow_time_extrapolation to True.. Error could not be handled because particle was not part of the Field Sampling.
FAILED docs/examples/example_globcurrent.py::test_globcurrent_startparticles_between_time_arrays[False--300] - RuntimeError: P sampled outside time domain at time 2002-01-28T00:00:00.000000000. Try setting allow_time_extrapolation to True.. Error could not be handled because particle was not part of the Field Sampling.
FAILED docs/examples/example_globcurrent.py::test_globcurrent_startparticles_between_time_arrays[False-300] - RuntimeError: P sampled outside time domain at time 2002-01-01T00:00:00.000000000. Try setting allow_time_extrapolation to True.. Error could not be handled because particle was not part of the Field Sampling.
FAILED tests/tools/test_warnings.py::test_kernel_warnings - parcels.tools.statuscodes.FieldOutOfBoundError: Field sampled out-of-bound, at (depth=0.0, lat=0.0, lon=0.0)

parcels/field.py Outdated
Comment on lines 580 to 588
# if len(buffer_data.shape) == 2:
# data_list.append(buffer_data.reshape(sum(((len(tslice), 1), buffer_data.shape), ())))
# elif len(buffer_data.shape) == 3:
# if len(filebuffer.indices["depth"]) > 1:
# data_list.append(buffer_data.reshape(sum(((1,), buffer_data.shape), ())))
# else:
# data_list.append(buffer_data.reshape(sum(((len(tslice), 1), buffer_data.shape[1:]), ())))
# else:
# data_list.append(buffer_data)
@VeckoTheGecko (Contributor, Author) commented Mar 3, 2025

Note to self: still need to work through this commented code so there are no unintended side effects.
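
For reference, a rough sketch of my reading of what the commented-out block does (the helper name and arguments are made up for illustration): it pads 2-D and 3-D file-buffer slices with singleton axes so the data always comes out as (time, depth, lat, lon).

import numpy as np

def pad_buffer(buffer_data, tslice, depth_indices):
    # Hypothetical helper mirroring the commented-out reshapes above.
    if buffer_data.ndim == 2:
        # (lat, lon) -> (time, 1, lat, lon); the reshape only works if len(tslice) == 1
        return buffer_data.reshape((len(tslice), 1) + buffer_data.shape)
    elif buffer_data.ndim == 3:
        if len(depth_indices) > 1:
            # (depth, lat, lon) -> (1, depth, lat, lon)
            return buffer_data.reshape((1,) + buffer_data.shape)
        # (time, lat, lon) -> (time, 1, lat, lon)
        return buffer_data.reshape((len(tslice), 1) + buffer_data.shape[1:])
    # already (time, depth, lat, lon)
    return buffer_data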

@VeckoTheGecko (Contributor, Author)

Working off v4-dev, adding the following line to the test_globcurrent_startparticles_between_time_arrays test case causes it to fail with the error message below. The failure occurs on the P field, so I think it's due to adding fields in eager mode not working as intended. Any ideas, @erikvansebille? Should I put together a more minimal example? I'm slightly hesitant to outright disable this test since I'm still not 100% sure what's happening.

@pytest.mark.parametrize("dt", [-300, 300])
@pytest.mark.parametrize("with_starttime", [True, False])
def test_globcurrent_startparticles_between_time_arrays(dt, with_starttime):
    fieldset = set_globcurrent_fieldset()

    data_folder = parcels.download_example_dataset("GlobCurrent_example_data")
    fnamesFeb = sorted(glob(f"{data_folder}/200202*.nc"))
    fieldset.add_field(
        parcels.Field.from_netcdf(
            fnamesFeb,
            ("P", "eastward_eulerian_current_velocity"),
            {"lat": "lat", "lon": "lon", "time": "time"},
+            deferred_load=False,
        )
    )

    MyParticle = parcels.Particle.add_variable("sample_var", initial=0.0)

    def SampleP(particle, fieldset, time):  # pragma: no cover
        particle.sample_var += fieldset.P[
            time, particle.depth, particle.lat, particle.lon
        ]

    if with_starttime:
        time = fieldset.U.grid.time[0] if dt > 0 else fieldset.U.grid.time[-1]
        pset = parcels.ParticleSet(
            fieldset, pclass=MyParticle, lon=[25], lat=[-35], time=time
        )
    else:
        pset = parcels.ParticleSet(fieldset, pclass=MyParticle, lon=[25], lat=[-35])

    if with_starttime:
        with pytest.raises(parcels.TimeExtrapolationError):
            pset.execute(
                pset.Kernel(parcels.AdvectionRK4) + SampleP,
                runtime=timedelta(days=1),
                dt=dt,
            )
    else:
        pset.execute(
            pset.Kernel(parcels.AdvectionRK4) + SampleP,
            runtime=timedelta(days=1),
            dt=dt,
        )

============================================================================ test session starts =============================================================================
platform darwin -- Python 3.10.15, pytest-8.3.3, pluggy-1.5.0
rootdir: /Users/Hodgs004/coding/repos/parcels
configfile: pyproject.toml
testpaths: tests, docs/examples
plugins: html-4.1.1, metadata-3.1.1, nbval-0.11.0, anyio-4.6.2.post1
collected 689 items / 688 deselected / 1 selected                                                                                                                            

docs/examples/example_globcurrent.py F                                                                                                                                 [100%]

================================================================================== FAILURES ==================================================================================
_______________________________________________________ test_globcurrent_startparticles_between_time_arrays[True--300] _______________________________________________________

self = <Field>
    name            : 'P'
    grid            : RectilinearZGrid(lon=array([ 14.88,  15.12,  15.38, ...,  34.3...00:00.000000000, mesh='spherical')
    extrapolate time: False
    gridindexingtype: 'nemo'
    to_write        : False
key = (np.float64(31449600.0), np.float32(0.0), np.float32(-35.0), np.float32(25.0))

    def __getitem__(self, key):
        self._check_velocitysampling()
        try:
            if _isParticle(key):
                return self.eval(key.time, key.depth, key.lat, key.lon, key)
            else:
>               return self.eval(*key)

parcels/field.py:845: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
parcels/field.py:856: in eval
    ti = self._time_index(time)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <Field>
    name            : 'P'
    grid            : RectilinearZGrid(lon=array([ 14.88,  15.12,  15.38, ...,  34.3...00:00.000000000, mesh='spherical')
    extrapolate time: False
    gridindexingtype: 'nemo'
    to_write        : False
time = np.float64(31449600.0)

    def _time_index(self, time):
        """Find the index in the time array associated with a given time.
    
        Note that we normalize to either the first or the last index
        if the sampled value is outside the time value range.
        """
        if not self.allow_time_extrapolation and (time < self.grid.time[0] or time > self.grid.time[-1]):
>           raise TimeExtrapolationError(time, field=self)
E           parcels.tools.statuscodes.TimeExtrapolationError: P sampled outside time domain at time 2002-12-31T00:00:00.000000000. Try setting allow_time_extrapolation to True.

parcels/field.py:817: TimeExtrapolationError

During handling of the above exception, another exception occurred:

dt = -300, with_starttime = True

    @pytest.mark.parametrize("dt", [-300, 300])
    @pytest.mark.parametrize("with_starttime", [True, False])
    def test_globcurrent_startparticles_between_time_arrays(dt, with_starttime):
        fieldset = set_globcurrent_fieldset()
    
        data_folder = parcels.download_example_dataset("GlobCurrent_example_data")
        fnamesFeb = sorted(glob(f"{data_folder}/200202*.nc"))
        fieldset.add_field(
            parcels.Field.from_netcdf(
                fnamesFeb,
                ("P", "eastward_eulerian_current_velocity"),
                {"lat": "lat", "lon": "lon", "time": "time"},
                deferred_load=False,
            )
        )
    
        MyParticle = parcels.Particle.add_variable("sample_var", initial=0.0)
    
        def SampleP(particle, fieldset, time):  # pragma: no cover
            particle.sample_var += fieldset.P[
                time, particle.depth, particle.lat, particle.lon
            ]
    
        if with_starttime:
            time = fieldset.U.grid.time[0] if dt > 0 else fieldset.U.grid.time[-1]
            pset = parcels.ParticleSet(
                fieldset, pclass=MyParticle, lon=[25], lat=[-35], time=time
            )
        else:
            pset = parcels.ParticleSet(fieldset, pclass=MyParticle, lon=[25], lat=[-35])
    
        if with_starttime:
            with pytest.raises(parcels.TimeExtrapolationError):
>               pset.execute(
                    pset.Kernel(parcels.AdvectionRK4) + SampleP,
                    runtime=timedelta(days=1),
                    dt=dt,
                )

docs/examples/example_globcurrent.py:277: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
parcels/particleset.py:1051: in execute
    res = self._kernel.execute(self, endtime=next_time, dt=dt)
parcels/kernel.py:349: in execute
    self.evaluate_particle(p, endtime)
parcels/kernel.py:421: in evaluate_particle
    res = self._pyfunc(p, self._fieldset, p.time_nextloop)
<ast>:2: in SetcoordsAdvectionRK4SamplePUpdatecoords
    ???
parcels/field.py:847: in __getitem__
    return _deal_with_errors(error, key, vector_type=None)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

error = TimeExtrapolationError('P sampled outside time domain at time 2002-12-31T00:00:00.000000000. Try setting allow_time_extrapolation to True.')
key = (np.float64(31449600.0), np.float32(0.0), np.float32(-35.0), np.float32(25.0)), vector_type = None

    def _deal_with_errors(error, key, vector_type: VectorType):
        if _isParticle(key):
            key.state = AllParcelsErrorCodes[type(error)]
        elif _isParticle(key[-1]):
            key[-1].state = AllParcelsErrorCodes[type(error)]
        else:
>           raise RuntimeError(f"{error}. Error could not be handled because particle was not part of the Field Sampling.")
E           RuntimeError: P sampled outside time domain at time 2002-12-31T00:00:00.000000000. Try setting allow_time_extrapolation to True.. Error could not be handled because particle was not part of the Field Sampling.

parcels/field.py:70: RuntimeError
---------------------------------------------------------------------------- Captured stdout call ----------------------------------------------------------------------------
  0%|          | 0/86400.0 [00:00<?, ?it/s]
============================================================================== warnings summary ==============================================================================
docs/examples/example_globcurrent.py::test_globcurrent_startparticles_between_time_arrays[True--300]
  <frozen importlib._bootstrap>:241: RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 16 from C header, got 96 from PyObject

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
========================================================================== short test summary info ===========================================================================
FAILED docs/examples/example_globcurrent.py::test_globcurrent_startparticles_between_time_arrays[True--300] - RuntimeError: P sampled outside time domain at time 2002-12-31T00:00:00.000000000. Try setting allow_time_extrapolation to True.. Error could not be handled because part...
================================================================ 1 failed, 688 deselected, 1 warning in 2.03s ================================================================

@erikvansebille (Member)

This is strange. The test case is expected to throw a TimeExtrapolationError, right? See the with pytest.raises(parcels.TimeExtrapolationError) in the line above. Shouldn't the question be: why is this not captured by pytest? Is it because it now raises a RuntimeError instead of a TimeExtrapolationError?
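
A minimal, self-contained illustration of that point (stand-in classes, not the actual parcels code): pytest.raises only catches the requested exception type, so once _deal_with_errors re-raises the original error as a plain RuntimeError, the with pytest.raises(parcels.TimeExtrapolationError) block no longer catches it.

import pytest

class TimeExtrapolationError(Exception):
    """Stand-in for parcels.TimeExtrapolationError (illustration only)."""

def sample_field():
    try:
        raise TimeExtrapolationError("P sampled outside time domain")
    except TimeExtrapolationError as e:
        # mirrors _deal_with_errors wrapping the original error
        raise RuntimeError(f"{e}. Error could not be handled ...") from e

def test_expects_time_extrapolation_error():
    # Fails: a RuntimeError is raised, which is not a TimeExtrapolationError,
    # so pytest.raises lets it propagate out of the test.
    with pytest.raises(TimeExtrapolationError):
        sample_field()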

@VeckoTheGecko (Contributor, Author)

Is it because it now raises a RuntimeError instead of a TimeExtrapolationError?

This is half the story. pytest -k 'test_globcurrent_startparticles_between_time_arrays[False-300]' is still failing, and that's with with_starttime=False, so it's not expected to raise any errors...

@VeckoTheGecko (Contributor, Author) commented Mar 4, 2025

Going ahead and disabling test_globcurrent_startparticles_between_time_arrays and test_kernel_warnings. The latter was failing because the particle was going in the wrong direction, which is most likely a consequence of _set_scaling_factor not working as expected with the new eager loading. As this functionality is going to be removed anyway, I'm going to mark these tests as failing for now so they can be re-enabled once set_scaling_factor is cleared out.
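
For now, a minimal sketch of marking the tests as expected failures (the decorator arguments are illustrative, not the exact commit):

import pytest

@pytest.mark.xfail(
    reason="_set_scaling_factor does not work with the new eager loading; "
    "re-enable once set_scaling_factor is removed",
    strict=False,
)
def test_kernel_warnings():
    ...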

debugging info

By adding print statements to the VectorField.__getitem__ method we get:

Failing case:
key P[0](lon=0.000000, lat=0.000000, depth=0.000000, time=0.000000)
key (np.float64(0.5), np.float64(-0.4949999749660492), np.float64(0.0), np.float64(0.0), P[0](lon=0.000000, lat=0.000000, depth=0.000000, time=0.000000))
key (np.float64(0.5), np.float64(0.0), np.float64(0.0), np.float64(0.0), P[0](lon=0.000000, lat=0.000000, depth=0.000000, time=0.000000))
key (np.float64(1.0), np.float64(-0.9899999499320984), np.float64(0.0),


Passing case:
key P[0](lon=0.000000, lat=0.000000, depth=0.000000, time=0.000000)
key (np.float64(0.5), np.float64(0.4949999749660492), np.float64(0.0), np.float64(0.0), P[0](lon=0.000000, lat=0.000000, depth=0.000000, time=0.000000))
key (np.float64(0.5), np.float64(0.4949999749660492), np.float64(0.0), np.float64(0.0), P[0](lon=0.000000, lat=0.000000, depth=0.000000, time=0.000000))
key (np.float64(1.0), np.float64(0.9899999499320984), np.float64(0.0), np.float64(0.0), P[0](lon=0.000000, lat=0.000000, depth=0.000000, time=0.000000))

...

I think the 2nd element of the tuple is the depth in question.

@VeckoTheGecko marked this pull request as ready for review March 4, 2025 15:29
@VeckoTheGecko enabled auto-merge March 5, 2025 10:11
@VeckoTheGecko merged commit 8f0565e into v4-dev Mar 5, 2025
16 checks passed
@VeckoTheGecko deleted the deferred_load branch March 5, 2025 10:39
@github-project-automation bot moved this from Backlog to Done in Parcels v4 release Mar 5, 2025
@github-project-automation bot moved this from Backlog to Done in Parcels development Mar 5, 2025