
Conversation

@erikvansebille
Member

@erikvansebille erikvansebille commented Sep 10, 2025

This PR converts the Argo tutorial from v3 to v4.

It still has a few (breaking) issues that will need to be resolved:

  • [BREAKING] The output zarr file can't be parsed by cftime. We get the error `ValueError: Failed to decode variable 'time': unable to decode time units 'seconds since 2002-01-01T00:00:00' with "calendar 'standard'". Try opening your dataset with decode_times=False or installing cftime if it is not installed`, which is strange because `seconds since 2002-01-01T00:00:00` seems a perfectly fine time unit?
  • [BREAKING] The logic in the `ArgoVerticalMovement()` kernel doesn't work. Particle properties, including `cycle_phase`, are not updated as expected. This likely has to do with the current implementation of `KernelParticle`, which is rather ad hoc
  • After loading the GlobCurrent_example_data as a DataSet, we have to recalculate `ds["time"]`; otherwise we get `ValueError: Expected right to be a np.timedelta64, datetime, cftime.datetime, or np.datetime64. Got <class 'numpy.float64'>. Error getting time interval for field 'U'. Are you sure that the time dimension on the xarray dataset is stored as timedelta, datetime or cftime datetime objects?`
  • Performance can be improved. Running the notebook without an explicit `ds.load()` is extremely slow (> 15 minutes), compared to the ~70 seconds when the full DataSet is loaded into memory.
  • The particle `drift_age` and other calendar-based properties can't yet be converted to `np.timedelta64`, because that gives an error during output-file writing (`KeyError: dtype('<m8[s]')` in `attrs["time"].update(_get_calendar_and_units(time_interval))`)
  • Wrap some of the lines for creating the FieldSet from the DataSet into a method (`FieldSet.from_GlobCurrent` or something?)
  • CI has to be turned on for this notebook so that it gets tested/run
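The time recalculation mentioned in the third bullet can be sketched with plain NumPy (the reference date is taken from the error message above; the array contents and variable names are illustrative, not from the notebook):

```python
import numpy as np

# Raw "time" values as they come off disk: float64 seconds since the
# reference date (values here are made up for illustration).
raw_seconds = np.array([0.0, 3600.0, 7200.0])

# Rebuild a proper datetime64 coordinate so the time-interval logic
# no longer sees bare floats.
reference = np.datetime64("2002-01-01T00:00:00", "s")
time = reference + raw_seconds.astype("timedelta64[s]")

print(time)
```

With `ds["time"]` recomputed this way, the `ValueError` about `numpy.float64` no longer applies, since the coordinate is now `datetime64`.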

UPDATE: I now swapped to Copernicus Marine data; which solved a few of the major issues above

Based on changes needed to the Argo tutorial
And also removing the period at the end, which created parsing issues in Visual Studio Code (which thought it was part of the path)
Note that there are multiple breaking problems in this file
@VeckoTheGecko
Contributor

> [ ] [BREAKING] The output zarr file can't be parsed by cftime. Getting an error `ValueError: Failed to decode variable 'time': unable to decode time units 'seconds since 2002-01-01T00:00:00' with "calendar 'standard'". Try opening your dataset with decode_times=False or installing cftime if it is not installed` which is strange because `seconds since 2002-01-01T00:00:00` seems a perfectly fine time unit?

xref #2180

@erikvansebille
Member Author

OK, so I found out what is going wrong with the updating of `cycle_phase` (second bullet point above). The issue is that mixing boolean masks within a Kernel doesn't work. But if we separate each boolean mask (i.e. each phase) into its own kernel, it does work.

I also created a minimal breaking example of this in the test_particleset_execute.py file:
https://github.com/OceanParcels/Parcels/blob/9a6abf553e8f2c1f4945015d489d9be470910010/tests/v4/test_particleset_execute.py#L367-L407

In this case, the test cases running `Lat1`, `Lat2` and `[Lat1, Lat2]` work, and only `Lat1and2` fails. That is because it is the only one with two boolean masks in the Kernel body. This might help us refactor/fix up the `KernelParticle` (see also #2143)?
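The one-mask-per-kernel workaround can be illustrated with a stand-alone sketch (plain NumPy stand-ins rather than the Parcels API; the kernel names and the `cycle_phase`/`depth` updates are made up for illustration):

```python
import numpy as np

# Stand-in for a particle set: plain arrays instead of a KernelParticle.
particles = {"cycle_phase": np.array([0, 1, 0, 1]), "depth": np.zeros(4)}

# Each "kernel" applies exactly one boolean mask, mirroring the workaround
# of splitting each Argo phase into its own kernel.
def phase_drift(p):
    mask = p["cycle_phase"] == 0   # single mask per kernel works
    p["depth"][mask] -= 1.0

def phase_surface(p):
    mask = p["cycle_phase"] == 1
    p["depth"][mask] += 1.0

# Running the kernels in sequence replaces one kernel with two masks.
for kernel in [phase_drift, phase_surface]:
    kernel(particles)

print(particles["depth"])
```

The point is structural: each function touches the full particle arrays through a single mask, so no mask is ever built from an already-filtered view.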

@VeckoTheGecko
Contributor

VeckoTheGecko commented Sep 12, 2025

A more minimal example would be the following:

diff --git a/tests/v4/test_particleset_execute.py b/tests/v4/test_particleset_execute.py
index 871a1309..fe945436 100644
--- a/tests/v4/test_particleset_execute.py
+++ b/tests/v4/test_particleset_execute.py
@@ -407,6 +407,18 @@ def test_execution_update_particle_in_kernel_function(fieldset, kernel_names, ex
     np.testing.assert_allclose(pset.lat, expected, rtol=1e-5)
 
 
+def test_execution_changing_particle_mask_bug(fieldset):
+    npart = 2
+
+    def kernel(particles, fieldset):
+        for i in range(1, 4):
+            print(f"before iteration {i}: {particles.lon=}, {particles._index=}")
+            particles[particles.lon < 0.5]
+
+    pset = ParticleSet(fieldset, lon=np.linspace(0, 1, npart), lat=np.zeros(npart))
+    pset.execute([kernel], runtime=np.timedelta64(2, "s"), dt=np.timedelta64(1, "s"))
+
+
 def test_uxstommelgyre_pset_execute():
     ds = datasets_unstructured["stommel_gyre_delaunay"]
     grid = UxGrid(grid=ds.uxgrid, z=ds.coords["nz"], mesh="spherical")
Output

(parcels-v4-dev) [hodgs004@node09 parcels]$ pytest 'tests/v4/test_particleset_execute.py::test_execution_changing_particle_mask_bug' -s
=================================================================================== test session starts ====================================================================================
platform linux -- Python 3.11.13, pytest-8.4.2, pluggy-1.6.0
rootdir: /storage/home/hodgs004/coding/repos/parcels
configfile: pyproject.toml
plugins: anyio-4.10.0, hypothesis-6.138.15, metadata-3.1.1, html-4.1.1, nbval-0.11.0
collected 1 item                                                                                                                                                                           

  0%|                                                                                                                                                               | 0/2.0 [00:00<?, ?it/s]
before iteration 1: particles.lon=array([0., 1.], dtype=float32), particles._index=array([ True,  True])
before iteration 2: particles.lon=array([0.], dtype=float32), particles._index=array([ True, False])
F

========================================================================================= FAILURES =========================================================================================
________________________________________________________________________ test_execution_changing_particle_mask_bug _________________________________________________________________________

fieldset = <parcels.fieldset.FieldSet object at 0x7f8834154a90>

    def test_execution_changing_particle_mask_bug(fieldset):
        npart = 2
    
        def kernel(particles, fieldset):
            for i in range(4):
                print(f"before iteration {i}: {particles.lon=}, {particles._index=}")
                particles[particles.lon < 0.5]
    
        pset = ParticleSet(fieldset, lon=np.linspace(0, 1, npart), lat=np.zeros(npart))
>       pset.execute([kernel], runtime=np.timedelta64(2, "s"), dt=np.timedelta64(1, "s"))

tests/v4/test_particleset_execute.py:423: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
parcels/particleset.py:556: in execute
    self._kernel.execute(self, endtime=next_time, dt=dt)
parcels/kernel.py:261: in execute
    f(pset[evaluate_particles], self._fieldset)
tests/v4/test_particleset_execute.py:419: in kernel
    print(f"before iteration {i}: {particles.lon=}, {particles._index=}")
                                   ^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <parcels.particle.KernelParticle object at 0x7f8858605050>, name = 'lon'

    def __getattr__(self, name):
>       return self._data[name][self._index]
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E       IndexError: boolean index did not match indexed array along axis 0; size of axis is 2 but size of corresponding boolean axis is 1

parcels/particle.py:133: IndexError
================================================================================= short test summary info ==================================================================================
FAILED tests/v4/test_particleset_execute.py::test_execution_changing_particle_mask_bug - IndexError: boolean index did not match indexed array along axis 0; size of axis is 2 but size of corresponding boolean axis is 1
==================================================================================== 1 failed in 6.49s =====================================================================================

It's to do with `particles[particles.lon < 0.5]`: the return value of `particles.lon` is based on `_index`, which is set in `[]`, creating this sort of circular dependency. I don't know whether this sort of subsetting of particle collections and passing them to functions is something that we can support (since we conceptually need to "reset" back to the particle collection that is being run, not propagate that state to future calls), and I don't see a clear way that we can do that resetting in this API.
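The circular dependency can be reproduced outside Parcels with a minimal stand-in (this class only mimics the `__getattr__` pattern visible in the traceback; it is not the real `KernelParticle`):

```python
import numpy as np

class ViewParticles:
    """Toy stand-in: attribute access filters full arrays through _index."""

    def __init__(self, data):
        self._data = data  # full-length arrays, shared across subsets
        self._index = np.ones(len(data["lon"]), dtype=bool)

    @property
    def lon(self):
        return self._data["lon"][self._index]

    def __getitem__(self, mask):
        self._index = mask  # mutates the state that .lon depends on
        return self

p = ViewParticles({"lon": np.array([0.0, 1.0])})
p[p.lon < 0.5]  # mask has length 2: fine, _index becomes [True, False]
p[p.lon < 0.5]  # mask built from the filtered view, so it has length 1

raised = False
try:
    p.lon       # length-1 boolean mask against a length-2 array
except IndexError:
    raised = True
print("IndexError raised:", raised)
```

Because `__getitem__` overwrites `_index` in place, the second subset builds its mask from an already-filtered `lon`, and the next attribute access indexes the full-length array with a too-short boolean mask, matching the `IndexError` in the test output.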

@erikvansebille
Member Author

The CI for this PR fails because of the unit test introduced in dcdd2d1.

This is intentional though (until we fix #2143), so perhaps we should move that dcdd2d1 commit to another branch and merge this one?

@VeckoTheGecko, can you help out if you agree?

@VeckoTheGecko
Contributor

Yes, agreed that we should just deal with this in another PR. Can we just do the following rather than moving the test?

@pytest.mark.xfail(reason="Will be fixed alongside https://github.com/OceanParcels/Parcels/issues/2143 . Failing due to https://github.com/OceanParcels/Parcels/pull/2199#issuecomment-3285278876 .")

Since we now have fieldset.from_copernicusmarine(), the loading is much easier.

Advantage is that we can then also show temperature sampling!
@erikvansebille erikvansebille marked this pull request as ready for review September 25, 2025 08:42
Contributor

@VeckoTheGecko VeckoTheGecko left a comment


Looks good. Note that we are yet to re-enable notebooks in CI - let's do that in a future PR (it becomes much easier to do in CI and locally once #2205 is fixed)

@erikvansebille erikvansebille merged commit a4cb13d into v4-dev Sep 25, 2025
9 checks passed
@erikvansebille erikvansebille deleted the v4-argo-float-tutorial branch September 25, 2025 13:27
@github-project-automation github-project-automation bot moved this from Backlog to Done in Parcels development Sep 25, 2025


Development

Successfully merging this pull request may close these issues.

Create a vectorised version of the Argo floats kernel

3 participants