Lazy rectilinear interpolator (#6084)
* lazy interpolation using map_complete_blocks

* pre-commit fixes

* replace test on interpolation with lazy data

* Update lib/iris/analysis/_interpolation.py

Co-authored-by: Martin Yeo <[email protected]>

* Update lib/iris/analysis/_interpolation.py

Co-authored-by: Martin Yeo <[email protected]>

* resume local import

* add entry to latest.rst

* add author name to list

* drop duplicated method

* new signature of map_complete_blocks

* update docstrings on lazy data

* update userguide with lazy interpolator

* the unstructured NN regridder does not support lazy data

* remove caching an interpolator

* update what's new entry

* remove links to docs section about caching interpolators

---------

Co-authored-by: Martin Yeo <[email protected]>
fnattino and trexfeathers authored Jan 27, 2025
1 parent 7cf9c5f commit 38ae1d9
Showing 5 changed files with 139 additions and 131 deletions.
70 changes: 15 additions & 55 deletions docs/src/userguide/interpolation_and_regridding.rst
@@ -29,9 +29,9 @@ The following are the regridding schemes that are currently available in Iris:
* point in cell regridding (:class:`iris.analysis.PointInCell`) and
* area-weighted regridding (:class:`iris.analysis.AreaWeighted`, first-order conservative).

-The linear, nearest-neighbor, and area-weighted regridding schemes support
-lazy regridding, i.e. if the source cube has lazy data, the resulting cube
-will also have lazy data.
+The linear and nearest-neighbour interpolation schemes, and the linear, nearest-neighbour,
+and area-weighted regridding schemes support lazy evaluation, i.e. if the source cube has
+lazy data, the resulting cube will also have lazy data.
See :doc:`real_and_lazy_data` for an introduction to lazy data.
See :doc:`../further_topics/which_regridder_to_use` for a more in depth overview of the different regridders.

@@ -194,46 +194,6 @@ For example, to mask values that lie beyond the range of the original data:
[-- 494.44451904296875 588.888916015625 683.333251953125 777.77783203125
872.2222290039062 966.666748046875 1061.111083984375 1155.555419921875 --]


-.. _caching_an_interpolator:
-
-Caching an Interpolator
-^^^^^^^^^^^^^^^^^^^^^^^
-
-If you need to interpolate a cube on multiple sets of sample points you can
-'cache' an interpolator to be used for each of these interpolations. This can
-shorten the execution time of your code as the most computationally
-intensive part of an interpolation is setting up the interpolator.
-
-To cache an interpolator you must set up an interpolator scheme and call the
-scheme's interpolator method. The interpolator method takes as arguments:
-
-#. a cube to be interpolated, and
-#. an iterable of coordinate names or coordinate instances of the coordinates that are to be interpolated over.
-
-For example:
-
->>> air_temp = iris.load_cube(iris.sample_data_path('air_temp.pp'))
->>> interpolator = iris.analysis.Nearest().interpolator(air_temp, ['latitude', 'longitude'])
-
-When this cached interpolator is called you must pass it an iterable of sample points
-that have the same form as the iterable of coordinates passed to the constructor.
-So, to use the cached interpolator defined above:
-
->>> latitudes = np.linspace(48, 60, 13)
->>> longitudes = np.linspace(-11, 2, 14)
->>> for lat, lon in zip(latitudes, longitudes):
-...     result = interpolator([lat, lon])
-
-In each case ``result`` will be a cube interpolated from the ``air_temp`` cube we
-passed to interpolator.
-
-Note that you must specify the required extrapolation mode when setting up the cached interpolator.
-For example::
-
->>> interpolator = iris.analysis.Nearest(extrapolation_mode='nan').interpolator(cube, coords)
-
-
.. _regridding:

Regridding
@@ -417,24 +377,24 @@ In each case ``result`` will be the input cube regridded to the grid defined by
the target grid cube (in this case ``rotated_psl``) that we used to define the
cached regridder.

-Regridding Lazy Data
-^^^^^^^^^^^^^^^^^^^^
+Interpolating and Regridding Lazy Data
+--------------------------------------

-If you are working with large cubes, especially when you are regridding to a
-high resolution target grid, you may run out of memory when trying to
-regrid a cube. When this happens, make sure the input cube has lazy data
+If you are working with large cubes, you may run out of memory when trying to
+interpolate or regrid a cube. For instance, this might happen when regridding to a
+high-resolution target grid. When this happens, make sure the input cube has lazy data

>>> air_temp = iris.load_cube(iris.sample_data_path('A1B_north_america.nc'))
>>> air_temp
<iris 'Cube' of air_temperature / (K) (time: 240; latitude: 37; longitude: 49)>
>>> air_temp.has_lazy_data()
True

-and the regridding scheme supports lazy data. All regridding schemes described
-here support lazy data. If you still run out of memory even while using lazy
-data, inspect the
-`chunks <https://docs.dask.org/en/latest/array-chunks.html>`__:
+and the interpolation or regridding scheme supports lazy data. All interpolation and
+regridding schemes described here, with the exception of :class:`iris.analysis.PointInCell`
+(point-in-cell regridder) and :class:`iris.analysis.UnstructuredNearest` (nearest-neighbour
+regridder), support lazy data. If you still run out of memory even while using lazy data,
+inspect the `chunks <https://docs.dask.org/en/latest/array-chunks.html>`__:

>>> air_temp.lazy_data().chunks
((240,), (37,), (49,))
@@ -455,6 +415,6 @@ dimension, to regrid it in 8 chunks of 30 timesteps at a time:
Assuming that Dask is configured such that it processes only a few chunks of
the data array at a time, this will further reduce memory use.

-Note that chunking in the horizontal dimensions is not supported by the
-regridding schemes. Chunks in these dimensions will automatically be combined
+Note that chunking in the horizontal dimensions is not supported by the interpolation
+and regridding schemes. Chunks in these dimensions will automatically be combined
before regridding.
7 changes: 7 additions & 0 deletions docs/src/whatsnew/latest.rst
@@ -62,6 +62,13 @@ This document explains the changes made to Iris for this release

#. N/A

+#. `@fnattino`_ enabled lazy cube interpolation using the linear and
+   nearest-neighbour interpolators (:class:`iris.analysis.Linear` and
+   :class:`iris.analysis.Nearest`). Note that this implementation removes
+   the performance benefits linked to caching an interpolator object. While this
+   does not break previously suggested code (instantiating and re-using an
+   interpolator object remains possible), this is no longer an advertised
+   feature. (:pull:`6084`)
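The deferred evaluation that ``map_complete_blocks`` provides (see the commit messages above) can be illustrated with a minimal pure-Python sketch; every name below is an illustrative stand-in, not Iris's actual API, and the real implementation operates on Dask arrays:

```python
class LazyBlocks:
    """Holds per-block loader callables; no data is touched until compute()."""

    def __init__(self, loaders):
        self._loaders = list(loaders)

    def map_blocks(self, func):
        # Compose func with each loader -- nothing is evaluated yet,
        # mirroring how a lazy interpolation is recorded but not run.
        return LazyBlocks([lambda load=load: func(load()) for load in self._loaders])

    def compute(self):
        # Only now are the blocks loaded and transformed, one at a time.
        return [load() for load in self._loaders]


source = LazyBlocks([lambda: [1.0, 2.0], lambda: [3.0, 4.0]])
interpolated = source.map_blocks(lambda block: [v * 10 for v in block])
result = interpolated.compute()  # the work happens here, block by block
```

Because each block is processed independently, peak memory is bounded by the block size rather than the full array, which is the benefit the lazy interpolator brings.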


🔥 Deprecations
===============
8 changes: 2 additions & 6 deletions lib/iris/analysis/__init__.py
@@ -2687,9 +2687,7 @@ def interpolator(self, cube, coords):
the given coordinates.
Typically you should use :meth:`iris.cube.Cube.interpolate` for
-interpolating a cube. There are, however, some situations when
-constructing your own interpolator is preferable. These are detailed
-in the :ref:`user guide <caching_an_interpolator>`.
+interpolating a cube.
Parameters
----------
@@ -2890,9 +2888,7 @@ def interpolator(self, cube, coords):
by the dimensions of the specified coordinates.
Typically you should use :meth:`iris.cube.Cube.interpolate` for
-interpolating a cube. There are, however, some situations when
-constructing your own interpolator is preferable. These are detailed
-in the :ref:`user guide <caching_an_interpolator>`.
+interpolating a cube.
Parameters
----------
