Conversation

@Huite Huite commented Sep 24, 2025

For @jdelsman

Sample script:

# %%

import imod

# %%
path = "Z:/tmp/joost/v5_composiet_v1b_v2_nodsp_aut/v5_composiet_v1b_v2_nodsp.toml"
sim = imod.mf6.Modflow6Simulation.from_file(path)
submodel_labels = sim.create_partition_labels(npartitions=24)

# %%

sim_partitions = sim.split(submodel_labels, ignore_time_purge_empty=True)

The sim.split() takes about 70 seconds here, which seems reasonable (compared to 2 hours before?).

Huite added 2 commits September 24, 2025 21:05
In a nutshell: make sure DIS is in RAM, avoid re-loading data over and over, avoid unnecessary work.

Refactoring model splitter: Create a class to re-use data where possible.

* DO NOT PURGE AFTER PARTITION MODEL CREATION. If data is spatially unchunked, this forces loading the entire dataset into memory. This means performance degrades linearly with the number of partitions (since each partition requires a separate load) unless all data is loaded into memory at start (not feasible for large models).
* Instead, this implementation runs a groupby on the unpartitioned data (i.e. a single load) and counts the number of active elements per partition. If a package has zero active elements in a partition, it is omitted there. Some dispatching for point and line data (matches the clipping logic).
* Avoid looking for paired transport models; the mapping is fully known a priori since both use the same partitioning labels; keep them together in a NamedTuple (with some helpers).
* Added some trailing returns (matter of taste)
* Seems to increase performance from 2 hours (Joost) to 1 minute (me locally).
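The groupby-based counting from the second bullet can be sketched roughly as follows. This is an illustration only, not the actual imod implementation; the variable names and shapes are invented:

```python
import numpy as np
import xarray as xr

# Hypothetical sketch: count active (non-NaN) cells per partition label in a
# single pass over the unpartitioned data, instead of purging each partition
# after it has been created.
labels = xr.DataArray(np.array([0, 0, 1, 1, 2, 2]), dims="cell", name="label")
conductance = xr.DataArray(
    np.array([1.0, np.nan, np.nan, np.nan, 3.0, 4.0]), dims="cell"
)

# One groupby over the full data: non-NaN counts per partition label.
counts = conductance.notnull().groupby(labels).sum()

# A package whose count is zero for a given label can be omitted from that
# partition's model entirely.
```
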

The performance unfortunately still gets worse with each partition when writing the model. In principle, chunking fixes this (but requires data locality of unstructured meshes, can be done, see xugrid issues); alternatively, we might get dask to optimize writing. Needs research.
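The data-locality point can be illustrated with plain dask-backed xarray (sizes here are arbitrary): when chunks are spatially contiguous, a partition covering a contiguous face range only touches the chunks it overlaps, so the per-partition load no longer scales with total model size.

```python
import numpy as np
import xarray as xr

# Arbitrary sizes for illustration: 100_000 faces in contiguous chunks of
# 20_000.
da = xr.DataArray(np.arange(100_000.0), dims="face").chunk({"face": 20_000})

# A partition covering a contiguous face range maps onto a single chunk, so
# writing it loads 20_000 values regardless of the total model size.
part = da.isel(face=slice(0, 20_000))
```
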
@Huite Huite marked this pull request as draft September 24, 2025 19:16
Huite commented Sep 26, 2025

This script will re-order the data to a locality-preserving order, and then dump everything to TOML + Zarr:

# %%

import numpy as np
import imod
import xarray as xr

# %%

def morton_encode_2d(x, y):
    """
    Encode 2D coordinates to Morton codes.
    Uses 32 bits per dimension for 64-bit Morton codes.

    Parameters
    ----------
    x: np.ndarray of floats
    y: np.ndarray of floats

    Returns
    -------
    codes: np.ndarray
        Morton codes as uint64
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)

    min_x, max_x = np.min(x), np.max(x)
    min_y, max_y = np.min(y), np.max(y)

    # Add small padding to avoid edge cases
    x_range = max_x - min_x
    y_range = max_y - min_y
    padding = 1e-12  # Smaller padding for maximum resolution

    if x_range == 0:
        x_range = 1.0
        padding = 0.5
    if y_range == 0:
        y_range = 1.0
        padding = 0.5

    min_x -= x_range * padding
    max_x += x_range * padding
    min_y -= y_range * padding
    max_y += y_range * padding

    # Scale coordinates to [0, 2^32 - 1] for maximum resolution
    max_coord = 0xFFFFFFFF  # 2^32 - 1

    x_scaled = ((x - min_x) / (max_x - min_x) * max_coord).astype(np.uint64)
    y_scaled = ((y - min_y) / (max_y - min_y) * max_coord).astype(np.uint64)

    # Clamp to valid range
    x_scaled = np.clip(x_scaled, 0, max_coord)
    y_scaled = np.clip(y_scaled, 0, max_coord)

    # Dilate bits for 32-bit values - maximum resolution bit interleaving
    def dilate_bits_32(vals):
        """Dilate 32-bit values by inserting zeros between bits"""
        vals = vals & 0xFFFFFFFF  # Ensure 32-bit
        vals = (vals | (vals << 16)) & 0x0000FFFF0000FFFF
        vals = (vals | (vals << 8)) & 0x00FF00FF00FF00FF
        vals = (vals | (vals << 4)) & 0x0F0F0F0F0F0F0F0F
        vals = (vals | (vals << 2)) & 0x3333333333333333
        vals = (vals | (vals << 1)) & 0x5555555555555555
        return vals

    # Dilate and interleave: x gets even bit positions, y gets odd positions
    x_dilated = dilate_bits_32(x_scaled)
    y_dilated = dilate_bits_32(y_scaled)

    # Interleave by shifting y left by 1 and OR-ing with x
    morton_codes = x_dilated | (y_dilated << 1)

    return morton_codes


def reorder_morton(data):
    grid = data.ugrid.grid
    x, y = grid.centroids.T
    codes = morton_encode_2d(x, y)
    order = np.argsort(codes)
    return data.isel({grid.face_dimension: order})


def reorder_and_rechunk_simulation(
    sim: imod.mf6.Modflow6Simulation,
    dimension: str = "mesh2d_nFaces",
    chunks: dict[str, int] | None = None,
):
    if chunks is None:
        chunks = {"time": 10, dimension: 20_000}

    for model in sim.values():
        if isinstance(model, imod.mf6.model.Modflow6Model):
            for package in model.values():
                dims = package.dataset.dims
                if dimension in dims:
                    package_chunks = {k: v for k, v in chunks.items() if k in dims}
                    package.dataset = reorder_morton(package.dataset)
                    # Set encoding chunks which will end up in the zarr file.
                    if package_chunks:
                        package.dataset.encoding = {"chunks": package_chunks}

    return



# %%

path = "Z:/tmp/joost/v5_composiet_v1b_v2_nodsp_aut/v5_composiet_v1b_v2_nodsp.toml"
sim = imod.mf6.Modflow6Simulation.from_file(path)
reorder_and_rechunk_simulation(sim)
sim.dump("rechunked_zarrsim", validate=False, engine="zarr")
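As a standalone sanity check of the bit-interleaving idea: on a 2x2 grid, Morton codes should visit the cells in Z-order. This is a 16-bit toy version for illustration, not the 32-bit function from the script above:

```python
def interleave2(x: int, y: int) -> int:
    def dilate(v: int) -> int:
        # Insert a zero between every bit of a 16-bit value.
        v = (v | (v << 8)) & 0x00FF00FF
        v = (v | (v << 4)) & 0x0F0F0F0F
        v = (v | (v << 2)) & 0x33333333
        v = (v | (v << 1)) & 0x55555555
        return v

    # x occupies the even bit positions, y the odd ones.
    return dilate(x) | (dilate(y) << 1)


# On a 2x2 grid the Z-order curve visits (0,0), (1,0), (0,1), (1,1):
codes = [interleave2(x, y) for x, y in [(0, 0), (1, 0), (0, 1), (1, 1)]]
```
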
 

Huite commented Sep 26, 2025

Annoyingly, I get an error inside the function calls that .item() isn't supported on dask arrays, yet when I access the variable through the REPL it seems to work fine. (EDIT: this almost certainly works because the REPL calls the __repr__ method, which forces a compute somewhere.) It also becomes unhappy when it comes across the well package. Either way, we can sidestep this by forcing all non-spatial data into memory:

# %%
import imod

# %%

def load_non_spatial_data(sim, dimension: str = "mesh2d_nFaces"):
    for model in sim.values():
        if isinstance(model, imod.mf6.model.Modflow6Model):
            for package in model.values():
                if dimension not in package.dataset.dims:
                    package.dataset.load()
                for var in package.dataset.data_vars:
                    if dimension not in package.dataset[var].dims:
                        package.dataset[var].load()
        else:
            model.dataset.load()
    return

# %%

path = "rechunked-simulation/v5_composiet_v1b_v2_nodsp.toml"
sim = imod.mf6.Modflow6Simulation.from_file(path)
load_non_spatial_data(sim)
sim._validation_context = imod.mf6.ValidationSettings(strict_well_validation=False, ignore_time=True)

# %%

# Serial test
# sim.write("serial_simulation", validate=False)

submodel_labels = sim.create_partition_labels(npartitions=8)
sim_partitions = sim.split(submodel_labels, ignore_time_purge_empty=True)

sim_partitions._validation_context = imod.mf6.ValidationSettings(strict_well_validation=False, ignore_time=True)
sim_partitions.write("partitioned_simulation", validate=False)
# %%

For future to-dos: the Morton-curve ordering should become part of xugrid. See: Deltares/xugrid#389

It may make sense to wrap the ordering at the simulation level, as a method instead of a free function.
We should investigate the zarr stuff: I'm running into an issue where I cannot re-open a zip store of the recharge package while the other packages work fine; the only explanation I can think of right now is its size. We also need to investigate the .item() thing, which seems related to zarr as well -- I think zarr causes more aggressive chunking, whereas the netCDF engine might treat scalars somewhat differently? Not sure.
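For reference, the workaround pattern for the .item() error looks like this. A minimal sketch only, not tied to the zarr store; the variable name is invented:

```python
import xarray as xr

# A scalar option backed by dask cannot be extracted with .item() directly;
# computing it first (here via .load()) sidesteps the error.
ds = xr.Dataset({"save_flows": xr.DataArray(True)}).chunk({})
value = ds["save_flows"].load().item()
```
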

Huite commented Sep 27, 2025

Some general thoughts (I've had a little time to mull it over): there is a minor optimization we can make in the from_file method, which is to simply load all variables that are non-spatial. For these variables, having them dask-backed only adds overhead. This'll also solve the .item() issue.

Secondly, in get_unfiltered_options, we should probably use is_scalar as a predicate.
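A minimal version of such a predicate could look like this (a sketch only; the actual get_unfiltered_options internals are not shown in this thread):

```python
import numpy as np

def is_scalar(value) -> bool:
    """Sketch: treat Python scalars and 0-d arrays as scalar options."""
    return np.ndim(value) == 0
```
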
