paraprof is a robust and MPI-parallelised algorithm for computing profile likelihood projections – or more generally for mapping out low-dimensional projections of a target function by optimizing over the remaining dimensions.
First an initial global optimization identifies promising starting points. These starting points are used to activate initial grid points in the user-selected projection space. At each active grid point the remaining parameters are profiled (optimized) out using differential evolution (DE) and/or L-BFGS-B. If the optimized target value for a grid point meets the user-set threshold, the neighboring grid points are activated. This way the set of optimized grid points dynamically expands to cover the full region of interest, e.g. the high-likelihood region. Information from already converged grid points is communicated to neighboring grid points for faster convergence.
pip install git+https://github.com/anderkve/paraprof.gitOptional extras: pip install -e ".[viz]", ".[dev]", or ".[all]".
Requires Python 3.10+, NumPy, SciPy, and mpi4py (with an MPI implementation like OpenMPI or MPICH). Matplotlib and scikit-learn are optional.
Below are 1D and 2D projections of three 4D test functions used as example log-likelihood functions. Dimensions not shown on the axes are profiled out. The white star marks the best-fit point; the contours are 68% and 95% confidence regions (assuming Wilks' theorem). Plots come from examples/run_showcase_scan.py followed by examples/make_showcase_plots.py.
Himmelblau 4D: 264,617 target-function evaluations across all three projections.
Rosenbrock 4D: 63,904 evaluations.
Levy 4D: 117,757 evaluations. The last dimension uses sin(2π·w₃) rather than sin(π·w + 1), so the (x₀, x₃) projection has denser horizontal ridges than (x₀, x₁).
from mpi4py import MPI
from paraprof import (
ProfileProjector, run_all_projections, terminate_workers, worker_main,
get_test_function,
)
comm = MPI.COMM_WORLD
myrank = comm.Get_rank()
log_likelihood, bounds, _ = get_test_function("himmelblau_4d")
projections = [
{'dims': [0, 1], 'grid_points': [50, 50]},
]
if myrank == 0:
with ProfileProjector(
target_func=log_likelihood,
bounds=bounds,
projections=projections,
roi_threshold=4.0,
pop_per_grid_point=3,
) as sampler:
comm.bcast(sampler.target_func, root=0)
results = run_all_projections(
comm=comm, sampler=sampler, projections=projections,
save_plots=True,
)
terminate_workers(comm, myrank)
else:
worker_main(comm, myrank)Run with mpiexec -n 4 python your_script.py.
More examples live in examples/. The *_advanced.py scripts show the full advanced_config layout.
Rank 0 is the master: it owns the grid, hands out optimization jobs, and tracks which cells are active. Ranks 1+ are stateless workers that evaluate target_func on demand.
A scan proceeds roughly as follows:
- Lay a regular grid over the projection dimensions; the rest are profiled at each grid point.
- Run global L-BFGS-B starts to find initial maxima. Rather than firing a fixed number of Latin-hypercube starts and hoping it was enough, paraprof runs a rolling multistart with basin detection: each converged optimum is clustered online into a registry of distinct optima, and a Bayesian stopping rule (Boender–Rinnooy Kan, restricted to region-of-interest optima) halts the stage once the expected number of undiscovered ROI optima drops below a threshold — aborting any still-running optimizations at that point so their remaining evaluations aren't wasted.
n_initial_optimizationsbecomes the upper bound. On later projections these are seeded from an in-memory pool built up by earlier projections, often skipping the global step entirely. - Anchor a DE population (or an initial point for L-BFGS-B optimization) at each promising cell. One population slot is filled with the highest-fitness past evaluation nearest the cell (proximity warm-start), so later projections inherit useful starting points.
- Optimize the profiled parameters at each active cell. L-BFGS-B cells also reuse the best already-converged neighbour's quasi-Newton history and its best profiled parameters as an alternative start, propagating local curvature outward.
- Activate the neighbours of high-likelihood cells, expanding the active set into the region of interest.
- Patching: re-test each cell with its neighbours' best profiled parameters and polish any improvement.
- Suspect recheck: cells whose profiled parameters look discontinuous compared to their neighbourhood are re-optimized from diverse seed points. This breaks contiguous wrong-optimum strips that the fitness-only filter of the patching step cannot fix.
- Optionally refine the grid: interpolate the coarse grid to warm-start a finer one.
Key code paths: ProfileProjector (sampler.py) holds state and configuration, master_main() (master.py) is the state machine, worker_main() (worker.py) is the evaluation loop, and jobs/ contains the asynchronous optimization jobs.
Common constructor arguments:
roi_threshold— region-of-interest cutoff in log-likelihood; cells withlogL > global_max - roi_thresholdare inside the ROI. Default 4.0.pop_per_grid_point— DE population size per cell. Default 3.n_initial_optimizations— cap on global L-BFGS-B starts before grid optimization. Defaultmin(400, 50 * n_dims): a safe ceiling, since the Bayesian stopping rule controls the actual spend, so set it generously. (If you disable early stopping withbasin_detection.undiscovered_threshold = 0, this becomes a fixed count, so set it explicitly.)max_patching_waves— cap on patching iterations. Default 10.lbfgsb_max_iter,lbfgsb_polish— L-BFGS-B iteration cap and whether to polish DE results. Defaults 50 andTrue.n_optima— prior on how many optima the target has globally; use only when sure it has one or a few. Stops the initial multistart as soon as that many distinct optima are found (the global max is then among them, so themin_startsfloor is skipped —n_optima=1stops after the first converged start), or keeps it running until at least that many are found. Pass anint(exact) or{'min': int, 'max': int}. Big saver on multimodal targets where the Bayesian rule would otherwise run to the cap.initial_points— explicit starting points to activate; useful when you already know where the good regions are.use_clustering— detect multiple modes during refinement. DefaultTrue.refinement_direct_eval— skip optimization in the refinement run and just evaluate the interpolated point. DefaultFalse.samples_output_file— path to log every evaluation as[params…, logL, phase]; thephasecolumn tags which algorithm stage produced each point (see Sample file formats). The format is chosen from the extension:.csvfor plain text or.h5/.hdf5for HDF5 binary (needs the optionalh5pydependency).warm_start_file— sample file read at the start of each projection to pre-populateinitial_maxima. Any supported format is accepted (the extension selects the reader). Set this equal tosamples_output_fileto feed a run's samples into the next one.grad_func— analytic gradient (see below).
By default L-BFGS-B uses finite differences, which costs 2N (central difference) or N (forward difference) extra calls per gradient. If you have an analytic gradient, pass it via grad_func:
def target(p):
return -float(np.sum(p**2)) # log-likelihood, maximized
def grad(p):
return -2.0 * np.asarray(p) # ∇target_func
sampler = ProfileProjector(target_func=target, grad_func=grad, ...)grad_func returns the gradient of the function being maximized; ParaProf negates internally. Getting the sign wrong sends L-BFGS-B uphill.
You can return either a length-n_dims array or a {dim_index: value} dict for partial gradients. Entries that are NaN/±inf (or dims missing from the dict) fall back to finite differences using lbfgsb.gradient_method.
Only the L-BFGS-B paths use the gradient; DE is gradient-free. sampler.target_calls_saved_by_user_gradient and sampler.user_gradient_errors track usage and fallbacks.
Each evaluated point is one row of the sample file: [params…, logL, phase] — the n_dims parameter values, the target value, and an integer phase tag for the algorithm stage that produced the point. The whole run logs to one file; filtering on the phase column recovers any subset (e.g. the volume-sampling points). The format follows the samples_output_file extension:
| Extension | Format | Notes |
|---|---|---|
.csv (default) |
plain text | Headerless, %.10e columns. No extra dependencies. |
.h5 / .hdf5 |
HDF5 binary | ~Half the size and faster I/O. Requires h5py (pip install paraprof[hdf5]). |
Phase values (paraprof.PHASE_*, with descriptions in paraprof.PHASE_LEGEND):
| Phase | Constant | Meaning |
|---|---|---|
0 |
PHASE_INITIAL |
initial optimization (global maxima / basin detection) |
1 |
PHASE_SCAN |
projection-grid scan (activation, DE, grid polish) |
2 |
PHASE_REFINE |
grid refinement and patching |
3 |
PHASE_SUSPECT |
suspect-cell recheck |
4 |
PHASE_VOLUME |
volume sampling (ensemble of MCMC walkers) |
Both formats round-trip through warm_start_file, so a run can append to and re-read its own file (the warm-start reader reads params and logL by position and ignores the phase column). They differ in crash safety: CSV re-opens per flush, so a crash loses at most the un-flushed buffer, while HDF5 keeps the file open and can truncate the final chunk on a hard kill — prefer CSV if that matters more than file size.
To pool several independent runs (or convert between formats), use combine_samples. It streams chunk by chunk and may mix formats:
import glob
import numpy as np
from paraprof import combine_samples, read_samples, write_samples, PHASE_VOLUME
combine_samples(glob.glob("run_*/samples.*"), "all_samples.h5")
samples = read_samples("all_samples.h5") # (n_samples, n_dims + 2) array
n_dims = samples.shape[1] - 2
params, target, phase = samples[:, :n_dims], samples[:, n_dims], samples[:, -1]
write_samples(samples[target > target.max() - 4.0], "roi.csv") # filter by logL
volume_samples = samples[phase == PHASE_VOLUME] # filter by phaseread_samples/write_samples are the one-shot load/save pair (write_samples refuses to clobber an existing file unless overwrite=True). For very large files, iterate with paraprof.sample_io.iter_sample_batches(path) instead of loading the whole array.
A profile-likelihood scan samples only the low-dimensional profile surfaces — a vanishingly thin slice of the full good-fit volume. To also get points throughout the region of interest (or just outside it, to see why nearby regions fail), enable the volume-sampling stage that runs after the projections:
sampler = ProfileProjector(
...,
samples_output_file="samples.csv", # walkers warm-start from these
volume_sampling={
'n_walkers': 1000, # ensemble size
'n_steps': 30, # steps per walker
'output_file': "volume_samples.csv",
# 'roi_threshold': 25.0, # widen the stage's ROI to also sample
# # just outside the good-fit region
},
)The stage fills the region of interest {logL > global_max − roi_threshold} with samples spread through the volume and across fit quality (not piled up near the threshold). It runs an affine-invariant ensemble of MCMC walkers (Goodman–Weare stretch move), each pinned to a target logL level so the ensemble spans the full range. Walkers are seeded inside the converged profile grids — which bound logL from above — and warm-started from the logged scan samples. See examples/run_volume_sampling.py.
Every evaluation streams into the main sample log under PHASE_VOLUME. The stage also writes:
output_file(.csv/.h5) — the in-ROI samples, one row[params..., logL]each.summary_file(JSON) — walker/step/evaluation counts, in-ROI fraction, mean acceptance, and the logL histogram.sampler.volume_stage_result— the same in memory, plus thesamplesarray.
Useful knobs:
n_walkers— ensemble size; the output sample count scales with it. Default 1000.n_steps— steps per walker; longer chains mix better. Default 30.eval_budget— cap on total evaluations. Defaultn_walkers·(n_steps+1).sigma_frac— how tightly each walker is held to its logL level, as a fraction ofroi_threshold; smaller sharpens the logL spread but lowers acceptance. Default 0.1.roi_threshold— the stage's own ROI cutoff in logL; larger reaches just outside the good-fit region. Defaults to the projection's value.partner_level_window— restrict a walker's stretch-move partners to those near its logL level. DefaultNone.warm_start— seed walkers from logged scan samples rather than fresh draws. DefaultTrue.
Plot with plot_volume_samples(sampler.volume_stage_result, dims=(0, 1), filename=..., grid_solution=...) to scatter the samples (coloured by logL) over a 2D profile map.
The samples are MCMC draws, so they are autocorrelated — read them as well-spread coverage, not an i.i.d. draw. The top of the logL range can be under-filled (raise n_steps/n_walkers), islands the scan missed can stay missed, and in high dimensions they are representatives rather than a dense filling.
Pass an advanced_config dict for the knobs that actually move solution quality or are real iteration budgets:
| Key | Default | What it does |
|---|---|---|
memory_size |
max(grid_sizes) * 25 |
DE F/CR adaptation memory size |
convergence_threshold |
1e-6 |
DE per-cell convergence cutoff |
de.convergence_window |
3 |
Generations of no-improvement before DE declares convergence |
de.allow_early_DE_exit |
True |
Smooth-interior cells (neighbour argmax agreement) skip the DE search — 1 DE generation then polish, instead of the full window (see below). Set False for multimodal-inner targets |
de.num_generations |
100000 |
Hard cap on DE generations |
de.max_num_to_evolve |
None |
Cap on cells evolved per generation |
lbfgsb.ftol |
1e-9 |
L-BFGS-B function tolerance |
lbfgsb.gradient_method |
'forward' |
'forward' or 'central' (~50% more calls) |
clustering.* |
auto-DBSCAN | Mode detection during refinement |
cross_projection.proximity_warm_start |
True |
Swap one LHS seed for the best nearby past evaluation. |
cross_projection.pool_seeded_initial_maxima |
True |
Seed initial_maxima from the pool on later projections and skip the global L-BFGS-B starts. |
suspect_recheck.enabled |
True |
Run the suspect-cell recheck pass after patching. |
suspect_recheck.max_waves |
10 |
Cap on suspect-recheck waves. |
suspect_recheck.param_k |
3.0 |
MAD multiplier for the discontinuity threshold. Lower flags more cells. |
suspect_recheck.max_fraction |
0.25 |
Hard cap on the fraction of ROI cells flagged per wave. |
suspect_recheck.seeds_k_ring |
3 |
Max Chebyshev radius for extended-neighbour seeds. |
suspect_recheck.seeds_from_pool |
3 |
Cross-projection pool seeds tested per suspect cell. |
suspect_recheck.polish_threshold |
1e-3 |
Min logL improvement to trigger the L-BFGS-B polish. |
basin_detection.batch_size |
None |
Optimizations kept in flight at once in the rolling multistart. None = FD-aware auto (≈ n_workers / per-gradient finite-difference fan-out, floored at 2). |
basin_detection.undiscovered_threshold |
0.5 |
Stop once the expected number of undiscovered ROI optima falls below this. Higher = stops sooner; 0 disables early stopping (the stage then runs the full n_initial_optimizations). |
basin_detection.min_starts |
None |
Minimum starts before the stopping rule may fire. None = max(10, 3·n_dims) (capped at n_initial_optimizations). |
Usage: with basin detection on, set n_initial_optimizations generously — it caps the worst case, while the stopping rule keeps the actual spend proportional to how multimodal the target turns out to be. Easy targets stop early; hard ones use the budget.
Early exit from the DE search on smooth cells (de.allow_early_DE_exit, on by default). Every active grid cell normally spends at least de.convergence_window DE generations just confirming convergence — budget largely wasted on the smooth ROI interior. With de.allow_early_DE_exit=True (the default), a freshly activated cell whose in-population neighbours agree on the profiled argmax (and whose neighbour warm-start was the best activation seed) runs a single DE generation then goes straight to the L-BFGS-B polish. That generation still runs, so the exit is self-correcting — a cell that improves keeps evolving. A replicate study (examples/run_allow_early_de_exit_replicate_study.py) shows a clean win on unimodal-inner targets — Himmelblau-4D −13.7% and Rosenbrock-4D −10.9% target calls (both p < 0.01), ROI quality indistinguishable from baseline. Set de.allow_early_DE_exit=False for a genuinely multimodal inner problem (e.g. Rastrigin-4D), where one DE generation under-explores the modes and ROI quality degrades; the default suits targets where the profiled-out dimensions enter smoothly (e.g. Gaussian-constrained nuisances).
See the ProfileProjector docstring for the full structure. Several DE knobs that did not change ROI quality in benchmarking (mutation_strategy, pbest_fraction, neighbor_pull_probability, global_pool_size, patching.n_neighbors, activation.mix_ratios) are module-level constants in sampler.py and are intentionally not user-tunable.
Each projection is a dict. Required: dims (parameter indices) and grid_points (resolution per dimension). Optional: optimization_method ('de' or 'lbfgsb', default 'de'), patch_coarse_grid (default True), patch_refined_grid (default False), grid_refinement_factor (integer > 1 to enable refinement), and refinement_method (default 'linear').
With save_plots=True, paraprof writes 1D line plots, 2D heatmaps with contours, and pairwise 2D slices for higher-dimensional projections (either the max slice or marginalized). It also writes plots of the optimal profiled-parameter values across the projection grid, which is useful for seeing what the profiling actually did.
Override the defaults with a plot_settings dict:
plot_settings = {
'dpi': 300,
'filetype': 'png',
'slice_mode': 'max', # or 'all' for marginalization (3D+)
'vmin': -4.0,
'vmax': 0.0,
'plot_profiled_params': True,
'output_dir': '.', # created automatically if missing
}pytest tests/ -v
pytest tests/ -v --cov=src/paraprof --cov-report=term-missingparaprof/
├── src/paraprof/
│ ├── sampler.py # ProfileProjector (state + config)
│ ├── master.py # Master event loop
│ ├── worker.py # Worker event loop
│ ├── jobs/ # Optimization jobs (LBFGSB, DE, activation, patching)
│ ├── interpolation.py # Grid interpolation + clustering for refinement
│ ├── visualization.py # Plot helpers
│ ├── nuisance_wrapper.py # Wrap test functions with nuisance parameters
│ ├── test_functions.py # Himmelblau, Rosenbrock, Rastrigin, etc.
│ ├── exceptions.py
│ └── logger.py
├── tests/
└── examples/ # `*_advanced.py` exercises advanced_config
MIT — see LICENSE.
@software{paraprof2025,
title = {paraprof},
author = {Kvellestad, Anders},
year = {2025},
url = {https://github.com/anderkve/paraprof}
}Maintainer: Anders Kvellestad.









