Merge pull request #1269 from Libensemble/release/v_1.2.2

Release/v 1.2.2

shuds13 authored Mar 21, 2024
2 parents 7c59373 + 68bb912, commit fdbd062
Showing 27 changed files with 319 additions and 119 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/basic.yml
@@ -167,4 +167,4 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
-      - uses: crate-ci/typos@v1.18.2
+      - uses: crate-ci/typos@v1.19.0
2 changes: 1 addition & 1 deletion .github/workflows/extra.yml
@@ -250,4 +250,4 @@ jobs:
     runs-on: ubuntu-latest
    steps:
       - uses: actions/checkout@v4
-      - uses: crate-ci/typos@v1.18.2
+      - uses: crate-ci/typos@v1.19.0
4 changes: 2 additions & 2 deletions .wci.yml
@@ -16,8 +16,8 @@ description: |
 language: Python

 release:
-  version: 1.2.1
-  date: 2024-02-23
+  version: 1.2.2
+  date: 2024-03-21

 documentation:
   general: https://libensemble.readthedocs.io
24 changes: 24 additions & 0 deletions CHANGELOG.rst
@@ -8,6 +8,30 @@ GitHub issues are referenced, and can be viewed with hyperlinks on the `github releases page`_

 .. _`github releases page`: https://github.com/Libensemble/libensemble/releases

+Release 1.2.2
+--------------
+
+:Date: March 21, 2024
+
+* Bugfix: Some `libE_specs` were not passed through correctly when added after ensemble initialization. #1264
+* `platform_specs` options are now merged with detected platforms, rather than replacing. #1265
+* Ensure simulation directories are created when `sim_input_dir` is specified, likewise for gen dirs. #1266
+
+Example user functions:
+
+* Improved structure of gpCAM generator. #1260
+
+:Note:
+
+* Tests were run on Linux and MacOS with Python versions 3.9, 3.10, 3.11, 3.12
+* Heterogeneous workflows tested on Frontier (OLCF), Polaris (ALCF), and Perlmutter (NERSC).
+* Note that tests have been recently run on Aurora (ALCF), but the system was unavailable at time of release.
+* Tests were also run on Bebop and Improv LCRC systems.
+
+:Known Issues:
+
+* See known issues section in the documentation.
+
 Release 1.2.1
 --------------
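For a rough sense of how the first and third fixes above surface to users, here is a minimal sketch (illustrative only, not part of this commit; it assumes the ``Ensemble`` and ``LibeSpecs`` interfaces of libEnsemble v1.x and a hypothetical input path)::

    from libensemble import Ensemble
    from libensemble.specs import LibeSpecs

    ensemble = Ensemble(parse_args=True)

    # Specs assigned after initialization are now passed through correctly
    # (#1264), and specifying sim_input_dir alone is enough for the
    # simulation directories to be created (#1266). The path is hypothetical.
    ensemble.libE_specs = LibeSpecs(sim_input_dir="./sim_inputs")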
2 changes: 2 additions & 0 deletions docs/data_structures/libE_specs.rst
@@ -128,6 +128,7 @@ libEnsemble is primarily customized by setting options within a ``LibeSpecs`` class

     **sim_input_dir** [str]:
         Copy this directory's contents into the working directory upon calling the simulation function.
+        Forms the base of a simulation directory.

 .. tab-item:: Gens

@@ -145,6 +146,7 @@ libEnsemble is primarily customized by setting options within a ``LibeSpecs`` class

     **gen_input_dir** [str]:
         Copy this directory's contents into the working directory upon calling the generator function.
+        Forms the base of a generator directory.

 .. tab-item:: Profiling
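For context, a minimal sketch of the two options documented above (a sketch only — the paths are hypothetical)::

    from libensemble.specs import LibeSpecs

    # Each simulation/generator working directory is seeded with a copy of
    # the corresponding input directory, which also forms its base.
    specs = LibeSpecs(
        sim_input_dir="./sim_inputs",  # hypothetical path
        gen_input_dir="./gen_inputs",  # hypothetical path
    )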
30 changes: 15 additions & 15 deletions docs/dev_guide/release_management/release_platforms/rel_spack.rst
@@ -5,13 +5,12 @@ A workflow for updating libEnsemble on Spack

 This assumes you have already:

-- made a PyPI package for new version of libEnsemble and
+- made a PyPI package for the new libEnsemble version and
 - made a GitHub fork of Spack and cloned it to your local system.

 Details on how to create forks can be found at https://help.github.com/articles/fork-a-repo.

-You now have a configuration like that shown at (but without the upstream/local connection).
-https://stackoverflow.com/questions/6286571/are-git-forks-actually-git-clones.
+You now have a configuration like that shown at https://stackoverflow.com/a/6286877/6346040.

 Upstream, in this case, is the official Spack repository on GitHub. Origin is
 your fork on GitHub, and Local Machine is your local clone (from your fork).
@@ -33,7 +32,7 @@ To set upstream repo::
   git remote set-url --push upstream no_push
   git remote -v # Check for line: `upstream no_push (push)`

-Updating (the main develop branch)
+Updating (the develop branch)
 ----------------------------------

 You will now update your local machine from the upstream repo (if in doubt,
@@ -54,16 +53,17 @@ Fetch from the upstream repo::
 To update your local machine, you may wish to rebase or overwrite your local files.
 Select from the following:

-If you have local changes to go "on top" of latest code::
+Now make your local machine identical to the upstream repo (**WARNING:** Any local changes will be lost!)::

-  git rebase upstream/develop
+  git reset --hard upstream/develop

-Or to make your local machine identical to upstream repo (**WARNING:** Any local changes will be lost!)::
+Alternatively, if you have existing local changes to go "on top" of the latest
+code (usually we will make our release updates after this)::

-  git reset --hard upstream/develop
+  git rebase upstream/develop

 (Optional) You may want to update your forked (origin) repo on GitHub at this point.
-This may requires a forced push::
+This may require a forced push::

   git push origin develop --force
@@ -84,11 +84,11 @@ Quick example to update libEnsemble
 This will open the libEnsemble ``package.py`` file in your editor (given by
 environment variable ``EDITOR``)::

-  spack edit py-libensemble   # SPACK_ROOT must be set (see above) (python packages use "py-" prefix)
+  spack edit py-libensemble   # SPACK_ROOT must be set (see above) (Python packages use "py-" prefix)

 Or just open it manually: ``var/spack/repos/builtin/packages/py-libensemble/package.py``.

-Now get checksum for new lines:
+Now get the checksum for new lines:

 Get the tarball (see PyPI instructions), for the new release and use::
@@ -102,13 +102,13 @@ Check package::

   spack style

-This will install a few python spack packages and run style checks on just
+This will install a few Python Spack packages and run style checks on just
 your changes. Make adjustments if needed, until this passes.

 If okay - add, commit, and push to origin (forked repo). For example, if your version
-number is 0.9.1::
+number is 1.2.2::

-  git commit -am "libEnsemble: add v0.9.1"
+  git commit -am "libEnsemble: add v1.2.2"
   git push origin develop --force

 Once the branch is pushed to the forked repo, go to GitHub and do a pull request from this
 branch on the fork to the develop branch on the upstream.

@@ -117,7 +117,7 @@
 Express Summary: Make Fork Identical to Upstream
 ------------------------------------------------

-Quick summary for bringing develop branch on forked repo up to speed with upstream
+Quick summary for bringing the develop branch on a forked repo up to speed with upstream
 (YOU WILL LOSE ANY CHANGES)::

   git remote add upstream https://github.com/spack/spack.git
2 changes: 2 additions & 0 deletions docs/dev_guide/release_management/release_process.rst
@@ -29,6 +29,8 @@ Before release
 - ``MANIFEST.in`` is checked. Locally, try out ``python setup.py sdist`` and check created tarball
   contains correct files and directories for PyPI package.

+- Locally check that the example code in the README works with both local comms and mpi4py.
+
 - Tests are run with source to be released (this may iterate):

   - On-line CI (GitHub Actions) tests must pass.
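A sketch of that local README check (the script name is hypothetical; the flags follow libEnsemble's standard ``parse_args`` conventions)::

    # Local comms: a manager and four workers in one process group.
    python readme_example.py --comms local --nworkers 4

    # mpi4py comms: five ranks = one manager + four workers.
    mpirun -np 5 python readme_example.py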
4 changes: 2 additions & 2 deletions docs/platforms/bebop.rst
@@ -140,9 +140,9 @@ Additional Information
 See the LCRC Bebop docs here_ for more information about Bebop.

 .. _Anaconda: https://www.anaconda.com/
-.. _Bebop: https://www.lcrc.anl.gov/systems/resources/bebop/
+.. _Bebop: https://www.lcrc.anl.gov/systems/bebop
 .. _conda: https://conda.io/en/latest/
-.. _here: https://www.lcrc.anl.gov/for-users/using-lcrc/running-jobs/running-jobs-on-bebop/
+.. _here: https://docs.lcrc.anl.gov/bebop/running-jobs-bebop/
 .. _mpi4py: https://mpi4py.readthedocs.io/en/stable/
 .. _options: https://slurm.schedmd.com/srun.html
 .. _Slurm: https://slurm.schedmd.com/
2 changes: 1 addition & 1 deletion docs/platforms/improv.rst
@@ -68,4 +68,4 @@ You can install mpi4py as usual having installed the Open-MPI module::
 Note if using ``mpi4py`` comms with Open-MPI, you may need to set ``export OMPI_MCA_coll_hcoll_enable=0``
 to prevent HCOLL warnings.

-.. _Improv: https://www.lcrc.anl.gov/for-users/using-lcrc/running-jobs/running-jobs-on-improv/
+.. _Improv: https://docs.lcrc.anl.gov/improv/running-jobs-improv/
3 changes: 1 addition & 2 deletions docs/platforms/platforms_index.rst
@@ -42,7 +42,7 @@ On systems with a job scheduler, libEnsemble is typically run within a single
 :doc:`job submission<example_scripts>`. All user simulations will run on
 the nodes within that allocation.

-*How does libensemble know where to run tasks (user applications)?*
+*How does libEnsemble know where to run tasks (user applications)?*

 The libEnsemble :doc:`Executor<../executor/ex_index>` can be initialized from the user calling
 script, and then used by workers to run tasks. The Executor will automatically detect the nodes

@@ -221,7 +221,6 @@ libEnsemble on specific HPC systems.
   example_scripts

 .. _Balsam: https://balsam.readthedocs.io/en/latest/
-.. _Cooley: https://www.alcf.anl.gov/support-center/cooley
 .. _Globus Compute: https://www.globus.org/compute
 .. _Globus Compute endpoints: https://globus-compute.readthedocs.io/en/latest/endpoints.html
 .. _Globus: https://www.globus.org/
2 changes: 1 addition & 1 deletion docs/resource_manager/resource_detection.rst
@@ -40,4 +40,4 @@ undesirably scheduled to the same nodes.
 System detection for resources can be overridden using the :ref:`resource_info<resource_info>`
 :class:`libE_specs<libensemble.specs.LibeSpecs>` option.

-.. _Cooley: https://www.alcf.anl.gov/support-center/cooley
+.. _Cooley: https://www.alcf.anl.gov/alcf-resources/cooley
2 changes: 1 addition & 1 deletion install/misc_feature_requirements.txt
@@ -1 +1 @@
-globus-compute-sdk==2.13.0
+globus-compute-sdk==2.15.0
6 changes: 3 additions & 3 deletions install/testing_requirements.txt
@@ -1,9 +1,9 @@
 flake8==7.0.0
 coverage==7.3.1
-pytest==8.0.1
+pytest==8.1.1
 pytest-cov==4.1.0
-pytest-timeout==2.2.0
+pytest-timeout==2.3.1
 mock==5.1.0
-python-dateutil==2.8.2
+python-dateutil==2.9.0.post0
 anyio==4.3.0
 matplotlib==3.8.3
10 changes: 8 additions & 2 deletions libensemble/ensemble.py
@@ -326,8 +326,14 @@ def libE_specs(self, new_specs):
             return

         # Cast new libE_specs temporarily to dict
-        if not isinstance(new_specs, dict):
-            new_specs = specs_dump(new_specs, by_alias=True, exclude_none=True, exclude_unset=True)
+        if not isinstance(new_specs, dict):  # exclude_defaults should only be enabled with Pydantic v2
+            platform_specs_set = False
+            if new_specs.platform_specs != {}:  # bugginess across Pydantic versions for recursively casting to dict
+                platform_specs_set = True
+                platform_specs = new_specs.platform_specs
+            new_specs = specs_dump(new_specs, exclude_none=True, exclude_defaults=True)
+            if platform_specs_set:
+                new_specs["platform_specs"] = specs_dump(platform_specs, exclude_none=True)

         # Unset "comms" if we already have a libE_specs that contains that field, that came from parse_args
         if new_specs.get("comms") and hasattr(self._libE_specs, "comms") and self.parsed:
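To illustrate the setter path above, a minimal sketch (illustrative only; assumes the ``Ensemble``, ``LibeSpecs``, and ``Platform`` classes of libEnsemble v1.x)::

    from libensemble import Ensemble
    from libensemble.specs import LibeSpecs
    from libensemble.resources.platforms import Platform

    ensemble = Ensemble()

    # platform_specs is dumped separately (without exclude_defaults), so a
    # partially filled Platform survives the cast to dict and is later merged
    # with the detected platform rather than replacing it (#1264, #1265).
    ensemble.libE_specs = LibeSpecs(platform_specs=Platform(gpus_per_node=4))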
39 changes: 28 additions & 11 deletions libensemble/executors/mpi_runner.py
@@ -37,7 +37,7 @@ def __init__(self, run_command="mpiexec", platform_info=None):
         self.arg_nnodes = ("--LIBE_NNODES_ARG_EMPTY",)
         self.arg_ppn = ("--LIBE_PPN_ARG_EMPTY",)
         self.default_mpi_options = None
-        self.default_gpu_arg = None
+        self.default_gpu_args = None
         self.default_gpu_arg_type = None
         self.platform_info = platform_info
@@ -126,16 +126,32 @@ def _set_gpu_env_var(self, wresources, task, gpus_per_node, gpus_env):

     def _local_runner_set_gpus(self, task, wresources, extra_args, gpus_per_node, ppn):
         """Set default GPU setting for MPI runner"""
-        if self.default_gpu_arg is not None:
-            arg_type = self.default_gpu_arg_type
+
+        arg_type = self.default_gpu_arg_type
+        if arg_type is not None:
             gpu_value = gpus_per_node // ppn if arg_type == "option_gpus_per_task" else gpus_per_node
-            gpu_setting_name = self.default_gpu_arg
+            gpu_setting_name = self.default_gpu_args[arg_type]
+            jassert(gpu_setting_name is not None, f"No default gpu_setting_name for {arg_type}")
             extra_args = self._set_gpu_cli_option(wresources, extra_args, gpu_setting_name, gpu_value)
         else:
             gpus_env = "CUDA_VISIBLE_DEVICES"
             self._set_gpu_env_var(wresources, task, gpus_per_node, gpus_env)
         return extra_args

+    def _get_default_arg(self, gpu_setting_type):
+        """Return default setting for the given gpu_setting_type if it exists, else error"""
+        jassert(
+            gpu_setting_type in ["option_gpus_per_node", "option_gpus_per_task"],
+            f"Unrecognized gpu_setting_type {gpu_setting_type}",
+        )
+        jassert(
+            self.default_gpu_args is not None,
+            "The current MPI runner has no default command line option for setting GPUs",
+        )
+        gpu_setting_name = self.default_gpu_args[gpu_setting_type]
+        jassert(gpu_setting_name is not None, f"No default GPU setting for {gpu_setting_type}")
+        return gpu_setting_name
+
     def _assign_gpus(self, task, resources, nprocs, nnodes, ppn, ngpus, extra_args, match_procs_to_gpus):
         """Assign GPU resources to slots, limited by ngpus if present.
@@ -199,7 +215,7 @@ def _assign_gpus(self, task, resources, nprocs, nnodes, ppn, ngpus, extra_args,

         elif gpu_setting_type in ["option_gpus_per_node", "option_gpus_per_task"]:
             gpu_value = gpus_per_node // ppn if gpu_setting_type == "option_gpus_per_task" else gpus_per_node
-            gpu_setting_name = self.platform_info.get("gpu_setting_name", self.default_gpu_arg)
+            gpu_setting_name = self.platform_info.get("gpu_setting_name", self._get_default_arg(gpu_setting_type))
             extra_args = self._set_gpu_cli_option(wresources, extra_args, gpu_setting_name, gpu_value)

         elif gpu_setting_type == "env":
@@ -319,7 +335,7 @@ def __init__(self, run_command="mpirun", platform_info=None):
         self.arg_nnodes = ("--LIBE_NNODES_ARG_EMPTY",)
         self.arg_ppn = ("--ppn", "-ppn")
         self.default_mpi_options = None
-        self.default_gpu_arg = None
+        self.default_gpu_args = None
         self.default_gpu_arg_type = None
         self.platform_info = platform_info
@@ -343,7 +359,7 @@ def __init__(self, run_command="mpirun", platform_info=None):
         self.arg_nnodes = ("--LIBE_NNODES_ARG_EMPTY",)
         self.arg_ppn = ("-npernode",)
         self.default_mpi_options = None
-        self.default_gpu_arg = None
+        self.default_gpu_args = None
         self.default_gpu_arg_type = None
         self.platform_info = platform_info
         self.mpi_command = [
@@ -388,7 +404,7 @@ def __init__(self, run_command="aprun", platform_info=None):
         self.arg_nnodes = ("--LIBE_NNODES_ARG_EMPTY",)
         self.arg_ppn = ("-N",)
         self.default_mpi_options = None
-        self.default_gpu_arg = None
+        self.default_gpu_args = None
         self.default_gpu_arg_type = None
         self.platform_info = platform_info
         self.mpi_command = [
@@ -410,7 +426,7 @@ def __init__(self, run_command="mpiexec", platform_info=None):
         self.arg_nnodes = ("--LIBE_NNODES_ARG_EMPTY",)
         self.arg_ppn = ("-cores",)
         self.default_mpi_options = None
-        self.default_gpu_arg = None
+        self.default_gpu_args = None
         self.default_gpu_arg_type = None
         self.platform_info = platform_info
         self.mpi_command = [
@@ -431,8 +447,9 @@ def __init__(self, run_command="srun", platform_info=None):
         self.arg_nnodes = ("-N", "--nodes")
         self.arg_ppn = ("--ntasks-per-node",)
         self.default_mpi_options = "--exact"
-        self.default_gpu_arg = "--gpus-per-task"
         self.default_gpu_arg_type = "option_gpus_per_task"
+        self.default_gpu_args = {"option_gpus_per_task": "--gpus-per-task", "option_gpus_per_node": "--gpus-per-node"}
+
         self.platform_info = platform_info
         self.mpi_command = [
             self.run_command,
@@ -453,8 +470,8 @@ def __init__(self, run_command="jsrun", platform_info=None):
         self.arg_nnodes = ("--LIBE_NNODES_ARG_EMPTY",)
         self.arg_ppn = ("-r",)
         self.default_mpi_options = None
-        self.default_gpu_arg = "-g"
         self.default_gpu_arg_type = "option_gpus_per_task"
+        self.default_gpu_args = {"option_gpus_per_task": "-g", "option_gpus_per_node": None}
+
         self.platform_info = platform_info
         self.mpi_command = [self.run_command, "-n {num_procs}", "-r {procs_per_node}", "{extra_args}"]
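The per-runner ``default_gpu_args`` dictionaries above replace the old single ``default_gpu_arg`` string, so a platform request can name either a per-node or per-task GPU option and the runner supplies the matching flag. A self-contained sketch of the lookup (illustrative only — plain asserts standing in for ``jassert``)::

    # Simplified stand-in for the srun runner's defaults (illustrative).
    default_gpu_args = {"option_gpus_per_task": "--gpus-per-task", "option_gpus_per_node": "--gpus-per-node"}

    def get_default_arg(gpu_setting_type):
        """Return the default command line flag for the requested GPU setting type."""
        assert gpu_setting_type in default_gpu_args, f"Unrecognized gpu_setting_type {gpu_setting_type}"
        flag = default_gpu_args[gpu_setting_type]
        assert flag is not None, f"No default GPU setting for {gpu_setting_type}"
        return flag

    print(get_default_arg("option_gpus_per_node"))  # prints --gpus-per-node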