Conversation

@aerorahul
Contributor

@aerorahul aerorahul commented Dec 23, 2025

Description

This PR:

  • consolidates build_compute.sh into build_all.sh so that the user can choose where to build. The -c option makes the script submit the build jobs to the BQS.

When build_all.sh is invoked as before, its previous behavior is retained, i.e. the build is performed on the current login node.

In addition, this PR updates the use of build_all.sh in the rest of the repository.

If a user wishes to change the options for building the various components, they can do so via sorc/build_opts.yaml. This includes updating cores used for building as well as other options.

The build progress is presented in a tabular form that gets updated every 60s.

Building on head node as requested ...
------------------------------------------------
| System             | Status     |
------------------------------------------------
| gsi_enkf           | RUNNING    |
| gcafs_model        | RUNNING    |
| nexus              | PENDING    |
| gfs_model          | PENDING    |
| gfs_utils          | PENDING    |
| gefs_model         | PENDING    |
| gefs_ww3_prepost   | RUNNING    |
| gdas               | PENDING    |
| ufs_utils          | PENDING    |
| gfs_ww3prepost     | PENDING    |
| gsi_utils          | PENDING    |
| gsi_monitor        | PENDING    |
| sfs_model          | PENDING    |
| upp                | PENDING    |
------------------------------------------------
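
A table in this layout can be produced with fixed-width printf fields. The following is a minimal sketch only: the function name print_status_table and the "name=STATUS" argument format are illustrative assumptions, not taken from build_all.sh, and the 60s refresh loop is omitted.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: render a build-status table with fixed-width columns.
# print_status_table and the "name=STATUS" argument format are illustrative
# assumptions, not the PR's implementation.

print_status_table() {
    local rule entry
    # 35 dashes: "| " + 18-char name + " | " + 10-char status + " |"
    rule="$(printf '%35s' '' | tr ' ' '-')"
    echo "${rule}"
    printf '| %-18s | %-10s |\n' "System" "Status"
    echo "${rule}"
    for entry in "$@"; do
        # Split "name=STATUS" at the first '='.
        printf '| %-18s | %-10s |\n' "${entry%%=*}" "${entry#*=}"
    done
    echo "${rule}"
}

print_status_table "gsi_enkf=RUNNING" "nexus=PENDING" "upp=PENDING"
```

Left-justified fields (%-18s, %-10s) keep the column borders aligned as long as no system name exceeds 18 characters.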

Type of change

  • New feature (adds functionality)
  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this change expected to change outputs (e.g. value changes to existing outputs, new files stored in COM, files removed from COM, filename changes, additions/subtractions to archives)? NO
  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? YES (but the feature is not ready for general use as yet)
  • Does this change require an update to any of the following submodules? NO

How has this been tested?

Build was tested locally on Gaea C6.
No experiments were run as the behavior of experiments is not expected to change with this option.
CI should be run to ensure automated builds still go through

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • Any new scripts have been added to the .github/CODEOWNERS file with owners
  • I have made corresponding changes to the system documentation if necessary

@DavidHuber-NOAA
Contributor

DavidHuber-NOAA commented Dec 24, 2025

All tests passed on C6, including compute-node builds. I will note that I will not be surprised if we periodically see out of memory errors from C6 when using build_compute.sh.

@DavidHuber-NOAA
Contributor

Oops. Commented on the wrong PR.

@TravisElless-NOAA
Contributor

Tried running with the -b option from this branch on Ursa for gdas, gsi, and gfs builds (i.e., ./build_compute.sh -A fv3-cpu -b gdas gsi gfs). I get the following output and associated errors when I do:

Sourcing global-workflow modules ...
Running "module reset". Resetting modules to system default. The following $MODULEPATH directories have been removed: None
Generating build.xml for building global-workflow programs on compute nodes ...
Building on head node as requested ...
Launching build command:  >  2>&1
bash: -c: line 1: syntax error near unexpected token `2'
bash: -c: line 1: ` >  2>&1 &'
Launching build command:  >  2>&1
bash: -c: line 1: syntax error near unexpected token `2'
bash: -c: line 1: ` >  2>&1 &'
Launching build command:  >  2>&1
bash: -c: line 1: syntax error near unexpected token `2'
bash: -c: line 1: ` >  2>&1 &'
Launching build command:  >  2>&1
bash: -c: line 1: syntax error near unexpected token `2'
bash: -c: line 1: ` >  2>&1 &'
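
The "Launching build command:  >  2>&1" lines suggest that both the command string and the log path were empty when the launch line was composed, so the shell was handed " >  2>&1 &" and found "2" where a redirection target was expected. A minimal sketch of a guard against that situation follows; launch_build and its two arguments are hypothetical names used for illustration and do not come from build_compute.sh.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; launch_build and its arguments are illustrative names,
# not taken from build_compute.sh.

launch_build() {
    local cmd="$1"
    local log="$2"
    # With both values empty, a composed line would read " >  2>&1 &",
    # which the shell rejects as a syntax error. Fail early instead.
    if [ -z "${cmd}" ] || [ -z "${log}" ]; then
        echo "ERROR: empty build command or log path; refusing to launch" >&2
        return 1
    fi
    echo "Launching build command: ${cmd} > ${log} 2>&1"
    # Redirect outside the command string so the log path is never re-parsed.
    sh -c "${cmd}" > "${log}" 2>&1 &
    echo "Started with PID $!"
}

# An empty command and log path are rejected before anything is launched.
launch_build "" "" || echo "launch skipped"
```

Keeping the redirection outside the string passed to the shell means an unset variable surfaces as an explicit error message rather than as a parse error inside the launched command.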

@aerorahul
Contributor Author

@TravisElless-NOAA
Apologies for rushing this commit before going on leave. I have fixed the errors.

Also, a question for the other reviewers: if this approach is acceptable, do you want me to replace the existing build_all.sh with this build_compute.sh and update the documentation on RTD?

@TravisElless-NOAA
Contributor

I tested the builds using 2ee389b. With the -b option, the build succeeded for the gsi and gfs applications, but it appears to hang when trying to build gdas. I haven't tried other system builds with this yet. Below is what I got when trying to build gdas.

[Travis.J.Elless@ufe01 sorc]$ ./build_compute.sh -A fv3-cpu -b gdas
Sourcing global-workflow modules ...
Running "module reset". Resetting modules to system default. The following $MODULEPATH directories have been removed: None
Generating build.xml for building global-workflow programs ...
Building on head node as requested ...
Build for upp started with PID 838221, using 8 cores.
Build for gsi_utils started with PID 838222, using 6 cores.
Build for gfs_utils started with PID 838223, using 6 cores.
Waiting for builds to complete. Current cores in use: 20/20
BUILD SUCCESS: Build for gsi_utils completed successfully.
BUILD SUCCESS: Build for gfs_utils completed successfully.
Build for gsi_monitor started with PID 846803, using 4 cores.
Waiting for builds to complete. Current cores in use: 12/20
BUILD SUCCESS: Build for upp completed successfully.
Build for ufs_utils started with PID 850330, using 8 cores.
BUILD SUCCESS: Build for gsi_monitor completed successfully.
Waiting for builds to complete. Current cores in use: 8/20
BUILD SUCCESS: Build for ufs_utils completed successfully.
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
Waiting for builds to complete. Current cores in use: 0/20
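
The repeated "0/20" lines show the wait loop polling with nothing running. The thread does not state the cause, but a loop like this can detect the condition and abort rather than wait indefinitely. The sketch below assumes the loop tracks the number of cores in use and the number of builds not yet started; is_stalled and the variable names are hypothetical, and the value pending_count=1 is an assumed reading of the log, not something it reports.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a stall check for a polling wait loop.
# is_stalled and the variables below are illustrative, not from the PR.

# Call after each attempt to launch pending builds.
# Returns 0 (stalled) when nothing is running yet builds are still pending.
is_stalled() {
    local cores_in_use="$1"
    local pending_count="$2"
    [ "${cores_in_use}" -eq 0 ] && [ "${pending_count}" -gt 0 ]
}

# One poll step with assumed values mirroring the log above.
cores_in_use=0
max_cores=20
pending_count=1
echo "Waiting for builds to complete. Current cores in use: ${cores_in_use}/${max_cores}"
if is_stalled "${cores_in_use}" "${pending_count}"; then
    echo "ERROR: ${pending_count} build(s) pending but none running; aborting wait" >&2
fi
```

If no cores are in use after a launch attempt, no running build can ever free resources for the pending ones, so further polling cannot change the outcome.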

@TravisElless-NOAA
Contributor

gdas now builds successfully on Ursa when using 8084876.

DavidHuber-NOAA previously approved these changes Jan 9, 2026
Contributor

@DavidHuber-NOAA DavidHuber-NOAA left a comment

Looks good, just a couple suggested comments.

@emcbot emcbot added CI-Ursa-Ready (CM use only: PR is ready for CI testing on Ursa), CI-Ursa-Building (Bot use only: CI testing is cloning/building on Ursa), and CI-Ursa-Running (Bot use only: CI testing on Ursa for this PR is in-progress), and removed CI-Ursa-Ready and CI-Ursa-Building labels Jan 15, 2026
@aerorahul
Contributor Author

Builds on Ursa have completed in the CI.
From the GitLab CI stdout:

-----------------------------------
| System             | Status     |
-----------------------------------
| upp                | SUCCEEDED  |
| gsi_utils          | SUCCEEDED  |
| gefs_ww3_prepost   | SUCCEEDED  |
| ufs_utils          | SUCCEEDED  |
| gcafs_model        | SUCCEEDED  |
| gfs_ww3prepost     | SUCCEEDED  |
| gdas               | SUCCEEDED  |
| gsi_enkf           | SUCCEEDED  |
| gfs_utils          | SUCCEEDED  |
| sfs_model          | SUCCEEDED  |
| gefs_model         | SUCCEEDED  |
| gsi_monitor        | SUCCEEDED  |
| nexus              | SUCCEEDED  |
| gfs_model          | SUCCEEDED  |
-----------------------------------
All builds completed successfully!

@emcbot emcbot added CI-Ursa-Passed (Bot use only: CI testing on Ursa for this PR has completed successfully) and removed CI-Ursa-Running (Bot use only: CI testing on Ursa for this PR is in-progress) labels Jan 15, 2026
@DavidHuber-NOAA
Contributor

@aerorahul can you merge and resolve conflicts? After that, I think this is good to merge.

@DavidHuber-NOAA
Contributor

I reran builds on C6 and verified gaea61 and gaea67 were not used. Builds ran to completion and the status was maintained throughout the build process. Merging.

@DavidHuber-NOAA DavidHuber-NOAA merged commit 7fae1ac into NOAA-EMC:develop Jan 16, 2026
6 checks passed
@aerorahul aerorahul deleted the feature/build_compute_or_head branch January 16, 2026 21:27
Labels

CI-Ursa-Passed **Bot use only** CI testing on Ursa for this PR has completed successfully

4 participants