Batch timeseries analysis in slurm #120

ledm · 2023-11-24T15:48:47Z

Closes #118

It's looking like this script is working now.

This PR adds a tool that batch-processes single-job timeseries analyses in parallel through the slurm queue.

It's got the following features:

- Loads from existing input_yaml files.
- Runs with a single command.
- Fully parallelises the slowest part of the analysis_comparison tool: the single job analysis.
- Won't submit a jobID that already exists*

*However, if two analyses call the same jobID with different suites (i.e. one has bgc and one has physics), it will only run the first one.
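
The caveat above is what you would expect if the "already exists" check is keyed on the jobID alone. A minimal sketch of that behaviour, next to one possible alternative keyed on the (jobID, suite) pair. The function names, the data layout and the `u-aa001` job ID are all made up for illustration and are not taken from the script in this PR:

```python
def select_by_jobid(requests):
    """Keep only the first request seen for each jobID (the behaviour described above)."""
    seen = set()
    selected = []
    for job_id, suite in requests:
        if job_id in seen:
            continue  # same jobID already selected; the suite is ignored
        seen.add(job_id)
        selected.append((job_id, suite))
    return selected


def select_by_jobid_and_suite(requests):
    """Hypothetical alternative: treat each (jobID, suite) pair as a distinct job."""
    seen = set()
    selected = []
    for job_id, suite in requests:
        key = (job_id, suite)
        if key in seen:
            continue  # exact duplicate of an earlier request
        seen.add(key)
        selected.append(key)
    return selected


# Illustrative requests: the same jobID asked for with two different suites.
requests = [("u-aa001", "bgc"), ("u-aa001", "physics"), ("u-aa001", "bgc")]
print(select_by_jobid(requests))
print(select_by_jobid_and_suite(requests))
```

With the first function the physics request is silently dropped; with the second only the exact duplicate is.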

Need to do:

- Documentation in README.md
- Is it @DrYool proof?

valeriupredoi

hey bud, nice and fast turnaround - I am a bit concerned about the actual submission process (call to subprocess, see my comment) - I'd also think you should add job requirements, no? Also, I'd rename it to batch so and so, since it's not quite parallel-parallel 😁

.gitignore

valeriupredoi · 2023-11-24T16:04:32Z

bgcval2/parrellel_timeseries.py

+        else:
+            # Submit job:
+            print('Submitting:', ' '.join(command_txt))
+            command1 = subprocess.Popen(command_txt)


I'd defo encase this in a try/except with except fishing for some key elements in stderr, or just printing the whole stderr to screen; if you don't pipe stderr as out it'll be hidden, and the user won't know why their jobs have not been submitted when they thought they have
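
A minimal sketch of what that could look like, assuming it is acceptable to wait for the submission command to return. It swaps `subprocess.Popen` for `subprocess.run`, so it is not the code that was merged. The `submit` helper and the stand-in commands are hypothetical; in the real script `command_txt` would be the sbatch command list:

```python
import subprocess
import sys


def submit(command_txt):
    """Run a submission command and return (ok, message)."""
    print('Submitting:', ' '.join(command_txt))
    try:
        result = subprocess.run(
            command_txt, capture_output=True, text=True, check=True)
    except FileNotFoundError as err:
        # The executable (e.g. sbatch) is not on PATH.
        print('Submission failed, command not found:', err, file=sys.stderr)
        return False, str(err)
    except subprocess.CalledProcessError as err:
        # Non-zero exit code: print the stderr that would otherwise be hidden.
        print('Submission failed with exit code', err.returncode, file=sys.stderr)
        print(err.stderr, file=sys.stderr)
        return False, err.stderr
    return True, result.stdout


# Stand-in commands so the sketch runs without a SLURM installation.
ok_good, out_good = submit(
    [sys.executable, '-c', 'print("Submitted batch job 1")'])
ok_bad, out_bad = submit(
    [sys.executable, '-c',
     'import sys; sys.stderr.write("bad option"); sys.exit(1)'])
```

Returning the captured text rather than only printing it also makes the failure path easy to assert on in a test.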

setup.py

valeriupredoi · 2023-11-24T16:13:23Z

oh and also - these things really do deserve a test, not thru and thru with SLURM submission, but everything up to that. I can write the test when it's about ready 👍

valeriupredoi

let's make sure the piping is done correctly; also, have you tried this in practice? We don't need any special env to pass to sbatch do we? Like any special environment variables

ledm · 2023-11-27T15:39:22Z

> let's make sure the piping is done correctly; also, have you tried this in practice? We don't need any special env to pass to sbatch do we? Like any special environment variables

I've been using this for a few days and it works on jasmin. Your amendment to the subprocess call also works.

If batch_timeseries fails you get normal python errors. If it fails inside the sbatch script, then you get error messages in the places that we tell it to fail.

valeriupredoi · 2023-11-27T15:41:32Z

yeh that's how we want it to behave, so stdout can be piped to eg a file. Looks good, bud! Let me write a test for it!

ledm · 2023-11-27T15:43:54Z

I'm not ready to merge. Still need to add documentation & maybe get @DrYool to try it.

README.md

ledm · 2023-11-27T16:00:14Z

The next question I have:

Do we want to make this the default behaviour? This would mean that we run this script from inside analysis_compare when the -s flag is absent, instead of calling the analysis_timeseries command.

ledm · 2023-11-27T16:09:12Z

Basically, the process for adding a new job with input.yml is:

1. analysis_compare -s -y input.yml: This generates the job download commands, which will run overnight. It also creates an html report, but it breaks if there's no data downloaded yet.
2. Wait overnight for data to download on mass.
3. batch_timeseries -y input.yml: This submits the job timeseries onto the processing nodes.
4. analysis_compare -s -y input.yml: This job creates the html report.
5. ./rsync_to_esmeval.sh: This copies the html to the web-visible location on disk.

I suspect that these can be merged into fewer commands!
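
As a starting point for merging them, the steps listed above can be written down as data. This is only a sketch: it builds and prints the command lines in order and deliberately does not run anything, since the overnight wait is not something it can check. The `workflow_commands` helper is hypothetical; the commands and flags are copied from the list above:

```python
import shlex


def workflow_commands(input_yml):
    """Return the commands from the list above, in order, as argument lists."""
    return [
        # Generate the download commands (the report may break at this stage).
        ['analysis_compare', '-s', '-y', input_yml],
        # (wait for the data to download before continuing)
        # Submit the single-job timeseries analyses to the processing nodes.
        ['batch_timeseries', '-y', input_yml],
        # Create the html report.
        ['analysis_compare', '-s', '-y', input_yml],
        # Copy the html to the web-visible location.
        ['./rsync_to_esmeval.sh'],
    ]


for command in workflow_commands('input.yml'):
    print(shlex.join(command))
```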

valeriupredoi · 2023-11-27T16:09:46Z

@ledm I added some test gubbins, make sure to pull or merge so there are no conflictseses

ledm · 2023-11-27T16:43:29Z

The logo in the README points towards a file on the main branch, but of course it's not available yet until this PR is merged.

ledm · 2023-11-28T10:42:39Z

bgcval2/batch_timeseries.py

+    return args
+
+
+def submits_lotus(compare_yml, config_user, dry_run=False):


We're not actually using config_user here, but I'm keeping it for completeness, just in case.

README.md

ledm · 2023-11-28T10:55:42Z

Okay @valeriupredoi, I'm happy with this now.

valeriupredoi · 2023-11-28T11:03:06Z

all good by me too, bud! Go ahead and merge when you good 🍺

ledm added 12 commits November 24, 2023 10:49

working lotus script for single time series

c530c9f

created parrellel_timeseries analysis

5fb37d9

working here.

93a4a1a

Update .gitignore

9a25b7e

working here

testing sbatch

da0f257

working and debuggin.

19465fc

Merge branch 'dev_parrallel_ts' of github.com:valeriupredoi/bgcval2 i…

70f7bee

…nto dev_parrallel_ts

debugs and tests and prints.

b8da78c

Merge remote-tracking branch 'origin/main' into dev_parrallel_ts

89f113a

working fixes

860a6b3

on-going TF runs

b18cb8e

Added end of command print"

7e92e4a

valeriupredoi requested changes Nov 24, 2023

renamed parrallel to batch as per decreed by @valeriupredoi

da23923

valeriupredoi requested changes Nov 27, 2023

bgcval2/batch_timeseries.py

ledm and others added 3 commits November 27, 2023 15:26

Update bgcval2/batch_timeseries.py

431b54d

Co-authored-by: Valeriu Predoi <[email protected]>

local changes.

1825a3b

added command line arguments for output.

489475b

valeriupredoi approved these changes Nov 27, 2023

ledm added 2 commits November 27, 2023 15:55

added documentation.

1c567e9

reverted text deletion.

7780674

ledm commented Nov 27, 2023

README.md

Update README.md

923154b

valeriupredoi added 2 commits November 27, 2023 16:07

add test to new cmd line func

b759b33

add GA call to cmd line func

841eeb5

Merge branch 'dev_parrallel_ts' of https://github.com/valeriupredoi/b…

6a19048

…gcval2 into dev_parrallel_ts

ledm added 2 commits November 27, 2023 16:26

additional doc

07ecc30

Changed logo

7d92af4

ledm changed the title ~~Parrallel timeseries analysis in slurm~~ Batch timeseries analysis in slurm Nov 28, 2023

changed comments

4f82bfd

ledm commented Nov 28, 2023

README.md

Update README.md

31cc3c4

ledm commented Nov 28, 2023

README.md

Update README.md

25b7f9c

ledm commented Nov 28, 2023

README.md

Update README.md

c0aee37

ledm commented Nov 28, 2023

README.md

Update README.md

355c8c4

ledm merged commit b086386 into main Nov 28, 2023
1 check passed
