Add Slurm API logic #50

Open · djperrefort opened this issue Nov 19, 2023 · 9 comments

@djperrefort (Member)
There are placeholder functions in the allocations/tasks.py module for interfacing with Slurm, but the underlying logic is still missing.

@Comeani (Contributor) commented Dec 11, 2023

To give us an idea of what functions we'll need, the table below provides a set of functionality that the bank uses to interact with SLURM, and the corresponding command line tools/syntax we've been wrapping to accomplish that:

| Bank Functionality | SLURM Tool Command |
| --- | --- |
| Gather Cluster Names | `sacctmgr show clusters format=Cluster --noheader --parsable2` |
| Gather Partition Names | `sinfo -M {cluster} -o "%P" --noheader` |
| Check if a SLURM account exists | `sacctmgr -n show assoc account={account_name}` |
| Get the locked state of a cluster | `sacctmgr -n -P show assoc account={account_name} format=GrpTresRunMins clusters={cluster}`, then check whether `cpu=0` is in the output |
| Set the locked state of a cluster | `sacctmgr -i modify account where account={account_name} cluster={cluster} set GrpTresRunMins=cpu=0` (or `-1`) for CPUs, and `sacctmgr -i modify account where account={account_name} cluster={cluster} set GrpTresRunMins=gres/gpu=0` (or `-1`) for GPUs |
| Get usage per user on a cluster | `sreport cluster AccountUtilizationByUser -Pn -T Billing -t Hours cluster={cluster} Account={account_name} start={start.strftime('%Y-%m-%d')} end={end.strftime('%Y-%m-%d')} format=Proper,Used` |
| Get the total usage on a cluster | Same command as the per-user usage, just summed over the users |
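For reference, here's a minimal Python sketch of what wrapping the first and third commands with `subprocess` might look like; the function names and return types are illustrative, not the actual `allocations/tasks.py` interface:

```python
from subprocess import run


def get_cluster_names() -> list[str]:
    """Return cluster names reported by sacctmgr (one name per line with --parsable2)."""
    result = run(
        ["sacctmgr", "show", "clusters", "format=Cluster", "--noheader", "--parsable2"],
        capture_output=True, text=True, check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def slurm_account_exists(account_name: str) -> bool:
    """Return True if sacctmgr reports at least one association for the account."""
    result = run(
        ["sacctmgr", "-n", "show", "assoc", f"account={account_name}"],
        capture_output=True, text=True, check=True,
    )
    return bool(result.stdout.strip())
```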

I haven't spent a significant amount of time looking at the API specification yet to verify how many of these are already available to us in that format.

Some questions/comments that came up while I was looking into this:

  • If we are already going to be setting GrpTresRunMins for locking/unlocking, we should consider just using GrpTresRunMins to track the SUs remaining and let SLURM keep count, instead of an ad hoc account status update (a rough sketch of a lock/unlock wrapper follows this list).

  • sreport only lists users with more than 1 hour of TRES usage; users below that threshold are omitted entirely. The reported user list may therefore be only a subset of the full association:

[root@moss ~]# sacctmgr show association Account=sam where cluster=smp format=User
      User 
---------- 
           
  bmooreii 
     chx33 
     djp81 
  fangping 
      fis7 
   gnowmik 
      jar7 
     ketan 
   kimwong 
    leb140 
    mak189 
     nlc60 
    sak236 
    shs159 
     yak73

We may need to fetch the full user list associated with a SLURM allocation separately, like the above, and attribute usage only to the users who actually have it in a "usage per user" overview of the allocations in keystone.

  • In some video content I found demoing the API, Nick Ihli explains that the API is intended for distributed systems, not websites, so API calls don't use HTTPS by default and should be wrapped in TLS accordingly.

  • In the same video, Nick explains that much of the functionality for interacting with associations and job accounting is available in the API, but that roll-up report generation (sreport) is not yet present. The video is from Jan 27, 2022, so it's possible they've been working on this since then (I can't remember if he explicitly mentioned its status at SC). We may also be able to piece the information together from individual job accounting over a time window for all users in a specific association.
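To illustrate the lock/unlock row from the table above, here is a rough sketch of toggling the CPU limit with `subprocess`; the function name is hypothetical and not part of the existing codebase:

```python
from subprocess import run


def set_cpu_lock_state(account_name: str, cluster: str, locked: bool) -> None:
    """Set GrpTresRunMins for CPUs: 0 locks the account on the cluster, -1 clears the limit."""
    limit = 0 if locked else -1
    run(
        [
            "sacctmgr", "-i", "modify", "account", "where",
            f"account={account_name}", f"cluster={cluster}",
            "set", f"GrpTresRunMins=cpu={limit}",
        ],
        check=True,
    )
```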

@Comeani (Contributor) commented Dec 11, 2023

Slides from the most recent SLURM User Group meeting talk on the state of the REST API: https://slurm.schedmd.com/SLUG23/REST-API-SLUG23.pdf
(no mention of sreport ☹️)

@chnixi (Contributor) commented Dec 12, 2023

> Slides from the most recent SLURM User Group meeting talk on the state of the REST API: https://slurm.schedmd.com/SLUG23/REST-API-SLUG23.pdf (no mention of sreport ☹️)

We could just use sacctmgr show to display the information instead of sreport?

@Comeani (Contributor) commented Dec 12, 2023

The sacctmgr functionality can show you all the information about associations, the users within them, and any properties/limits that have been imposed. It does not interact with job accounting at all.

A possible workaround is to use the API to gather information equivalent to sacct, which shows individual job accounting, and build custom logic around it to sum usage for each user over the duration of their group's proposal:

[nlc60@moss sam_scripts] main : sacct -p -M smp -X -u chx33 -S 2023-04-26 --format=AllocTres | grep billing
billing=9,cpu=12,mem=48216M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=3,cpu=4,mem=1G,node=1|
billing=9,cpu=12,mem=1G,node=1|
billing=9,cpu=12,mem=1G,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=1,cpu=2,mem=1G,node=1|
billing=25,cpu=32,mem=1G,node=1|
billing=25,cpu=32,mem=1G,node=1|
billing=25,cpu=32,mem=1G,node=1|
billing=25,cpu=32,mem=1G,node=1|
billing=25,cpu=32,mem=1G,node=1|
billing=1,cpu=2,mem=12G,node=1|
billing=25,cpu=32,mem=128576M,node=1|
billing=25,cpu=32,mem=128576M,node=1|
billing=3,cpu=4,mem=16072M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=11,cpu=14,mem=56252M,node=1|
billing=12,cpu=15,mem=60270M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=1G,node=1|
billing=25,cpu=32,mem=1G,node=1|
billing=19,cpu=24,mem=1G,node=1|
billing=12,cpu=16,mem=1G,node=1|
billing=19,cpu=24,mem=96432M,node=1|
billing=25,cpu=32,mem=128576M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=15,mem=60270M,node=1|
billing=14,cpu=18,mem=72324M,node=1|
billing=12,cpu=15,mem=60270M,node=1|
billing=14,cpu=18,mem=72324M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=15,mem=60270M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=19,cpu=24,mem=96432M,node=1|
billing=25,cpu=32,mem=128576M,node=1|
billing=12,cpu=16,mem=1G,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=25,cpu=32,mem=128576M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=8,mem=120G,node=1|
billing=3,cpu=4,mem=1G,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=8,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=25,cpu=32,mem=128576M,node=1|
billing=25,cpu=32,mem=128576M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=1,cpu=2,mem=8036M,node=1|
billing=3,cpu=4,mem=16072M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=3,cpu=4,mem=16072M,node=1|
billing=3,cpu=4,mem=16072M,node=1|
billing=3,cpu=4,mem=16072M,node=1|
billing=3,cpu=4,mem=16072M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=3,cpu=4,mem=16072M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=3,cpu=4,mem=16072M,node=1|
billing=6,cpu=8,mem=32144M,node=1|
billing=1,cpu=2,mem=8036M,node=1|
billing=3,cpu=4,mem=16072M,node=1|
billing=3,cpu=4,mem=16072M,node=1|
billing=3,cpu=4,mem=16072M,node=1|
billing=51,cpu=64,mem=257152M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=3,cpu=4,mem=1G,node=1|
billing=3,cpu=4,mem=1G,node=1|
billing=3,cpu=4,mem=1G,node=1|
billing=3,cpu=4,mem=1G,node=1|
billing=3,cpu=4,mem=1G,node=1|
billing=3,cpu=4,mem=1G,node=1|
billing=3,cpu=4,mem=1G,node=1|
billing=25,cpu=32,mem=1G,node=1|
billing=6,cpu=8,mem=1G,node=1|
billing=6,cpu=8,mem=1G,node=1|
billing=25,cpu=32,mem=128576M,node=1|
billing=3,cpu=4,mem=16072M,node=1|
billing=1,cpu=2,mem=8036M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=4,cpu=6,mem=1G,node=1|
billing=19,cpu=24,mem=1G,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=51,cpu=64,mem=1G,node=1|
billing=3,cpu=4,mem=1G,node=1|
billing=3,cpu=4,mem=16072M,node=1|
billing=3,cpu=4,mem=16072M,node=1|
billing=9,cpu=12,mem=48216M,node=1|
billing=3,cpu=4,mem=1G,node=1|
billing=3,cpu=4,mem=1G,node=1|
billing=3,cpu=4,mem=1G,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=22,cpu=28,mem=112504M,node=1|
billing=22,cpu=28,mem=112504M,node=1|
billing=22,cpu=28,mem=112504M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=12,cpu=16,mem=64288M,node=1|
billing=25,cpu=32,mem=128576M,node=1|
billing=38,cpu=48,mem=192864M,node=1|
billing=51,cpu=64,mem=257152M,node=1|
billing=26,cpu=32,mem=256G,node=1|
billing=26,cpu=12,mem=256G,node=1|
billing=48,cpu=32,mem=1T,node=1|
billing=25,cpu=32,mem=128576M,node=1|
billing=25,cpu=32,mem=128576M,node=1|
billing=1,cpu=2,mem=8036M,node=1|
billing=1,cpu=2,mem=8036M,node=1|
billing=1,cpu=2,mem=8036M,node=1|
billing=26,cpu=12,mem=256G,node=1|
billing=26,cpu=12,mem=256G,node=1|
billing=26,cpu=12,mem=256G,node=1|
billing=26,cpu=12,mem=256G,node=1|
billing=26,cpu=12,mem=256G,node=1|
billing=26,cpu=12,mem=256G,node=1|
billing=6,cpu=8,mem=12G,node=1|
billing=26,cpu=12,mem=256G,node=1|
billing=4,cpu=6,mem=1G,node=1|

I believe we'd have to sum the billing numbers here to get usage in TRES hours since the start of the proposal (although this could include cancelled jobs; I haven't verified against the sreport output yet), and then sum across users to get the cluster total for the group.
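As a rough sketch of that summation, assuming the pipe-delimited `sacct -p` output shown above (this sums the allocated billing value per job and nothing more):

```python
def sum_billing(sacct_lines: list[str]) -> int:
    """Sum the billing= values from `sacct -p ... --format=AllocTres` output lines."""
    total = 0
    for line in sacct_lines:
        for field in line.rstrip("|").split(","):
            if field.startswith("billing="):
                total += int(field.split("=", 1)[1])
    return total


# e.g. sum_billing(["billing=9,cpu=12,mem=48216M,node=1|", "billing=6,cpu=8,mem=32144M,node=1|"]) == 15
```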

The ideal would be SchedMD incorporating the rollup stats into the API; we'd then simply display the Used column and sum it to get the cluster total.

[nlc60@moss ~] : sreport cluster AccountUtilizationByUser -T Billing -t Hours cluster=smp Account=sam start=2023-04-26
--------------------------------------------------------------------------------
Cluster/Account/User Utilization 2023-04-26T00:00:00 - 2023-12-11T23:59:59 (19875600 secs)
Usage reported in TRES Hours
--------------------------------------------------------------------------------
  Cluster         Account     Login     Proper Name      TRES Name     Used 
--------- --------------- --------- --------------- -------------- -------- 
      smp             sam                                  billing     1212 
      smp             sam     chx33           chx33        billing      816 
      smp             sam     djp81           djp81        billing        0 
      smp             sam  fangping        fangping        billing        4 
      smp             sam      jar7            jar7        billing        0 
      smp             sam   kimwong         kimwong        billing       41 
      smp             sam    leb140          leb140        billing      201 
      smp             sam     nlc60           nlc60        billing        0 
      smp             sam    sak236          sak236        billing        0 
      smp             sam     yak73           yak73        billing      150
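If the per-user sreport command from the earlier table is run with `-Pn` (parsable, no header), each row should come out as `proper_name|used_hours`; a rough parsing sketch, with the row shape assumed from the output above:

```python
def parse_user_usage(sreport_lines: list[str]) -> dict[str, int]:
    """Map each user's proper name to used TRES hours; skip the account-total row (empty name)."""
    usage = {}
    for line in sreport_lines:
        name, used = line.strip().split("|")
        if name:
            usage[name] = int(used)
    return usage
```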

@Comeani (Contributor) commented Dec 12, 2023

Grepping and awking my way to a sum of those values after billing= does not yield 816 billing hours... We'd need to do some further digging to see how sreport arrives at its total.

@djperrefort (Member Author)

If we use trackable resource limits to enforce account usage limits, the following equation gives the new resource limit when a proposal expires (a small sketch follows the definitions below):

new_limit = old_limit - min(0, proposal_sus + historical_usage + initial_usage - total_usage)

Where:

  • proposal_sus is the number of resources awarded by the expiring proposal
  • historical_usage is the total number of SUs used under previous (already expired) proposals
  • initial_usage is the number of resources used by the account before the application was deployed
  • total_usage is the total resources used by the account
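A minimal sketch of that adjustment, using the formula exactly as written above (all values in service units):

```python
def updated_limit(old_limit, proposal_sus, historical_usage, initial_usage, total_usage):
    """Compute the new resource limit when a proposal expires, per the formula above."""
    return old_limit - min(0, proposal_sus + historical_usage + initial_usage - total_usage)
```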

@Comeani (Contributor) commented Jan 23, 2024

| Bank Functionality | SLURM Tool Command | Corresponding REST API Endpoint |
| --- | --- | --- |
| Gather Cluster Names | `sacctmgr show clusters format=Cluster --noheader --parsable2` | `/slurmdb/v0.0.38/clusters` |
| Check if a SLURM account exists | `sacctmgr -n show assoc account={account_name}` | `/slurmdb/v0.0.40/associations` |
| Get the GrpTresRunMins limit for an account on a cluster | `sacctmgr -n -P show assoc account={account_name} format=GrpTresRunMins clusters={cluster}`, then check whether `cpu=0` is in the output | `/slurmdb/v0.0.38/associations` |
| Set the locked state of a cluster | `sacctmgr -i modify account where account={account_name} cluster={cluster} set GrpTresRunMins=cpu=0` (or `-1`) for CPUs, and `sacctmgr -i modify account where account={account_name} cluster={cluster} set GrpTresRunMins=gres/gpu=0` (or `-1`) for GPUs | `/slurmdb/v0.0.38/associations` |
| Get usage per user on a cluster | `sreport cluster AccountUtilizationByUser -Pn -T Billing -t Hours cluster={cluster} Account={account_name} start={start.strftime('%Y-%m-%d')} end={end.strftime('%Y-%m-%d')} format=Proper,Used` | TODO: see if the associations endpoint can provide this somewhere with non-null usage (teach?) |
| Get the total usage on a cluster | Same command as the per-user usage, just summed over the users | TODO: see if the associations endpoint can provide this somewhere with non-null usage (teach?) |
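For illustration, a rough sketch of hitting the clusters endpoint above with `requests`; the base URL is hypothetical, and the `X-SLURM-USER-NAME`/`X-SLURM-USER-TOKEN` headers and response shape assume a typical slurmrestd JWT-authentication setup:

```python
import requests

BASE_URL = "https://slurm-api.example.edu"  # hypothetical TLS-terminating proxy in front of slurmrestd


def get_clusters(user: str, token: str) -> list[str]:
    """Return cluster names from the slurmdb clusters endpoint (response shape is an assumption)."""
    response = requests.get(
        f"{BASE_URL}/slurmdb/v0.0.38/clusters",
        headers={"X-SLURM-USER-NAME": user, "X-SLURM-USER-TOKEN": token},
        timeout=30,
    )
    response.raise_for_status()
    return [cluster["name"] for cluster in response.json().get("clusters", [])]
```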

@djperrefort (Member Author)

Looks like the Slurm API isn't mature enough for our needs yet. See #140 for a temporary workaround. Moving this to backlog until the API is further along.

@Comeani Comeani added the enhancement New feature or request label Mar 5, 2024
@djperrefort djperrefort removed the enhancement New feature or request label Mar 27, 2024