@tat-ohmura
Contributor

This is part of our effort to integrate PSI/J into Open OnDemand (URL: https://openondemand.org/), a web portal for HPC systems.
Currently, Open OnDemand maintains an adapter (backend) for each scheduler, leading to increased maintenance costs.
We are planning to utilize PSI/J to abstract different schedulers and create a single adapter for all schedulers supported by PSI/J.
Open OnDemand requires APIs for hold, release, and info operations, in addition to job submission and deletion already supported by PSI/J. We have implemented these methods as follows:

  • hold: Suspend a pending job
  • release: Resume a suspended job
  • info: Query job information

Once this PR is approved, we will update the documentation accordingly. We would appreciate any feedback.

@hategan
Collaborator

hategan commented May 28, 2025

Hi and thank you for the PR. I am adding @andre-merzky to the discussion.

This is indeed something that makes good sense and I believe it to be within the scope of PSI/J. Hold/release were not part of the initial design because we mostly focused on what we perceived automation to be, namely workflows. As far as I can tell without doing a full review, the code looks clean and nicely follows the existing codebase in style and organization.

I think it is likely that info/JobInfo would need a bit of discussion due to some overlapping functionality with the existing code. Specifically, the walltime, various state transition times, the node list, etc. are already available in other places, although not as nicely aggregated. We would also want to ensure that a potential info() can be somewhat uniformly implemented over all schedulers.

So I'll add a few research tasks here that we should probably work out. Some likely have obvious answers, but it's probably a good idea to have them listed out anyway.

  • Can we reasonably implement hold/release over the entire range of batch schedulers?
  • For executors where hold/release don't apply (e.g., local), do we have a graceful way of saying "Not implemented/does not apply"?
  • Should hold/release be added to the specification (https://github.com/ExaWorks/job-api-spec)?
  • Can we aggregate JobInfo-like information from existing PSI/J sources and is the available information sufficient to satisfy OOD's requirements?
    • If anything is missing, can it be reasonably added?
  • Would JobInfo be a better option than some of the more ad-hoc mechanisms in PSI/J for reporting job status?

Let's try to answer these and go from there.

@tat-ohmura
Contributor Author

Thank you for your prompt response. I'm happy to be able to discuss this with you.

I believe we need to consider the use cases for how hold and release can be utilized within workflow automation. At the very least, hold and release operations are necessary for OOD, but I would also like to investigate whether other workflow tools have similar use cases for these functions.

I would also like to discuss info/JobInfo. Monitoring not only job statuses but also real-time job information, such as CPU usage, can be useful for verifying job health. Since some of this information overlaps, I think we need to organize it properly. Regarding info, in cases where monitoring needs to be done separately from the job submission process, the current system does not seem to provide all the necessary data. Since job schedulers retain submission-time information, I was considering them as a way to query and update job details. At least for OOD, this was essential.

I will look into the research tasks raised.
I appreciate your support and look forward to continued collaboration.

@hategan
Collaborator

hategan commented May 30, 2025

[...]

I believe we need to consider the use cases for how hold and release can be utilized within workflow automation. At the very least, hold and release operations are necessary for OOD, but I would also like to investigate whether other workflow tools have similar use cases for these functions.

@andre-merzky can weigh in on this, but I think that hold/release are a reasonable part of interacting with a scheduler and should be included. We'll need to do some work on our end beyond this PR, but that can be done separately.

I would also like to discuss info/JobInfo. Monitoring not only job statuses but also real-time job information, such as CPU usage, can be useful for verifying job health. Since some of this information overlaps, I think we need to organize it properly. Regarding info, in cases where monitoring needs to be done separately from the job submission process, the current system does not seem to provide all the necessary data. Since job schedulers retain submission-time information, I was considering them as a way to query and update job details. At least for OOD, this was essential.

These are indeed useful. However, real-time CPU usage would be beyond the scope of PSI/J. There are two reasons for this:

  1. There is no simple way to get this information on arbitrary machines
  2. Adding a mechanism to deal with live information coming from running jobs involves complexity that would make PSI/J difficult to maintain. This is quite important given that getting financial support for infrastructure projects like PSI/J is difficult. It is also important because PSI/J depends on contributions from users with access to machines that we do not have access to (NQSV is a perfect example), and we want to ensure that we do not make these contributions too complex.

Perhaps we could start by listing exactly what information is needed by OOD, and then we can see if there is a way to implement a solution that can be layered on top of PSI/J rather than within.

I will look into the research tasks raised. I appreciate your support and look forward to continued collaboration.

And thank you for your input and contributions.

@tat-ohmura
Contributor Author

My sincere apologies for the delay in responding to your comments.
Below are my responses to the research tasks.

  • Can we reasonably implement hold/release over the entire range of batch schedulers?
    I investigated the availability of hold/release functionality for the job schedulers currently supported by PSI/J.
    Some job schedulers support hold/release functionality, while others do not.

| Job Scheduler | Hold Command | Release Command | Supported |
|---|---|---|---|
| Cobalt | | | Not supported |
| LSF | bstop | bresume | Supported |
| PBS (OpenPBS / PBS Pro / TORQUE) | qhold | qrls | Supported |
| Slurm | scontrol hold | scontrol release | Supported |
| NQSV (NEC Network Queuing System V) | qhold / qsig | qrls / qsig | Supported |
| Flux | | | Not supported |
| RADICAL Pilot system | | | Not supported |

  • For executors where hold/release don't apply (e.g., local), do we have a graceful way of saying "Not implemented/does not apply"?

Yes, we need one, because there are schedulers where hold/release is not supported.
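
As an illustration of how the findings above might be combined with a graceful "not supported" response, here is a minimal sketch. It is not the code in this PR: the exception name `OperationNotSupported`, the function `build_command`, and the lower-case scheduler keys are all hypothetical, and only the command names are taken from the table above (for NQSV, only the `qhold` / `qrls` variant is shown).

```python
from typing import Dict, List, Optional, Tuple

# Hold/release commands as reported in the table above.
# None marks executors for which hold/release was reported as not supported.
# The dictionary keys are hypothetical labels chosen for this sketch.
HOLD_RELEASE_COMMANDS: Dict[str, Optional[Tuple[List[str], List[str]]]] = {
    "lsf": (["bstop"], ["bresume"]),
    "pbs": (["qhold"], ["qrls"]),
    "slurm": (["scontrol", "hold"], ["scontrol", "release"]),
    "nqsv": (["qhold"], ["qrls"]),
    "cobalt": None,
    "flux": None,
    "rp": None,
    "local": None,
}


class OperationNotSupported(Exception):
    """Hypothetical exception signalling that an operation does not apply."""


def build_command(scheduler: str, operation: str, native_id: str) -> List[str]:
    """Return the argument list that would hold or release a job."""
    if operation not in ("hold", "release"):
        raise ValueError(f"unknown operation: {operation}")
    entry = HOLD_RELEASE_COMMANDS.get(scheduler)
    if entry is None:
        # Covers both unsupported and unknown executors.
        raise OperationNotSupported(
            f"{operation} is not supported by the '{scheduler}' executor"
        )
    hold_cmd, release_cmd = entry
    base = hold_cmd if operation == "hold" else release_cmd
    return base + [native_id]
```

A caller such as an OOD adapter could catch `OperationNotSupported` and disable the corresponding button instead of treating the failure as a scheduler error.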

I’m about to submit a pull request. I will add the following specifications.

  • Can we aggregate JobInfo-like information from existing PSI/J sources and is the available information sufficient to satisfy OOD's requirements?

When displaying the job list in OOD, the following information is required.
"None" indicates that the data cannot currently be retrieved from the PSI/J source.

| OOD Attribute | PSI/J Source |
|---|---|
| Job ID | Job.native_id |
| Job Name | JobSpec.name |
| User | None |
| Account | JobAttributes.account or JobAttributes.project_name |
| Partition | JobAttributes.queue |
| State | Job.status |
| Total CPUs | ResourceSpecV1.cpu_cores_per_process * process_count |
| CPU Time | None |
| Time Limit | JobAttributes.duration |
| Time Used | now - JobStatus.time (ACTIVE state only) |
| Start Time | JobStatus.time (ACTIVE state only) |

  • If anything is missing, can it be reasonably added?

User could be supported by adding a user (owner) field to JobAttributes.
Some job schedulers support submitting jobs under a different user (e.g., Slurm’s -u <user_name> option), so I think it’s reasonable to include it.
CPU Time is not feasible, as we cannot represent the CPU time of a running job.
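
To make the mapping above concrete, here is a minimal sketch of how a JobInfo-like record could be assembled from values PSI/J already exposes. It deliberately avoids importing PSI/J: `JobSnapshot`, `JobInfo`, and `aggregate` are hypothetical names for this illustration, and the snapshot fields are stand-ins for the PSI/J sources named in the table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class JobSnapshot:
    """Hypothetical stand-in for the PSI/J values listed in the table."""
    native_id: Optional[str]         # Job.native_id
    name: Optional[str]              # JobSpec.name
    account: Optional[str]           # JobAttributes.account
    queue: Optional[str]             # JobAttributes.queue
    state: str                       # name of the state in Job.status
    status_time: Optional[datetime]  # JobStatus.time
    cpu_cores_per_process: int       # ResourceSpecV1.cpu_cores_per_process
    process_count: int               # ResourceSpecV1.process_count
    duration: Optional[timedelta]    # JobAttributes.duration


@dataclass
class JobInfo:
    """Hypothetical aggregate covering the attributes OOD displays."""
    job_id: Optional[str]
    job_name: Optional[str]
    account: Optional[str]
    partition: Optional[str]
    state: str
    total_cpus: int
    time_limit: Optional[timedelta]
    start_time: Optional[datetime]
    time_used: Optional[timedelta]
    user: Optional[str] = None            # no PSI/J source at present
    cpu_time: Optional[timedelta] = None  # no PSI/J source at present


def aggregate(snapshot: JobSnapshot, now: datetime) -> JobInfo:
    """Build a JobInfo from a snapshot, following the table's rules."""
    # Start time and time used are only defined while the job is ACTIVE.
    active = snapshot.state == "ACTIVE" and snapshot.status_time is not None
    start_time = snapshot.status_time if active else None
    time_used = (now - snapshot.status_time) if active else None
    return JobInfo(
        job_id=snapshot.native_id,
        job_name=snapshot.name,
        account=snapshot.account,
        partition=snapshot.queue,
        state=snapshot.state,
        total_cpus=snapshot.cpu_cores_per_process * snapshot.process_count,
        time_limit=snapshot.duration,
        start_time=start_time,
        time_used=time_used,
    )
```

In this sketch, User and CPU Time stay `None`, matching the two gaps identified above.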

  • Would JobInfo be a better option than some of the more ad-hoc mechanisms in PSI/J for reporting job status?

Yes. Real-time information may be outside the scope of PSI/J, but because JobInfo can report values as they are at execution time, I believe it is the better option.
