support for OSG #1

Open
2 of 3 tasks
alee opened this issue May 27, 2022 · 2 comments
Comments

alee commented May 27, 2022

The BehaviorSpace experiment should be split into multiple smaller jobs instead of running as one enormous parameter sweep.

https://support.opensciencegrid.org/support/solutions/articles/5000632058-computation-on-the-open-science-pool
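
One way to split a sweep into pieces is to enumerate every parameter combination and group them into fixed-size chunks, with each chunk submitted as its own job. The following is a minimal sketch of that idea only. The parameter names, values, and chunk size are hypothetical and are not taken from the actual BehaviorSpace experiment.

```python
from itertools import islice, product


def chunked_sweep(parameters, chunk_size):
    """Yield lists of parameter combinations, each with at most chunk_size entries."""
    names = list(parameters)
    # Lazily build one dict per combination in the full factorial sweep.
    combos = (
        dict(zip(names, values))
        for values in product(*(parameters[name] for name in names))
    )
    while True:
        chunk = list(islice(combos, chunk_size))
        if not chunk:
            return
        yield chunk


# Hypothetical sweep: 3 x 4 x 5 = 60 combinations in total.
sweep = {
    "population": [100, 200, 300],
    "growth_rate": [0.1, 0.2, 0.3, 0.4],
    "random_seed": [1, 2, 3, 4, 5],
}

chunks = list(chunked_sweep(sweep, chunk_size=8))
for index, chunk in enumerate(chunks):
    # Each chunk would become one smaller experiment, submitted as a separate job.
    print(f"job {index}: {len(chunk)} runs")
```

Smaller chunks keep the output of each job, and so its disk footprint, bounded. The tradeoff is more jobs to submit and more results to merge afterwards.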

alee commented Jul 7, 2022

Getting a disk usage error despite requesting:

# Job requirements - make sure we're running on a Singularity enabled node with enough resources to execute our code
Requirements = HAS_SINGULARITY == True && OSG_HOST_KERNEL_VERSION >= 31000
request_cpus = 2
request_memory = 16 GB
request_disk = 50 GB

Error log:

007 (22905708.000.000) 2022-07-07 00:39:29 Shadow exception!
        Error from slot1_3@GP-ARGO-astate-backfill-a7a8bef66d28: disk usage exceeded request_disk
        0  -  Run Bytes Sent By Job
        1510  -  Run Bytes Received By Job

After requesting 500GB the job was still held:

...
012 (22909717.000.000) 2022-07-07 05:50:42 Job was held.
        Job in status 2 put on hold by SYSTEM_PERIODIC_HOLD due to disk usage 204445924.
        Code 26 Subcode 0
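
To put the hold message in context, the sketch below converts the reported figure to GiB and compares it with both requests. It assumes the number is in KiB, as HTCondor's `DiskUsage` job attribute is, and that a `GB` suffix on `request_disk` is read as a multiple of 1024. Neither is confirmed by the log itself.

```python
KIB_PER_GIB = 1024 ** 2


def kib_to_gib(kib):
    """Convert a size in KiB to GiB."""
    return kib / KIB_PER_GIB


# Figure from the hold message, assumed to be in KiB.
held_usage_kib = 204445924
usage_gib = kib_to_gib(held_usage_kib)

# The two request_disk values tried in this thread, treated as GiB.
requests_gib = {"request_disk = 50 GB": 50, "request_disk = 500 GB": 500}

print(f"reported disk usage: {usage_gib:.1f} GiB")
for label, limit in requests_gib.items():
    status = "exceeded" if usage_gib > limit else "not exceeded"
    print(f"{label}: {status}")
```

Under those assumptions the job was using roughly 195 GiB, well above the first request but below the second. That would suggest the second hold came from a limit configured in `SYSTEM_PERIODIC_HOLD` rather than from the job's own `request_disk`, though the log alone does not confirm this.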
