Exceeding memory for Express_Run387696_StreamHLTMonitor merge #46594

Open
mmusich opened this issue Nov 4, 2024 · 5 comments

mmusich commented Nov 4, 2024

This issue mirrors a cmsTalk post.

We have an express merge job that exceeds the 2 GB memory limit for the Express_Run387696_StreamHLTMonitor workflow.
Run 387696 is very long (17 h 48 min, ~2750 LS). Such issues can arise for long runs, but this one should still be investigated.

The tarball can be found here:

/eos/user/c/cmst0/public/PausedJobs/Run2024J/maxMemory/Express_Run387696_StreamHLTMonitor

Marco (as ORM)

cmsbuild commented Nov 4, 2024

cms-bot internal usage

cmsbuild commented Nov 4, 2024

A new Issue was created by @mmusich.

@Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

makortel commented Nov 6, 2024

assign dqm

cmsbuild commented Nov 6, 2024

New categories assigned: dqm

@antoniovagnerini,@rseidita you have been requested to review this Pull request/Issue and eventually sign? Thanks

makortel commented Nov 6, 2024

Based on the log, the job is effectively doing DQM harvesting. It opened 1192 files before being signaled to shut down.

We have seen before that harvesting memory requirements scale with the number of input files (e.g. #38976, https://cms-talk.web.cern.ch/t/re-2018-replay-for-pps-pcl-test-dqm-expresmergewrite-memory-too-high/6114/8).
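
To put the numbers above in perspective, here is a rough back-of-envelope sketch of how little memory each input file may add before the limit is reached. It assumes strictly linear growth with the number of files and, by default, no fixed baseline footprint. Both are simplifying assumptions and not something measured from the job log, and the helper name `per_file_budget_mib` is purely illustrative.

```python
# Back-of-envelope sketch: linear scaling and a zero baseline are assumptions.
MEMORY_LIMIT_BYTES = 2 * 1024**3  # the 2 GB limit from the report, taken here as 2 GiB
FILES_OPENED = 1192               # files opened before the job was signaled


def per_file_budget_mib(limit_bytes: int, n_files: int, baseline_bytes: int = 0) -> float:
    """Average memory (MiB) each input file may add if growth is linear."""
    if n_files <= 0:
        raise ValueError("n_files must be positive")
    return (limit_bytes - baseline_bytes) / n_files / 1024**2


budget = per_file_budget_mib(MEMORY_LIMIT_BYTES, FILES_OPENED)
print(f"~{budget:.2f} MiB per input file")
```

Under these assumptions the budget is under 2 MiB per file, and any fixed baseline (passed as `baseline_bytes`) shrinks it further, so a run with this many input files can plausibly cross the limit even if each file contributes very little.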
