Segfault in TestDQMOnlineClient-visualization #42093
assign core FYI @cms-sw/dqm-l2 |
A new Issue was created by @makortel Matti Kortelainen. @Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
New categories assigned: core @Dr15Jones,@smuzaffar,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks |
We updated TBB to 2021.9.0 in CMSSW_13_2_X_2023-06-20-2300 (with cms-sw/cmsdist#8553), it may be related. |
Another one in CMSSW_13_3_X_2023-09-05-1100 on slc7_amd64_gcc11
|
Another one in CMSSW_14_1_X_2024-02-21-2300 on
|
Hello, new segfault occurrence in CMSSW_14_1_X_2024-07-17-2300:
|
New occurrence: CMSSW_14_1_X_2024-08-12-1100
|
Another occurrence: el8_amd64_gcc12/CMSSW_14_1_MULTIARCHS_X_2024-08-18-0000
Doesn't reproduce when run locally |
#42093 (comment) was run on cmsbuild926 (a 16-core AlmaLinux9 node). Any idea why it is using 26 threads? I see that when this test was running/failed, the system log contains message [a]. Note that scram was running 16 unit tests in parallel, so if some unit tests start using multiple threads, that can overload the system and hit system limits. Is there any way to configure this unit test to not run multi-threaded? [a]
|
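For a rough sense of scale, the oversubscription described above can be worked out directly. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and it assumes the worst case in which all 16 concurrently running unit tests use 26 threads each, which the thread does not confirm.

```python
def oversubscription(cores, parallel_tests, threads_per_test):
    """Return (total threads, threads per core) for a hypothetical worst case."""
    total_threads = parallel_tests * threads_per_test
    return total_threads, total_threads / cores


# Numbers from the comment above: 16-core node, 16 unit tests run in parallel
# by scram, 26 threads seen in the failing test. Assumption: every test
# behaves like the failing one.
total, per_core = oversubscription(cores=16, parallel_tests=16, threads_per_test=26)
print(f"{total} threads on 16 cores ({per_core:.0f} per core)")
```

In practice most unit tests are probably lighter than this, so the real load would likely sit well below the worst case, but even a few multi-threaded tests running together could push a node toward its limits.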
Looking at the traceback, the threads are
|
Two new occurrences: CMSSW_14_2_DEVEL_X_2024-10-03-2300, CMSSW_14_2_GEANT4_X_2024-10-02-2300. |
On an unrelated note: the DBG_X log file for DQM/Integration unit tests is over 11M lines long (compared to 26k+ lines for non-DBG builds), seems a bit too verbose to me. |
I saw too many |
Still happens: el8_amd64_gcc12/CMSSW_14_2_NONLTO_X_2024-11-12-2300 |
Crashes in #46685 (comment)
Also here threads 6 and 1 show
|
The crash itself seems to happen always in
(but I have a hard time understanding what exactly that means). @smuzaffar Maybe we could consider updating oneTBB once |
Occurred in #46769 (comment)
|
Is it possible an exception went off and we aren't properly catching it? |
The test TestDQMOnlineClient-visualization crashed in CMSSW_13_2_NONLTO_X_2023-06-24-1100 on el8_amd64_gcc11 with https://cmssdt.cern.ch/SDT/cgi-bin/logreader/el8_amd64_gcc11/CMSSW_13_2_NONLTO_X_2023-06-24-1100/unitTestLogs/DQM/Integration#/