Core-sharded-on-disk (CSOD) files are generally analyzed in thread-sharded scheduler mode. The analysis tools still see core-sharded data; the scheduler just doesn't try to re-schedule the already-scheduled file.
PR #7042 is making it a fatal error to re-schedule a CSOD file as that almost always indicates user error. We'd like to improve this in 2 ways:
- Detect this in the scheduler itself, to cover uses beyond analyzers.
- Automatically run thread-sharded for a core-sharded-preferred tool on a CSOD trace, instead of the user hitting the fatal error and having to re-run with -no_core_sharded.
These are tricky because of the code structure and what is known when. The analyzer_multi layer, which sets the scheduler params, doesn't know whether the input is a CSOD file: it doesn't open any trace files and there's no metadata.
I tried having the scheduler return a special error code and having analyzer.cpp detect it and re-initialize a brand-new scheduler, but this is messy. The analyzer needs to know things that are known only in analyzer_multi, and the scheduler error point is not clear: code refactoring is required to get the error early.
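For discussion, here is a minimal sketch of the retry idea described above. Every name in it (`init_scheduler`, `init_with_fallback`, the enums, `trace_input_t`) is hypothetical and not the actual scheduler or analyzer API; it only illustrates the control flow and assumes the scheduler can report the condition before any other state is committed.

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration only; not the real drmemtrace API.
enum class shard_mode_t { CORE_SHARDED, THREAD_SHARDED };
enum class init_status_t { OK, ERROR_ALREADY_CORE_SHARDED, ERROR_OTHER };

struct trace_input_t {
    std::string path;
    bool is_csod; // In practice only discoverable once the trace is opened.
};

// Stand-in for scheduler initialization: it refuses to re-schedule a CSOD
// trace and reports that with a dedicated status code.
init_status_t
init_scheduler(const trace_input_t &input, shard_mode_t mode)
{
    if (input.is_csod && mode == shard_mode_t::CORE_SHARDED)
        return init_status_t::ERROR_ALREADY_CORE_SHARDED;
    return init_status_t::OK;
}

// Stand-in for the analyzer layer: on the dedicated status it retries once
// in thread-sharded mode and records the mode that was actually used.
init_status_t
init_with_fallback(const trace_input_t &input, shard_mode_t preferred,
                   shard_mode_t &used)
{
    used = preferred;
    init_status_t status = init_scheduler(input, used);
    if (status == init_status_t::ERROR_ALREADY_CORE_SHARDED) {
        used = shard_mode_t::THREAD_SHARDED;
        status = init_scheduler(input, used);
        // Thread-sharded mode never re-schedules, so it cannot hit this again.
        assert(status != init_status_t::ERROR_ALREADY_CORE_SHARDED);
    }
    return status;
}
```

The sketch hides the hard part: in the real code the second initialization needs the parameters that only analyzer_multi knows, which is exactly what makes this approach messy.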
One solution might be adding a metadata file, which could help other tasks too.
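As a rough sketch of how a metadata file could be consumed, assume a plain-text file of `key=value` lines with a `sharding=core` entry for CSOD traces. The file format, the key name, and the functions `metadata_says_csod` and `pick_mode` are all assumptions made up for this example; no such file exists today.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical enum for illustration only; not the real drmemtrace API.
enum class shard_mode_t { CORE_SHARDED, THREAD_SHARDED };

// Returns true if the (assumed) metadata stream contains a "sharding=core"
// line, which this sketch takes to mean the trace is core-sharded-on-disk.
bool
metadata_says_csod(std::istream &metadata)
{
    std::string line;
    while (std::getline(metadata, line)) {
        if (line == "sharding=core")
            return true;
    }
    return false;
}

// Picks the scheduler mode before any trace file is opened: a CSOD trace is
// always run thread-sharded, otherwise the tool's preferred mode is kept.
shard_mode_t
pick_mode(std::istream &metadata, shard_mode_t preferred)
{
    if (metadata_says_csod(metadata))
        return shard_mode_t::THREAD_SHARDED;
    return preferred;
}
```

With something like this, the layer that sets the scheduler params could make the decision up front, without the scheduler having to report an error and be re-initialized.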