Better support for understanding performance #145

Open
bernhold opened this issue Oct 8, 2021 · 1 comment
Comments

@bernhold

bernhold commented Oct 8, 2021

Initiated by Mark Cianciosa on the 2021-10-08 ORNL-AToM call.

  • Let's make it easy to get trace information out of IPS so users can understand component runtimes, look for bottlenecks, etc.
  • Let's provide simple job-end summaries, such as min/avg/max/sd of run time for each component, etc.
  • Let's make it possible to understand resource utilization throughout a job
  • Let's provide simple job-end summaries of resource utilization
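To make the job-end summary idea concrete, here is a minimal sketch of a min/avg/max/sd report per component. It assumes timings are already available as `(component_name, seconds)` pairs; that record format and the function name `summarize_runtimes` are hypothetical and not part of IPS.

```python
import statistics
from collections import defaultdict


def summarize_runtimes(records):
    """Summarize run times per component.

    `records` is an iterable of (component_name, seconds) pairs.
    This input format is an assumption made for illustration only.
    """
    by_component = defaultdict(list)
    for name, seconds in records:
        by_component[name].append(seconds)

    summary = {}
    for name, times in by_component.items():
        summary[name] = {
            "n": len(times),
            "min": min(times),
            "avg": statistics.mean(times),
            "max": max(times),
            # Sample standard deviation needs at least two data points.
            "sd": statistics.stdev(times) if len(times) > 1 else 0.0,
        }
    return summary


if __name__ == "__main__":
    # Made-up timings, purely to show the shape of the report.
    demo = [("driver", 1.0), ("driver", 3.0), ("solver", 2.0)]
    for name, stats in sorted(summarize_runtimes(demo).items()):
        print(name, stats)
```

The same grouping approach could be applied to resource-utilization samples, provided they are recorded per component in a similar way.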
@markcoletti

Related to this, more logging needs to be added to tell users what's going on. While struggling to get the tutorials to run, I've noticed that things ... just fail. Sure, there's a stack trace, but it would have been helpful if --verbose and even --debug toggled additional logging to show me where in the initialization process things are failing. As it stands, I'm going to have to step through ips.py with a debugger to understand where things are going off the rails.
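As a rough illustration of what flag-controlled logging could look like, here is a sketch using only the Python standard library. The parser below is a stand-in and not how ips.py actually handles its options; the function name `configure_logging` and the logger name `ips.init` are hypothetical.

```python
import argparse
import logging


def configure_logging(argv=None):
    """Map --verbose / --debug onto logging levels.

    Stand-in parser for illustration; the real option handling
    in ips.py may be organized differently.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("--debug", action="store_true")
    args = parser.parse_args(argv)

    if args.debug:
        level = logging.DEBUG
    elif args.verbose:
        level = logging.INFO
    else:
        level = logging.WARNING

    # force=True replaces any handlers configured earlier (Python 3.8+).
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(name)s %(levelname)s: %(message)s",
        force=True,
    )
    return level


if __name__ == "__main__":
    configure_logging()
    log = logging.getLogger("ips.init")  # hypothetical logger name
    log.info("reading configuration")    # shown with --verbose or --debug
    log.debug("component list parsed")   # shown only with --debug
```

With messages like these placed at each initialization step, a failed run would show the last step reached without needing a debugger.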
