Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DLRM 99.9 performance and accuracy runs getting stuck on Xeon Icelake CPU #12

Open
lvaidya2910 opened this issue Nov 28, 2022 · 0 comments

Comments

@lvaidya2910
Copy link

1157-0  : Complete load query samples !!
1158-3  : Complete load query samples !!
1158-2  : Complete load query samples !!
1158-1  : Complete load query samples !!
1158-4  : Complete load query samples !!
1158-0  : Complete load query samples !!
1157-2  : Complete load query samples !!
1157-1  : Complete load query samples !!
1157-4  : Complete load query samples !!
1157-3  : Complete load query samples !!

The DLRM 99.9 runs are getting stuck after this output. It is stuck at this spot for more than 4-5 hours; I am not sure what is causing this issue. The runs won't produce anything in the mlperf_log_summary.txt file.

Also, I have attached the performance run mlperf_log_detail.txt file if it can help us figure out what's the issue.
mlperf_log_detail.txt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant