Model phi performance fix #1985
Conversation
@kkoryun could you also add here the command lines you tested and the performance numbers you measured before and after this commit?
The performance drop was seen in the test test_text_generation_bf16_1x[microsoft/phi-2-1-False-False].
…into model_phi_performance
@kkoryun, internal checks are passing and the code changes are clear, but while testing I collected mixed results. I'm testing on 1.21.0-555 and adding your file on top of the v1.18.0 release. Here is what I ran:
PT_HPU_LAZY_MODE=1 python -m pytest tests/test_text_generation_example.py --device <device> -v -s --token=<token> --junitxml=/tmp/test_run_microsoft_phi.xml --log-cli-level 20 -k test_text_generation_bf16_1x[microsoft/phi-2-1-False-False]
I also tested the fp8 one:
PT_HPU_LAZY_MODE=1 python -m pytest tests/test_text_generation_example.py --device <device> -v -s --token=<token> --junitxml=/tmp/test_run_microsoft_phi.xml --log-cli-level 20 -k test_text_generation_fp8[microsoft/phi-2-1-1-True-128-128]
Also, commit a5ef754 didn't show any difference in my tests.
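For context, here is a minimal standalone sketch of how one might measure phi-2 generation throughput outside the pytest harness above. It is not the repository's test code: it assumes a plain transformers + torch install, and the prompt, token count, and tokens/s metric are illustrative only.

```python
# Illustrative timing loop (assumption: plain `transformers` + `torch`;
# NOT the optimum-habana test harness, just a rough local comparison).
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"  # model under test in this PR
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

prompt = "DeepSpeed is a machine learning framework"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt")
max_new_tokens = 128  # illustrative; loosely follows the 128 in the fp8 test id

with torch.no_grad():
    # Warm-up run so one-time setup costs do not skew the measurement.
    model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    start = time.perf_counter()
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    elapsed = time.perf_counter() - start

generated = out.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{generated} new tokens in {elapsed:.2f}s -> {generated / elapsed:.1f} tokens/s")
```

The pytest commands above exercise the repository's own benchmark with Gaudi-specific settings; a sketch like this only gives a quick before/after sanity check of raw generation speed.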
…into model_phi_performance
Closing it. Will be covered by #2165.
What does this PR do?
This PR fixes the performance drop observed for the Phi model (microsoft/phi-2).
Before submitting