Streamline find_matches function #4316
base: develop
Pull Request Overview
This PR refactors the find_matches function to improve performance by moving computationally expensive operations outside of the main matching loop. The key optimization separates tracing-enabled and tracing-disabled execution paths to prevent tracing overhead from interfering with compiler optimizations.
- Extracted match runner creation logic to avoid repeated construction of matchers and environment variable checks
- Separated tracing and non-tracing execution paths to enable better compiler optimization
- Moved environment variable evaluations and trace filter checks outside the loop
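The hoisting described in the bullets above can be illustrated with a minimal sketch. This is not the project's code: `toy_matcher`, `env_enabled`, the `TOY_TRACE_MATCHES` variable, and both counting functions are hypothetical names invented for illustration. The sketch assumes that the matcher and the environment lookup are loop-invariant, so they can be computed once before the loop.

```cpp
#include <cstdlib>
#include <iostream>
#include <vector>

// Hypothetical stand-in for a matcher that may be costly to construct.
struct toy_matcher
{
    static int constructions; // counts how many matchers were built
    int target;
    explicit toy_matcher(int t) : target(t) { ++constructions; }
    bool match(int x) const { return x == target; }
};
int toy_matcher::constructions = 0;

// Hypothetical helper: true when the named environment variable is set.
bool env_enabled(const char* name) { return std::getenv(name) != nullptr; }

// Before: the matcher and the environment lookup are redone per element.
int count_matches_naive(const std::vector<int>& xs, int target)
{
    int n = 0;
    for(int x : xs)
    {
        toy_matcher m{target};
        const bool trace = env_enabled("TOY_TRACE_MATCHES");
        const bool hit   = m.match(x);
        if(trace)
            std::cerr << x << (hit ? ": match\n" : ": no match\n");
        if(hit)
            ++n;
    }
    return n;
}

// After: both loop-invariant values are computed once, before the loop.
int count_matches_hoisted(const std::vector<int>& xs, int target)
{
    const toy_matcher m{target};
    const bool trace = env_enabled("TOY_TRACE_MATCHES");
    int n = 0;
    for(int x : xs)
    {
        const bool hit = m.match(x);
        if(trace)
            std::cerr << x << (hit ? ": match\n" : ": no match\n");
        if(hit)
            ++n;
    }
    return n;
}
```

Both functions return the same count; the hoisted version builds one matcher and performs one environment lookup regardless of input size.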
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4316      +/-   ##
===========================================
- Coverage    92.23%   92.14%   -0.09%
===========================================
  Files          557      557
  Lines        25924    25945      +21
===========================================
- Hits         23909    23905       -4
- Misses        2015     2040      +25
Co-authored-by: Copilot <[email protected]>
This build is not recommended to merge 🔴
❌ bert-mrpc-tf: ERROR - check error output
2025-09-24 19:07:17.811682: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 359, in <module>
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 306, in main
    graph = load_tf_graph(model_name)
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 300, in load_tf_graph
    graph_def.ParseFromString(f.read())
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/lib/io/file_io.py", line 116, in read
    self._preread_check()
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/lib/io/file_io.py", line 77, in _preread_check
    self._read_buf = _pywrap_file_io.BufferedInputStream(
tensorflow.python.framework.errors_impl.UnimplementedError: File system scheme '[local]' not implemented (file: '/new-saved-models/tf-misc/bert_mrpc1.pb')
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
🔴 mask-rcnn: FAILED: MIGraphX is not within tolerance - check verbose output
Motivation
Refactor find_matches to help improve performance by moving many things outside of the loop.
Technical Details
The find_matches function would do many things inside of the loop even when tracing is not enabled:
- Calling matcher() to construct the matcher, which is usually a no-op, but in some cases we are constructing maps, etc., so we can benefit from constructing it once
This also separates the checking with and without tracing enabled, so the tracing won't interfere with optimizations the compiler might do.
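One way to keep tracing code out of the non-tracing loop is to compile the loop twice and pick a version once, up front. The sketch below shows that idea under stated assumptions and is not the implementation in this PR: `toy_instruction`, `toy_predicate`, `run_matches`, and `find_matches_sketch` are hypothetical names, and the template-on-bool approach (which needs C++17 for `if constexpr`) is an assumption about how such a split could be written.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical instruction and predicate types; not the library's API.
using toy_instruction = int;
using toy_predicate   = std::function<bool(toy_instruction)>;

// One loop body, instantiated twice. In the Trace=false instantiation the
// logging statement is discarded at compile time, leaving a plain counting loop.
template <bool Trace>
int run_matches(const std::vector<toy_instruction>& ins,
                const toy_predicate& pred,
                std::vector<std::string>* log)
{
    int n = 0;
    for(toy_instruction i : ins)
    {
        const bool hit = pred(i);
        if constexpr(Trace)
            log->push_back(std::to_string(i) + (hit ? ": match" : ": no match"));
        if(hit)
            ++n;
    }
    return n;
}

// Decide once, outside the loop, which instantiation to run.
int find_matches_sketch(const std::vector<toy_instruction>& ins,
                        const toy_predicate& pred,
                        bool trace_enabled,
                        std::vector<std::string>* log)
{
    if(trace_enabled && log != nullptr)
        return run_matches<true>(ins, pred, log);
    return run_matches<false>(ins, pred, nullptr);
}
```

The design choice here is that the tracing decision becomes a single branch before the loop rather than a check on every iteration, so the hot path contains no tracing-related code for the optimizer to work around.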
Changelog Category