Streamline find_matches function #4316

pfultz2 · 2025-09-24T16:03:00Z

Motivation

Refactor find_matches to help improve performance by moving many things outside of the loop.

Technical Details

The find_matches function would do many things inside of the loop even when tracing is not enabled:

Checking if the trace filter matches
Run time on the apply function
Call to matcher() to construct the matcher, which is usually a no-op but in some cases we are constructing maps, etc, so we can benefit from constructing it once

This also separates the checking with and without tracing enabled, so the tracing wont interfere with optimizations the compiler might do.

Changelog Category

- Added: New functionality.
- Changed: Changes to existing functionality.
- Removed: Functionality or support that has been removed. (Compared to a previous release)
- Optimized: Component performance that has been optimized or improved.
- Resolved Issues: Known issues from a previous version that have been resolved.
- Not Applicable: This PR is not to be included in the changelog.

Copilot

Pull Request Overview

This PR refactors the find_matches function to improve performance by moving computationally expensive operations outside of the main matching loop. The key optimization separates tracing-enabled and tracing-disabled execution paths to prevent tracing overhead from interfering with compiler optimizations.

Extracted match runner creation logic to avoid repeated construction of matchers and environment variable checks
Separated tracing and non-tracing execution paths to enable better compiler optimization
Moved environment variable evaluations and trace filter checks outside the loop

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

src/include/migraphx/matcher.hpp

codecov · 2025-09-24T21:24:22Z

Codecov Report

❌ Patch coverage is 34.92063% with 41 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/include/migraphx/matcher.hpp	34.92%	41 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #4316      +/-   ##
===========================================
- Coverage    92.23%   92.14%   -0.09%     
===========================================
  Files          557      557              
  Lines        25924    25945      +21     
===========================================
- Hits         23909    23905       -4     
- Misses        2015     2040      +25

Files with missing lines	Coverage Δ
src/include/migraphx/matcher.hpp	`87.20% <34.92%> (-7.08%)`	⬇️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Co-authored-by: Copilot <[email protected]>

migraphx-bot · 2025-09-25T00:47:18Z

Test	Batch	Rate new 6000a7	Rate old 38fdc6	Diff	Compare
torchvision-resnet50	64	3,156.85	3,173.83	-0.54%	✅
torchvision-resnet50_fp16	64	6,594.09	6,613.50	-0.29%	✅
torchvision-densenet121	32	2,437.24	2,445.12	-0.32%	✅
torchvision-densenet121_fp16	32	4,115.10	4,132.06	-0.41%	✅
torchvision-inceptionv3	32	1,665.76	1,673.47	-0.46%	✅
torchvision-inceptionv3_fp16	32	2,584.21	2,596.39	-0.47%	✅
cadene-inceptionv4	16	795.80	797.27	-0.19%	✅
cadene-resnext64x4	16	802.54	806.10	-0.44%	✅
slim-mobilenet	64	8,212.72	8,232.04	-0.23%	✅
slim-nasnetalarge	64	221.77	222.86	-0.49%	✅
slim-resnet50v2	64	3,296.95	3,305.41	-0.26%	✅
bert-mrpc-onnx	8	1,135.68	1,144.06	-0.73%	✅
bert-mrpc-tf	1	487.56	486.42	0.24%	✅
pytorch-examples-wlang-gru	1	318.36	309.90	2.73%	✅
pytorch-examples-wlang-lstm	1	432.72	387.21	11.75%	🔆
torchvision-resnet50_1	1	808.46	745.35	8.47%	🔆
cadene-dpn92_1	1	427.85	428.94	-0.25%	✅
cadene-resnext101_1	1	368.14	369.63	-0.40%	✅
onnx-taau-downsample	1	397.97	399.22	-0.31%	✅
dlrm-criteoterabyte	1	31.92	32.03	-0.35%	✅
dlrm-criteoterabyte_fp16	1	51.02	51.11	-0.18%	✅
agentmodel	1	9,854.45	9,659.41	2.02%	✅
unet_fp16	2	59.05	59.19	-0.24%	✅
resnet50v1_fp16	1	991.47	991.39	0.01%	✅
resnet50v1_int8	1	990.03	971.32	1.93%	✅
bert_base_cased_fp16	64	1,098.63	1,104.24	-0.51%	✅
bert_large_uncased_fp16	32	343.76	345.64	-0.54%	✅
bert_large_fp16	1	197.69	198.02	-0.17%	✅
distilgpt2_fp16	16	2,076.51	2,085.17	-0.42%	✅
yolov5s	1	588.64	588.82	-0.03%	✅
tinyllama	1	43.82	43.95	-0.31%	✅
vicuna-fastchat	1	45.12	45.27	-0.34%	✅
whisper-tiny-encoder	1	410.05	410.98	-0.23%	✅
whisper-tiny-decoder	1	414.55	415.37	-0.20%	✅
llama2_7b	1	19.11	19.15	-0.21%	✅
qwen1.5-7b	1	23.45	23.53	-0.35%	✅
phi3-3.8b	1	26.63	26.70	-0.26%	✅
mask-rcnn	1	12.17	12.24	-0.57%	✅
llama3-8b	1	21.64	21.74	-0.48%	✅
whisper-large-encoder	1	10.17	10.22	-0.47%	✅
whisper-large-decoder	1	99.49	99.83	-0.35%	✅
mistral-7b	1	23.66	23.74	-0.32%	✅
FLUX.1-schnell	1	714.67	721.05	-0.88%	✅
nan	nan	nan	nan	nan%	❌

This build is not recommended to merge 🔴

migraphx-bot · 2025-09-25T00:47:21Z

✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

❌bert-mrpc-tf: ERROR - check error output

2025-09-24 19:07:17.811682: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 359, in
main()
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 306, in main
graph = load_tf_graph(model_name)
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 300, in load_tf_graph
graph_def.ParseFromString(f.read())
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/lib/io/file_io.py", line 116, in read
self._preread_check()
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/lib/io/file_io.py", line 77, in _preread_check
self._read_buf = _pywrap_file_io.BufferedInputStream(
tensorflow.python.framework.errors_impl.UnimplementedError: File system scheme '[local]' not implemented (file: '/new-saved-models/tf-misc/bert_mrpc1.pb')

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

✅ unet: PASSED: MIGraphX meets tolerance

✅ resnet50v1: PASSED: MIGraphX meets tolerance

✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

🔴bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

✅ bert_large: PASSED: MIGraphX meets tolerance

✅ yolov5s: PASSED: MIGraphX meets tolerance

✅ tinyllama: PASSED: MIGraphX meets tolerance

✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

✅ llama2_7b: PASSED: MIGraphX meets tolerance

✅ qwen1.5-7b: PASSED: MIGraphX meets tolerance

✅ phi3-3.8b: PASSED: MIGraphX meets tolerance

🔴mask-rcnn: FAILED: MIGraphX is not within tolerance - check verbose output

✅ llama3-8b: PASSED: MIGraphX meets tolerance

✅ whisper-large-decoder: PASSED: MIGraphX meets tolerance

✅ mistral-7b: PASSED: MIGraphX meets tolerance

✅ FLUX.1-schnell: PASSED: MIGraphX meets tolerance

pfultz2 added 2 commits September 24, 2025 00:37

Refactor find_matches

c78363f

Format

40eea7b

pfultz2 requested a review from causten as a code owner September 24, 2025 16:03

causten changed the title ~~Streamline find_matches funtion~~ Streamline find_matches function Sep 24, 2025

causten requested a review from Copilot September 24, 2025 16:24

Copilot AI reviewed Sep 24, 2025

View reviewed changes

src/include/migraphx/matcher.hpp Outdated Show resolved Hide resolved

Update src/include/migraphx/matcher.hpp

6000a79

Co-authored-by: Copilot <[email protected]>

pfultz2 mentioned this pull request Oct 4, 2025

add flag for full dynamic mode #4347

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Streamline find_matches function #4316

Streamline find_matches function #4316

pfultz2 commented Sep 24, 2025 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

codecov bot commented Sep 24, 2025 •

edited

Loading

Uh oh!

migraphx-bot commented Sep 25, 2025

Uh oh!

migraphx-bot commented Sep 25, 2025

Uh oh!

Uh oh!

Streamline find_matches function #4316

Are you sure you want to change the base?

Streamline find_matches function #4316

Conversation

pfultz2 commented Sep 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Technical Details

Changelog Category

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Uh oh!

Uh oh!

codecov bot commented Sep 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

migraphx-bot commented Sep 25, 2025

Uh oh!

migraphx-bot commented Sep 25, 2025

Uh oh!

Uh oh!

pfultz2 commented Sep 24, 2025 •

edited

Loading

codecov bot commented Sep 24, 2025 •

edited

Loading