Conversation

@JH-Leon-KIM-AMD commented Aug 4, 2025

@amd-bartgips, @vpietila-amd, @amd-ahyttine
3D TunaNet code PR

  • Add Gfx942Model_3D class for 3D convolution predictions
  • Implement metadata_3d.cpp for 3D feature extraction
  • Move common AI heuristics code to ai_heuristics.hpp for better organization
  • Refactor the inefficient memory-cache logic into reusable functions
  • Add trained 3D TunaNet models for gfx942
  • Update CMakeLists.txt for 3D components
  • Conflict test done against Bart's latest commit

Test script

export MIOPEN_LOG_LEVEL=4
export MIOPEN_DEBUG_ENABLE_AI_IMMED_MODE_FALLBACK=1
export MIOPEN_ENABLE_AI_IMMED_MODE_FALLBACK=1
export MIOPEN_DISABLE_PERFDB=1
export MIOPEN_DISABLE_USERDB=1
export MIOPEN_DISABLE_FIND_DB=1
export MIOPEN_DISABLE_CACHE=1
export MIOPEN_DEBUG_CONV_IMMED_FALLBACK=1
export MIOPEN_DEBUG_FORCE_IMMED_MODE_FALLBACK=1
# 3D convolution
./bin/MIOpenDriver conv -F 1 -n 1 -c 4 -k 8 -H 8 -W 8 -! 8 -y 3 -x 3 -@ 3 -_ 3 -t fp32 -V 0


namespace miopen {
namespace ai {


I like this refactor to miopen::ai::common, makes sense to me.


@amd-bartgips amd-bartgips Aug 5, 2025


A question from a C++ beginner: Why would we place these function definitions in the .hpp file instead of the .cpp file?


It seems that MIOpen wants to keep the option of distributing the library as a header-only library, where the user compiles the library together with their own code. When the implementation is in header files, this is possible. If you placed the implementation in .cpp files, you would need to distribute compiled libraries. That creates problems with compiler/runtime versions, since users of MIOpen would need versions compatible with the compiled binaries. A header-only library is free of such dependencies.


Actually, since you already have the .cpp file, I would move all implementations to the .cpp file. The hybrid is just confusing.


So only declarations (signature) in the header file and the implementation in the .cpp file.


Related to this: I seem to recall John Shumway saying something about being header-only in our meeting yesterday... or was that only about CK?

Author


I created metadata_3d.cpp, which needed the ReverseMap and LoadJSON functions that were originally in the .hpp file. To make these functions accessible to both ai_heuristics.cpp and metadata_3d.cpp, I moved them to a header file (.hpp).

/**
* @brief 3D-specific metadata handler for TunaNet3D models
*
* This class provides a simple interface for accessing 3D convolution metadata.


Did you need to alter the metadata JSON structure at all? Or did you simply use the current file format that our Python code spits out?

Author


I used the JSON structure generated from the Python code as is.

}

bool IsProblemSupported(const conv::ProblemDescription& problem,
const ExecutionContext& ctx) const override


I receive a warning when building, since ctx is not used, perhaps do:

Suggested change
const ExecutionContext& ctx) const override
[[maybe_unused]] const ExecutionContext& ctx) const override


It is easier to simply write

bool IsProblemSupported(const conv::ProblemDescription& problem,
                            const ExecutionContext&) const override

if you are not using the ctx. This should remove the compiler warning.

Author


Thank you Bart and Ville, I corrected.


As Ville rightfully pointed out, perhaps we should make these git-lfs files. Can you check whether this is the case for the other TunaNet files?

Author


Currently in Git LFS:

  • Only *.kdb.bz2 files

NOT in Git LFS:

  • All *.tn.model files (TunaNet models) - ranging from 2 KB to 1.1 MB
  • All *.ktn.model files (Kernel Tuning models) - ranging from 827 bytes to 1.1 MB

If we need to track the model files with Git LFS, then:

  git lfs track "*.tn.model"
  git lfs track "*.ktn.model"
  git add .gitattributes
  git add src/kernels/*.tn.model src/kernels/*.ktn.model
  git commit -m "Move TunaNet models to Git LFS"

@amd-bartgips
Copy link

When running the commands you suggested above, it is better to set the log level higher:

export MIOPEN_LOG_LEVEL=6
export MIOPEN_DEBUG_ENABLE_AI_IMMED_MODE_FALLBACK=1
export MIOPEN_ENABLE_AI_IMMED_MODE_FALLBACK=1
export MIOPEN_DISABLE_PERFDB=1
export MIOPEN_DISABLE_USERDB=1
export MIOPEN_DISABLE_FIND_DB=1
export MIOPEN_DISABLE_CACHE=1
export MIOPEN_DEBUG_CONV_IMMED_FALLBACK=1
export MIOPEN_DEBUG_FORCE_IMMED_MODE_FALLBACK=1
# 3D convolution
./bin/MIOpenDriver conv -F 1 -n 1 -c 4 -k 8 -H 8 -W 8 -! 8 -y 3 -x 3 -@ 3 -_ 3 -t fp32 -V 0

So now it's easier to see that your new stuff is properly being called :)

MIOpen(HIP): Info2 [FindRecordUnsafe] Looking for key 4-8-8-8-3x3x3-8-6-6-6-1-0x0x0-1x1x1-1x1x1-0-NCDHW-FP32-F in cache for file "/root/.config/miopen/gfx942130.HIP.3_5_0_b09b18fb8-dirty.ufdb.txt"
MIOpen(HIP): Info2 [FindRecord] Looking for key 4-8-8-8-3x3x3-8-6-6-6-1-0x0x0-1x1x1-1x1x1-0-NCDHW-FP32-F in file "/home/bartgips/code/MIOpen_dev/build/share/miopen/db/gfx942130.HIP.fdb.txt"
MIOpen(HIP): Info2 [Measure] Db::FindRecord time: 0.105818 ms
MIOpen(HIP): Info2 [FindRecordUnsafe] Looking for key 4-8-8-8-3x3x3-8-6-6-6-1-0x0x0-1x1x1-1x1x1-0-NCDHW-FP32-F in cache for file ":memory:3d_gfx942"
MIOpen(HIP): Info2 [Get3DModel] Get3DModel called for device: gfx942
MIOpen(HIP): Info2 [Metadata3D] Metadata3D loaded for arch: gfx942_3d, num_inputs=23, num_solvers=9
MIOpen(HIP): Info2 [Gfx942Model_3D] Gfx942Model_3D initialized
MIOpen(HIP): Info2 [Get3DModel] Successfully created 3D model for device: gfx942
MIOpen(HIP): Info2 [IsProblemSupported] 3D problem supported by Gfx942Model_3D
MIOpen(HIP): Info2 [PredictSolver] Evaluating 3D TunaNet
MIOpen(HIP): Info2 [ToFeatures] Gfx942Model_3D: Extracted 23 features
MIOpen(HIP): Info2 [Forward] Gfx942Model_3D: Extracted 23 features
...
MIOpen(HIP): Info2 [Forward] Gfx942Model_3D: TunaNet3D returned 9 predictions
MIOpen(HIP): Info2 [PredictSolver] 3D TunaNet Result: ConvHipImplicitGemm3DGroupFwdXdlops ID:138, GemmFwdRest ID:91, ConvHipImplicitGemm3DGroupBwdXdlops ID:141, ConvHipImplicitGemm3DGroupWrwXdlops ID:140, GemmFwd1x1_0_1 ID:88, GemmWrwUniversal ID:102, GemmBwdRest ID:97, GemmBwd1x1_stride1 ID:96, GemmWrw1x1_stride1 ID:101, 
MIOpen(HIP): Info2 [GetSolutionsFallback] Using TunaNet Fallback
MIOpen(HIP): Info2 [GetSolutionsFallback] maxSolutionCount = 1, available = 2
MIOpen(HIP): Info2 [GetSolutionsFallback] id: 138, algo: 5, time: 10, ws: 18688, name: ConvHipImplicitGemm3DGroupFwdXdlops
MIOpen(HIP): Info2 [GetSolutionsFallback] id: 91, algo: 0, time: 20, ws: 93312, name: GemmFwdRest

Any idea why it gives us two solutions here at the end? (ConvHipImplicitGemm3DGroupFwdXdlops, GemmFwdRest). Shouldn't the tunanet only give us one of them? Or is one of the two a hard-coded fallback?

Besides this, as I said yesterday: it would be nice to have explicit test functions for your specific functions :)


@vpietila-amd left a comment


Looks good to me.

*
* MIT License
*
* Copyright (c) 2023 Advanced Micro Devices, Inc.


Add the correct year.

} catch (...) {
success = false;
return {};
}


Perhaps a better pattern is to return std::optional<std::vector<std::string>>, where std::nullopt indicates failure. This would apply to all LoadXXX methods. Then you don't need an additional success variable. Generally, the functional style of programming, where functions do not have side effects (such as modifying the value of an input variable), is safer.

Author


I didn't know about std::optional. Thank you. I've fixed the code.

Author


I will keep in mind: "only declarations (signatures) in the header file and the implementation in the .cpp file."
Since this was existing code, I'll probably need to refactor it after merging with Bart; if I modify the existing code now, Bart's code will likely stop working.
Thank you!!



  - Add 13 unit tests covering 3D AI heuristics functionality
  - Test metadata loading, model creation, and solver predictions
  - Add MIOpenDriver equivalent test (BartTest) for real-world validation
  - Enhance logging to show solver predictions with scores
  - Verify correct solver selection for 3D convolution problems


I think perhaps this could be slightly renamed to be more accurate? I.e. this is only to test the "tunanet" part, right? So maybe name it something like that? Or something with "solver_selection".
This is just to make sure that this does not test the heuristics related to kernel tuning/parameter selection.


Otherwise, this all looks good! All tests pass on my fresh build 🥳

@amd-bartgips amd-bartgips merged commit ebed5cc into miopenff/3d-heuristic Aug 12, 2025
1 check was pending
@amd-bartgips amd-bartgips deleted the jeonghyun/3d-heuristic-clean branch August 12, 2025 07:34
@amd-bartgips

Merging this, next step: #3923
