Skip to content

Conversation

harakas
Copy link
Contributor

@harakas harakas commented Oct 2, 2025

Motivation

I'm building migraphx on top of TheRock and TheRock builds llvm with shared libraries. When I run migraphx I get this error:

$ /opt/rocm/bin/migraphx-driver compile model.onnx --gpu --enable-offload-copy  -o out.mxr
Running [ MIGraphX Version: 2.13.0.20250912-42-0-g524839ac9-dirty ]: /opt/rocm/bin/migraphx-driver compile model.onnx --gpu --enable-offload-copy -o model.mxr
[2025-10-01 17:31:57]
Compiling ... 
Reading: model.onnx
LLVM ERROR: Option 'default' already exists!
Aborted (core dumped)

Apparently the llvm that is pulled into migraphx from rocMLIR is conflicting with the llvm loaded by amd-comgr. Now MIOpen is also using its own rocMLIR but has no such issue. I found that it uses -Wl,--exclude-libs,ALL to get over this: https://github.com/ROCm/rocm-libraries/blob/41db72db8babe44bb95eeab0c10526082e41cb18/projects/miopen/src/CMakeLists.txt#L1028

So here is a patch for the same effect in migraphx_gpu.

The second commit in this PR adds a build option MIGRAPHX_STRIP_SYMBOLS to strip in release mode. It's OFF by default so should not break anything. Again copied from MIOpen project: https://github.com/ROCm/rocm-libraries/blob/41db72db8babe44bb95eeab0c10526082e41cb18/projects/miopen/CMakeLists.txt#L132-L136

This results in smaller distribution size.

Technical Details

Changelog Category

    • Added: New functionality.
    • Changed: Changes to existing functionality.
    • Removed: Functionality or support that has been removed. (Compared to a previous release)
    • Optimized: Component performance that has been optimized or improved.
    • Resolved Issues: Known issues from a previous version that have been resolved.
    • Not Applicable: This PR is not to be included in the changelog.

@harakas harakas requested a review from causten as a code owner October 2, 2025 10:25
@causten causten requested a review from pfultz2 October 2, 2025 13:31
target_link_libraries(migraphx_gpu PRIVATE rocMLIR::rockCompiler)
# Hide LLVM internals that come from rocMLIR.
if(NOT WIN32 AND NOT APPLE)
target_link_options(migraphx_gpu PRIVATE "LINKER:--exclude-libs,librockCompiler.a")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use the rocMLIR::rockCompiler cmake target so we get the correct usage requirements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a running joke around here: the greatest achievement of humanity in 21st century that comes from inventing AI will not be solving cancer, world peace or alike, but rather it will be of finally making sense of cmake.

I'm not good with cmake so this is the solution GPT offers:

# ============================================================================ #
#  link_excluding_static_archives.cmake
#
#  CMake 3.15+ helper for ELF toolchains to prevent static archive symbols
#  (dragged in by imported/aliased targets like rocMLIR::rockCompiler) from
#  being re-exported by your shared library.
#
#  Usage (replace your direct link line):
#    # Before:
#    # target_link_libraries(migraphx_gpu PRIVATE rocMLIR::rockCompiler)
#
#    # After (shorthand):
#    link_excluding_static_archives(migraphx_gpu rocMLIR::rockCompiler)
#
#  You can pass multiple roots:
#    link_excluding_static_archives(mylib A::root B::otherRoot)
#
#  What it does:
#    - Recursively walks INTERFACE_LINK_LIBRARIES in *target space* for each ROOT
#      (robust; catches transitive CMake targets).
#    - Adds -Wl,--exclude-libs,<archive> for every transitive STATIC library target
#      (using $<TARGET_LINKER_FILE_NAME:...> so names aren’t hard-coded).
#    - Handles *literal* non-target entries: absolute/relative ".a" paths and
#      plain "-lfoo" (converted to "libfoo.a").
#    - **Warns loudly** and **skips** non-target entries that are **generator
#      expressions** (e.g. $<LINK_ONLY:-lfoo>, $<IF:...,/path/libbar.a,>), because
#      CMake **3.15 cannot evaluate/transform** those into archive names.
#    - Only applies on ELF (NOT WIN32, NOT APPLE).
#
#  Strong recommendation:
#    If warnings about ignored generator expressions show up, consider switching
#    to explicit symbol export control (hidden-by-default visibility and/or an ELF
#    version script) so you don’t depend on --exclude-libs at all.
# ============================================================================ #

# -------------------------- Internal helper (BFS) ----------------------------- #
# Recursively collect all transitive *targets* reachable via INTERFACE_LINK_LIBRARIES.
function(_link_exc_collect_iface_targets OUT ROOT)
  if(NOT TARGET ${ROOT})
    set(${OUT} "" PARENT_SCOPE)
    return()
  endif()

  set(_queue ${ROOT})
  set(_seen "")
  set(_result "")

  while(_queue)
    list(POP_FRONT _queue cur)

    if(cur IN_LIST _seen)
      continue()
    endif()
    list(APPEND _seen ${cur})
    list(APPEND _result ${cur})

    # Resolve aliases to their real targets so TYPE & properties are readable.
    get_target_property(_aliased ${cur} ALIASED_TARGET)
    if(_aliased)
      set(cur ${_aliased})
    endif()

    get_target_property(_deps ${cur} INTERFACE_LINK_LIBRARIES)
    foreach(item IN LISTS _deps)
      if(TARGET ${item})
        get_target_property(_aliased2 ${item} ALIASED_TARGET)
        if(_aliased2)
          set(item ${_aliased2})
        endif()
        list(APPEND _queue ${item})
      endif()
      # Non-targets are handled (or warned) elsewhere.
    endforeach()
  endwhile()

  list(REMOVE_DUPLICATES _result)
  set(${OUT} "${_result}" PARENT_SCOPE)
endfunction()

# -------- Internal helper: literals & warnings (CMake 3.15 limitations) ------- #
# Collect NON-TARGET literal archives and -lfoo across the same dependency walk.
# Warn when encountering generator expressions because 3.15 cannot rewrite them.
function(_link_exc_collect_literal_archives OUT ROOT)
  if(NOT TARGET ${ROOT})
    set(${OUT} "" PARENT_SCOPE)
    return()
  endif()

  set(_queue ${ROOT})
  set(_seen "")
  set(_archives "")

  # De-dup warning messages so logs stay readable.
  set(_warned_genex "")
  set(_warned_unknown "")

  while(_queue)
    list(POP_FRONT _queue cur)
    if(cur IN_LIST _seen)
      continue()
    endif()
    list(APPEND _seen ${cur})

    get_target_property(_deps ${cur} INTERFACE_LINK_LIBRARIES)
    foreach(item IN LISTS _deps)
      if(TARGET ${item})
        list(APPEND _queue ${item})
      else()
        # ------------------------ GENERATOR EXPRESSIONS ------------------------
        # Examples: $<LINK_ONLY:-lfoo>, $<IF:COND,/path/libbar.a,>, etc.
        # CMake 3.15 CANNOT evaluate or transform these into archive names.
        # We must IGNORE them and WARN, or we risk passing nonsense to the linker.
        if(item MATCHES "^\\$<.*>$")
          if(NOT item IN_LIST _warned_genex)
            list(APPEND _warned_genex ${item})
            message(WARNING
              "exclude-libs: Ignoring generator expression in INTERFACE_LINK_LIBRARIES: '${item}'. "
              "CMake 3.15 cannot evaluate/transform non-target genex into archive names. "
              "Consider upstreaming proper targets or enforcing explicit symbol exports.")
          endif()
          continue()
        endif()

        # ------------------------ Literal path ending in .a --------------------
        if(item MATCHES "\\.a$")
          get_filename_component(_fname "${item}" NAME) # we only need the archive name
          if(_fname)
            list(APPEND _archives "${_fname}")
          endif()
          continue()
        endif()

        # ----------------------------- Plain -lfoo -----------------------------
        if(item MATCHES "^-l(.+)$")
          string(REGEX REPLACE "^-l(.+)$" "lib\\1.a" _fname "${item}")
          list(APPEND _archives "${_fname}")
          continue()
        endif()

        # ------------------------- Unknown non-target --------------------------
        if(NOT item IN_LIST _warned_unknown)
          list(APPEND _warned_unknown ${item})
          message(WARNING
            "exclude-libs: Unrecognized non-target link item '${item}' in INTERFACE_LINK_LIBRARIES. "
            "Cannot derive an archive name for --exclude-libs. Entry will be ignored.")
        endif()
      endif()
    endforeach()
  endwhile()

  list(REMOVE_DUPLICATES _archives)
  set(${OUT} "${_archives}" PARENT_SCOPE)
endfunction()

# ------------------------------ Public function ------------------------------- #
# link_excluding_static_archives(<consumer_target> <root_target> [...])
#
# Creates (or reuses) a per-root INTERFACE wrapper that:
#   - Links the original <root_target>,
#   - Adds --exclude-libs for every transitive STATIC library target,
#   - Adds --exclude-libs for literal .a paths and -lfoo entries,
# and then links that wrapper to <consumer_target>.
#
# This keeps all the special sauce in ONE place instead of sprinkling flags
# across consumers. Safe to call multiple times; wrappers are de-duplicated.
function(link_excluding_static_archives CONSUMER)
  if(NOT TARGET ${CONSUMER})
    message(FATAL_ERROR
      "link_excluding_static_archives: CONSUMER target '${CONSUMER}' does not exist.")
  endif()

  # Remaining args are root link targets to wrap (e.g. rocMLIR::rockCompiler)
  set(_roots ${ARGN})
  if(NOT _roots)
    message(FATAL_ERROR
      "link_excluding_static_archives: no root targets provided (e.g. rocMLIR::rockCompiler).")
  endif()

  foreach(ROOT IN LISTS _roots)
    if(NOT TARGET ${ROOT})
      message(FATAL_ERROR
        "link_excluding_static_archives: ROOT target '${ROOT}' does not exist.")
    endif()

    # Sanitize a legal CMake target name for the wrapper (cannot contain '::')
    set(_iface "${ROOT}")
    string(REGEX REPLACE "[^A-Za-z0-9_]+" "_" _iface "${_iface}")
    set(_iface "${_iface}__exclude_archives_iface")

    # Build the wrapper once per root; reuse if it already exists.
    if(NOT TARGET ${_iface})
      # Gather dependency info
      _link_exc_collect_iface_targets(    _ALL_TARGETS ${ROOT})
      _link_exc_collect_literal_archives( _LITERAL_AR  ${ROOT})

      add_library(${_iface} INTERFACE)
      target_link_libraries(${_iface} INTERFACE ${ROOT})

      # Apply only on ELF platforms (not Windows/macOS)
      if(NOT WIN32 AND NOT APPLE)
        # Transitive STATIC targets → emit --exclude-libs,$<TARGET_LINKER_FILE_NAME:...>
        foreach(t IN LISTS _ALL_TARGETS)
          target_link_options(${_iface} INTERFACE
            "$<$<STREQUAL:$<TARGET_PROPERTY:${t},TYPE>,STATIC_LIBRARY>:LINKER:--exclude-libs,$<TARGET_LINKER_FILE_NAME:${t}>>"
          )
        endforeach()

        # Literal .a paths and -lfoo → add derived archive names directly
        foreach(arch IN LISTS _LITERAL_AR)
          target_link_options(${_iface} INTERFACE "LINKER:--exclude-libs,${arch}")
        endforeach()
      endif()
    endif()

    # Link consumer against the wrapper (which brings in ROOT and the options)
    target_link_libraries(${CONSUMER} PRIVATE ${_iface})
  endforeach()
endfunction()

and apply as

# Include the helper (adjust path)
include(cmake/link_excluding_static_archives.cmake)

# Use it instead of directly linking rocMLIR::rockCompiler
link_excluding_static_archives(migraphx_gpu rocMLIR::rockCompiler)

Is this what you had in mind?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly. That doesnt propagate the usage requirements, and it also doesnt handle the $<LINK_ONLY:...> dependencies(which is common to be uses with static libraries).

Discovering transitive archive names automatically is messy. Why cant we use LINKER:--exclude-libs,ALL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ALL does work for my application. Lets hope it doesn't break anything. I guess the tests would show that. I changed it to ALL.

# TODO: Fix rocMLIR's library to hide LLVM internals.
target_link_libraries(migraphx_gpu PRIVATE rocMLIR::rockCompiler)
# Hide LLVM internals that come from rocMLIR.
if(NOT WIN32 AND NOT APPLE)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You also need AND BUILD_SHARED_LIBS. I dont think this flag works with static library builds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the condition.

Copy link

codecov bot commented Oct 7, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #4345   +/-   ##
========================================
  Coverage    92.24%   92.25%           
========================================
  Files          560      560           
  Lines        26302    26310    +8     
========================================
+ Hits         24262    24270    +8     
  Misses        2040     2040           

see 2 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@causten causten merged commit 0647f86 into ROCm:develop Oct 8, 2025
60 of 65 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants