Skip to content

Conversation

kkraus14
Copy link
Collaborator

Description

Fixes #839

Adds functions for searching in conda environments for libraries and uses the functions before doing the general system search. This should resolve the problem for when you have mismatched versions between a conda environment and the underlying system CTK and pathfinder picks up the system library.

There's still a lot of TODOs with some open questions in this PR and I didn't add any testing yet, but figured I should open this to drive some conversation.

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Copy link
Contributor

copy-pr-bot bot commented Aug 28, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

Comment on lines +216 to +226
try:
handle = _load_lib(libname, soname)
except OSError:
pass
else:
# TODO KEITH: Do we need this abs_path_for_dynamic_library call?
# We're already resolving the absolute path based on the conda environment variables
abs_path = abs_path_for_dynamic_library(libname, handle)
if abs_path is None:
raise RuntimeError(f"No expected symbol for {libname=!r}")
return LoadedDL(abs_path, False, handle._handle)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be refactored into a helper function. I copy pasted it from load_with_system_search

except OSError:
pass
else:
# TODO KEITH: Do we need this abs_path_for_dynamic_library call?
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rwgk was this only needed in the case we were relying on dlopen / windows equivalent to find the library for us?

Comment on lines +163 to +168
handle = kernel32.LoadLibraryExW(dll_name, None, 0)
if handle:
# TODO KEITH: Do we need this abs_path_for_dynamic_library call?
# We're already resolving the absolute path based on the conda environment variables
abs_path = abs_path_for_dynamic_library(libname, handle)
return LoadedDL(abs_path, False, ctypes_handle_to_unsigned_int(handle))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 we should probably refactor this into a helper function

@@ -23,3 +23,8 @@ Highlights
- Improves stability in general and supports nvmath specifically
- Proactive change to improve library loading consistency
- Drops boilerplate docstrings for private functions

* Add conda search path prioritization for ``load_nvidia_dynamic_lib()`` (`PR #856 <https://github.com/NVIDIA/cuda-python/pull/856>`_)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Add conda search path prioritization for ``load_nvidia_dynamic_lib()`` (`PR #856 <https://github.com/NVIDIA/cuda-python/pull/856>`_)
* Add conda search path prioritization for ``load_nvidia_dynamic_lib()`` (`PR #919 <https://github.com/NVIDIA/cuda-python/pull/919>`_)

@kkraus14
Copy link
Collaborator Author

/ok to test

Copy link

@kkraus14 kkraus14 requested review from rwgk and leofang August 28, 2025 15:57
Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome, I was also thinking having an explicit step for conda will be best. I'm learning a lot from looking at your implementation.

Do you want me to take over, to make this more DRY, and then iterate on the details from there?

in_conda_env = False
if os.getenv("CONDA_BUILD") == "1":
in_conda_build = True
elif os.getenv("CONDA_PREFIX"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent a few minutes quizzing ChatGPT about the most robust way to figure out if we're in a conda environment. It produced this:


Typical code snippet looks like:

import os, sys

def in_conda_env() -> bool:
    return os.path.exists(os.path.join(sys.prefix, "conda-meta"))

That is generally considered the most robust test.


Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had thought about this as well, but ultimately decided on using the $CONDA_PREFIX approach for a few reasons:

  • Conda environment activation is guaranteed to set the environment variable and conda usage in general is dependent on it
  • Aligned implementation between conda-build and general conda environments
  • You can have a conda environment without python, where technically you could have a pure C++ environment and someone could want to use cuda-pathfinder as a utility in a C++ build pipeline where Python would potentially be coming from outside of the conda environment, in which case sys.prefix wouldn't properly point inside of the activated conda environment.

@leofang leofang added enhancement Any code-related improvements P0 High priority - Must do! cuda.pathfinder Everything related to the cuda.pathfinder module labels Aug 28, 2025
@leofang leofang added this to the cuda-pathfinder parking lot milestone Aug 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cuda.pathfinder Everything related to the cuda.pathfinder module enhancement Any code-related improvements P0 High priority - Must do!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG]: pathfinder load_with_system_search tries to load old libraries first
3 participants