.to_numpy(copy=False) Runtime Error if Device Memory #201

Open
ax3l opened this issue Oct 6, 2023 · 2 comments
Labels
backend: cuda Specific to CUDA execution (GPUs) enhancement New feature or request

Comments


ax3l commented Oct 6, 2023

We allow users to call .to_numpy(copy=False) on arbitrary memory.

This is fine even with pure GPU memory, to either:

  • transport pointers around w/o access from the host or
  • use as managed memory from the host (read/write)

If the pointer is in GPU memory and not managed, we should instead raise a runtime exception with a hint to use .to_numpy(copy=True) or .to_cupy(copy=False), or to activate managed memory.
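A minimal sketch of that guard, using a mock array class; `is_host_accessible` and `data` are hypothetical attributes for illustration, not the actual pyAMReX API:

```python
class MockArray:
    """Stand-in for a pyAMReX array. `is_host_accessible` and `data`
    are hypothetical attributes used for illustration only."""

    def __init__(self, data, is_host_accessible):
        self.data = data
        self.is_host_accessible = is_host_accessible


def to_numpy(array, copy=False):
    """Sketch of the proposed guard: a zero-copy view of pure device
    memory raises instead of handing out a host-unreadable pointer."""
    if not copy and not array.is_host_accessible:
        raise RuntimeError(
            "Array is in device memory and not host-accessible. Use "
            ".to_numpy(copy=True), .to_cupy(copy=False), or activate "
            "managed memory."
        )
    # copy=True would imply a device-to-host transfer in the real binding
    return list(array.data) if copy else array.data
```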

We can use AMReX_GpuUtility.H for its isManaged, isDevicePtr, and isPinnedPtr helpers. It wraps cudaPointerGetAttributes and, once supported, similar functions for HIP and SYCL.
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__UNIFIED.html#group__CUDART__UNIFIED_1gd89830e17d399c064a2f3c3fa8bb4390

@ax3l ax3l added enhancement New feature or request backend: cuda Specific to CUDA execution (GPUs) labels Oct 6, 2023

ax3l commented Oct 6, 2023

These functions can be quite expensive, so we should either use them sparingly or, alternatively, check the arenas if we know them.

The following situations in AMReX can create this:

  • in the default Arena with managed memory allowed
  • in an explicit managed Arena
  • in an explicit device Arena with managed memory allowed

Then, Arena has:

    // isDeviceAccessible and isHostAccessible can both be true.
    [[nodiscard]] virtual bool isDeviceAccessible () const;
    [[nodiscard]] virtual bool isHostAccessible () const;

    // Note that isManaged, isDevice and isPinned are mutually exclusive.
    // For memory allocated by cudaMalloc* etc., one of them returns true.
    // Otherwise, neither is true.
    [[nodiscard]] virtual bool isManaged () const;
    [[nodiscard]] virtual bool isDevice () const;
    [[nodiscard]] virtual bool isPinned () const;

where isHostAccessible() is what we need.
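As a sketch of how those predicates relate, here is a Python mock of the query interface; `MockArena` is an illustration under stated assumptions, not the AMReX class:

```python
from dataclasses import dataclass


@dataclass
class MockArena:
    """Mock of the Arena query interface quoted above. At most one of
    managed/device/pinned is true; all false means plain pageable host
    memory (e.g., from malloc)."""

    managed: bool = False
    device: bool = False
    pinned: bool = False

    def is_host_accessible(self) -> bool:
        # managed, pinned, and plain host memory can be read/written
        # from the host; pure device memory cannot
        return not self.device

    def is_device_accessible(self) -> bool:
        # assumption: managed, device, and pinned (mapped) memory are
        # all reachable from the GPU; pageable host memory is not
        return self.managed or self.device or self.pinned
```

With such a flag read, the `.to_numpy(copy=False)` path would be allowed exactly when `is_host_accessible()` returns true, with no per-pointer CUDA query.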


ax3l commented Oct 7, 2023

For MultiFab.array(mfi).to_numpy() we could go to the MultiFab level and add a:

MultiFab.to_numpy(mfi, ...)

function; that way we still have access to the Arena (or we implement the more costly helper calls on the pointer via AMReX_GpuUtility.H, as above).
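A sketch of that idea with a mock class; the class, its attributes, and the `to_numpy(mfi, ...)` signature are all hypothetical:

```python
class MockMultiFab:
    """Mock MultiFab: it knows which Arena its data lives in, so the
    accessibility check is a cheap flag read rather than a per-pointer
    cudaPointerGetAttributes call."""

    def __init__(self, arrays, arena_host_accessible):
        self._arrays = arrays  # mfi index -> data
        self.arena_host_accessible = arena_host_accessible

    def array(self, mfi):
        return self._arrays[mfi]

    def to_numpy(self, mfi, copy=False):
        # proposed MultiFab-level entry point: consult the Arena first
        if not copy and not self.arena_host_accessible:
            raise RuntimeError(
                "MultiFab data is not host-accessible; use copy=True, "
                ".to_cupy(), or a managed/pinned Arena."
            )
        arr = self.array(mfi)
        return list(arr) if copy else arr
```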
