
[FEA] Need a Python interface for accessing the C++ cuda_async_view_memory_resource #1611

Open
lilohuang opened this issue Jul 15, 2024 · 2 comments

lilohuang (Contributor) commented Jul 15, 2024

Hi @leofang and all,

Currently, the librmm library provides a CudaAsyncMemoryResource Python interface (https://docs.rapids.ai/api/rmm/stable/python_api/#rmm.mr.CudaAsyncMemoryResource), which lets users access the C++ cuda_async_memory_resource class. However, there is no equivalent Python interface for the C++ cuda_async_view_memory_resource class. Today's usage looks like this:

import rmm
import cupy
from rmm.allocators.cupy import rmm_cupy_allocator

# Creates a brand-new pool via cudaMemPoolCreate (initial pool size and
# release threshold both set to 1024 bytes).
rmm.mr.set_current_device_resource(rmm.mr.CudaAsyncMemoryResource(1024, 1024))
# Route CuPy allocations through RMM.
cupy.cuda.set_allocator(rmm_cupy_allocator)

CudaAsyncMemoryResource always creates a new memory pool through cudaMemPoolCreate (https://github.com/rapidsai/rmm/blob/branch-24.08/include/rmm/mr/device/cuda_async_memory_resource.hpp#L107), which is problematic when integrating librmm and cuDF into an existing GPU-accelerated application.

An existing GPU-accelerated application may already rely on the default CUDA memory pool for stream-ordered allocation and deallocation through cudaMallocAsync and cudaFreeAsync. If librmm always creates a separate pool, the memory held by that pool can only be used by librmm itself (or by CuPy and cuDF through it), wasting GPU memory that the rest of the application cannot reclaim.
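
For context, here is a minimal sketch (using cuda-python's runtime bindings; this is background, not part of the proposal) of the stream-ordered allocation pattern such an application already uses against the default pool:

from cuda import cudart

def check(err):
    # cuda-python returns the status code as the first tuple element.
    assert err == cudart.cudaError_t.cudaSuccess

err, stream = cudart.cudaStreamCreate()
check(err)

# Allocate 1 MiB from the device's default memory pool, ordered on `stream`.
err, ptr = cudart.cudaMallocAsync(1 << 20, stream)
check(err)

# ... launch kernels that use `ptr` on `stream` ...

err, = cudart.cudaFreeAsync(ptr, stream)
check(err)
err, = cudart.cudaStreamSynchronize(stream)
check(err)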

We hope that librmm can provide a CudaAsyncViewMemoryResource Python interface, similar to the one shown below, to access the C++ cuda_async_view_memory_resource class. This would allow us to pass the default memory pool handle (obtained from cudaDeviceGetDefaultMemPool) to librmm:

import rmm
import cupy
from cuda import cudart
from rmm.allocators.cupy import rmm_cupy_allocator

# Obtain the default memory pool handle for device 0 via cuda-python.
err, pool_handle = cudart.cudaDeviceGetDefaultMemPool(0)
assert err == cudart.cudaError_t.cudaSuccess

# Proposed API: view an existing pool instead of creating a new one.
rmm.mr.set_current_device_resource(rmm.mr.CudaAsyncViewMemoryResource(pool_handle))
cupy.cuda.set_allocator(rmm_cupy_allocator)

Alternatively, librmm could introduce rmm.mr.CudaAsyncDefaultMemoryResource(), which would automatically obtain the pool handle from cudaDeviceGetDefaultMemPool:

import rmm
import cupy
from rmm.allocators.cupy import rmm_cupy_allocator

# Proposed convenience API: a view of the device's default memory pool.
rmm.mr.set_current_device_resource(rmm.mr.CudaAsyncDefaultMemoryResource())
cupy.cuda.set_allocator(rmm_cupy_allocator)
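
For what it's worth, a minimal sketch of how such a convenience wrapper could be composed from the proposed view resource (both class names here are the hypothetical APIs requested above, not existing librmm APIs):

from cuda import cudart
import rmm

def make_default_pool_resource(device: int = 0):
    # Hypothetical helper: fetch the device's default pool and wrap it in
    # the proposed (not yet existing) CudaAsyncViewMemoryResource.
    err, pool = cudart.cudaDeviceGetDefaultMemPool(device)
    assert err == cudart.cudaError_t.cudaSuccess
    return rmm.mr.CudaAsyncViewMemoryResource(pool)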

Thanks,
Lilo

lilohuang added the "? - Needs Triage" and "feature request" labels Jul 15, 2024
wence- (Contributor) commented Jul 17, 2024

Thanks, this should be quite doable. We need to think about what type pool_handle should have in Python; probably a cuda-python cudart.cudaMemPool_t?

lilohuang (Contributor, Author) commented
@wence- The cuda-python cudart.cudaMemPool_t appears to be a viable option for me. Thanks.
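
For illustration, a minimal sketch (assuming cuda-python is installed) of obtaining a handle of that type for device 0:

from cuda import cudart

# The handle returned here is a cudart.cudaMemPool_t instance, which is
# what the proposed CudaAsyncViewMemoryResource constructor would accept.
err, pool_handle = cudart.cudaDeviceGetDefaultMemPool(0)
assert err == cudart.cudaError_t.cudaSuccess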

Matt711 self-assigned this Sep 17, 2024