You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Is your feature request related to a problem? Please describe.
Historically, the Cython wrappers around device allocation routines use the implicit device resource returned by get_current_device_resource and don't generally offer an interface for the user to provide a memory resource to perform the allocation.
This is "OK", but often (e.g. see #1514) we don't pass the python-side MR into C++ allocation routines.
We therefore actually don't have any control over the MR being used to perform the allocation, and so can (with a little, but not a lot, of gymnastics) get in a situation where an allocation is performed using resource-A but we store resource-B which is not compatible.
This latter dichotomy exists because set/get_current_device_resource exist in both the C++ and Cython wrappers, but don't talk to the same underlying map. On C++ set_current_device_resource stores a raw pointer to a memory resource in a std::unordered_map, and get_current_device_resource returns this raw pointer.
Unfortunately, on the Python side, we can't use a raw pointer since we can't control its lifetime. So in Python set_current_device_resource stores the Python MR in a dict and then calls the C++ set_current_device_resource on the raw pointer that backs the Python MR. So far, all good. But Python-level get_current_device_resource only inspects the dict (it doesn't call the C++ get_current_device_resource). This is where the problems can begin: if a C++ code calls set_current_device_resource after a Python call to the matching API function, the Python and C++ sides of things will no longer agree on what the current device resource is.
Describe the solution you'd like
All Python functionality that involves allocation should optionally take a memory resource argument (similarly to the C++ interface). This argument should default to the Python-side current device resource and should then be passed on to C++, so that by the time Python reaches C++ there is no implicit memory resource being obtained.
Describe alternatives you've considered
Promote interfaces in RMM to take managed, rather than raw pointers. Then we can always arrange that C++ and Python are consistent since we can safely maintain the data in C++ and borrow it from Python. However, this goes against the current strategy using cuda::memory_resource.
Unfortunately, on the Python side, we can't use a raw pointer since we can't control its lifetime.
Naive question: why do we need Python to control the lifetime of the current device resource? Why not just let the C++ library control its lifetime. After all, an application might be using RMM from Python and using a Python module with a C++ core that uses RMM only from C++. These need to have the same resource controls. Setting the current device resource from one place should affect the current device resource everywhere.
Is your feature request related to a problem? Please describe.
Historically, the Cython wrappers around device allocation routines use the implicit device resource returned by
get_current_device_resource
and don't generally offer an interface for the user to provide a memory resource to perform the allocation.This is "OK", but often (e.g. see #1514) we don't pass the python-side MR into C++ allocation routines.
We therefore actually don't have any control over the MR being used to perform the allocation, and so can (with a little, but not a lot, of gymnastics) get in a situation where an allocation is performed using resource-A but we store resource-B which is not compatible.
This latter dichotomy exists because
set/get_current_device_resource
exist in both the C++ and Cython wrappers, but don't talk to the same underlying map. On C++set_current_device_resource
stores a raw pointer to a memory resource in astd::unordered_map
, andget_current_device_resource
returns this raw pointer.Unfortunately, on the Python side, we can't use a raw pointer since we can't control its lifetime. So in Python
set_current_device_resource
stores the Python MR in adict
and then calls the C++set_current_device_resource
on the raw pointer that backs the Python MR. So far, all good. But Python-levelget_current_device_resource
only inspects thedict
(it doesn't call the C++get_current_device_resource
). This is where the problems can begin: if a C++ code callsset_current_device_resource
after a Python call to the matching API function, the Python and C++ sides of things will no longer agree on what the current device resource is.Describe the solution you'd like
All Python functionality that involves allocation should optionally take a memory resource argument (similarly to the C++ interface). This argument should default to the Python-side current device resource and should then be passed on to C++, so that by the time Python reaches C++ there is no implicit memory resource being obtained.
Describe alternatives you've considered
Promote interfaces in RMM to take managed, rather than raw pointers. Then we can always arrange that C++ and Python are consistent since we can safely maintain the data in C++ and borrow it from Python. However, this goes against the current strategy using
cuda::memory_resource
.Additional context
See also:
rmm_cupy_allocator
with Cupy, the memory resource is constructed twice. #1506The text was updated successfully, but these errors were encountered: