Comments (4)
I think there should be no default parameters, to force the Python-side caller to be fully explicit about stream and memory management.
My rationale is that since RMM objects are (deliberately) not smart pointers, there is no way to safely take ownership of an rmm::device_buffer from Python. Specifically, if I am passed a device_buffer whose memory resource is of a provenance I do not control, I cannot guarantee that I keep the memory resource alive for the lifetime of the buffer.
Indeed, in RMM, we don't even try! We just do:
    buf.c_obj = move(ptr)
    buf.mr = get_current_device_resource()
and fingers crossed, ptr.mr is the same as get_current_device_resource().
There's no way to work around this, so really the only safe way is to document in RMM that if taking ownership of a device_buffer from Python, one must only do so when the buffer has been allocated via a memory resource that the Python process controls. In libcudf, we can always pass a memory resource in to any allocating routine, so if we do that from the Python side with a memory resource whose lifetime we control, we can match the required contract and ensure that we do things safely.
Right now, everything works through happenstance.
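To make the lifetime contract concrete, here is a minimal pure-Python sketch. It does not use RMM at all: `FakeResource`, `OwnedBuffer`, and `allocate` are hypothetical stand-ins invented for illustration, and the only point is that a buffer which holds a strong reference to the resource it was allocated from keeps that resource alive.

```python
import gc
import weakref


class FakeResource:
    """Hypothetical stand-in for a memory resource; not an RMM class."""

    def __init__(self, name):
        self.name = name


class OwnedBuffer:
    """Hypothetical stand-in for a Python-side buffer wrapper.

    It stores a strong reference to the resource it was allocated from,
    so the resource cannot be collected while the buffer is alive.
    """

    def __init__(self, nbytes, mr):
        self.nbytes = nbytes
        self.mr = mr  # strong reference ties the resource lifetime to the buffer


def allocate(nbytes, mr):
    """Hypothetical allocating routine: the resource must be passed explicitly."""
    return OwnedBuffer(nbytes, mr)


mr = FakeResource("pool")
mr_ref = weakref.ref(mr)  # lets us observe whether the resource still exists
buf = allocate(1024, mr)

del mr  # the caller drops its own reference
gc.collect()
resource_alive_with_buffer = mr_ref() is not None

del buf  # once the buffer goes away, nothing holds the resource
gc.collect()
resource_alive_after_buffer = mr_ref() is not None
```

In this toy model the resource survives exactly as long as the buffer does, which is the guarantee we cannot give when the buffer arrives with a resource we never saw.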
from cudf.
I've opened rapidsai/rmm#1492 to clarify the requirements in docs on the RMM side.
Right, the mr discussion is in #14229.
Playing devil's advocate: while the pitfalls of default mrs are clear, that also makes the overall API very cumbersome when you have a libcudf API with 7 parameters (including stream/mr) of which 5 are defaulted because there are sensible choices. This would force users to also provide the other 3 in addition to stream/mr.
If we restrict the "no defaults" choice to only APIs then it would still allow us to make some improvements like those suggested in #15130, but if we disallowed defaults everywhere it could have additional impacts on how much syntactic sugar we can add to the pylibcudf API in a performant manner.
I think I am advocating only that the wrapping of the libcudf API provides no defaults. Alternatively, I would be happy if the cpdef functions took keyword arguments with defaults for stream and mr (so that one can provide, say, the mr but not the stream if one wants).
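As a rough illustration of that second option, here is a plain-Python sketch (not Cython, and not the actual pylibcudf signature). The function `gather`, the placeholders `DEFAULT_STREAM` and `DEFAULT_MR`, and the dictionary return value are all hypothetical; the sketch only shows keyword-only `stream` and `mr` arguments that can be supplied independently.

```python
# Hypothetical placeholders standing in for a default stream and memory resource.
DEFAULT_STREAM = "default-stream"
DEFAULT_MR = "current-device-resource"


def gather(table, gather_map, *, stream=None, mr=None):
    """Hypothetical wrapper: stream and mr are keyword-only and optional."""
    stream = DEFAULT_STREAM if stream is None else stream
    mr = DEFAULT_MR if mr is None else mr
    rows = [table[i] for i in gather_map]
    return {"rows": rows, "stream": stream, "mr": mr}


# Provide the mr but not the stream.
result = gather([10, 20, 30], [2, 0], mr="my-pool")

# Passing stream/mr positionally is rejected, so call sites stay explicit.
try:
    gather([10, 20, 30], [2, 0], "some-stream")
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Making the two arguments keyword-only means a caller can never pass a stream or mr by accident in the wrong position, while still being able to omit either one.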