Comments (5)

davidwendt commented on August 25, 2024

I'd like us to consider an alternate libcudf implementation that is more work but may be better in terms of control and maintenance going forward. I believe we could build a set of utilities that accept pointers or a variety of container types and perform the prefetch, then insert calls to those utilities before each kernel launch. This gives the algorithm author the best control over when and what is prefetched, with no surprises or side effects.

I'd like to keep logic like this out of the containers (column_view and device_uvector). I feel these introduce hidden side effects that would be difficult to avoid, similar to the lazy-null-count logic that was removed several releases ago. I know this is more work, but I think having the logic inline with the kernel launches will be easier to maintain and control. We can easily decide which algorithms need prefetching (and when, how, and which parts) and iteratively work on specific chunking solutions in the future without affecting all the other APIs.
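
As a rough illustration of the utility-based approach described above, here is a minimal C++ sketch. All names in it (`prefetch_backend`, `prefetch`, `sum_with_prefetch`) are hypothetical and are not libcudf APIs. The actual device prefetch is replaced by a pluggable callback so the sketch compiles and runs without a GPU; in a CUDA build the callback would presumably wrap a call such as `cudaMemPrefetchAsync` on the caller's stream.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical backend hook (an assumption for this sketch). A CUDA build
// would wrap a call such as cudaMemPrefetchAsync here; a plain callback keeps
// the example runnable on a machine without a GPU.
using prefetch_backend = std::function<void(void const* ptr, std::size_t bytes)>;

// Prefetch a raw pointer range of `count` elements.
// Null or empty ranges are skipped so callers need no special cases.
template <typename T>
void prefetch(T const* ptr, std::size_t count, prefetch_backend const& backend)
{
  assert(static_cast<bool>(backend));  // a backend must be supplied
  if (ptr == nullptr || count == 0) { return; }
  backend(ptr, count * sizeof(T));
}

// Prefetch any container exposing data() and size(). This stands in for
// types such as device_uvector; std::vector is used below for illustration.
template <typename Container>
auto prefetch(Container const& c, prefetch_backend const& backend)
  -> decltype(c.data(), c.size(), void())
{
  prefetch(c.data(), c.size(), backend);
}

// Stand-in for an algorithm. The author decides, inline and visibly, which
// inputs to prefetch immediately before the (simulated) kernel launch.
inline long long sum_with_prefetch(std::vector<int> const& input,
                                   prefetch_backend const& backend)
{
  prefetch(input, backend);  // explicit call at the launch site
  long long total = 0;       // a real implementation would launch a kernel here
  for (int v : input) { total += v; }
  return total;
}
```

The point of the sketch is only that the prefetch call sits next to the launch it serves, so nothing is prefetched unless the algorithm author asked for it.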

from cudf.

vyasr commented on August 25, 2024

I concur with your assessment long term, but as detailed in the issue I don't think it is feasible on the timeline we are seeking. Inserting changes before every kernel launch, even fairly trivial changes, seems like a task that will take at least one full release since the initial work will require achieving consensus on what those changes should be.

Is there something I wrote in the issue that you disagree with? I tried to address pretty much this exact concern in the issue since I share it and anticipated that others would raise it at this point.

davidwendt commented on August 25, 2024

> Is there something I wrote in the issue that you disagree with? I tried to address pretty much this exact concern in the issue since I share it and anticipated that others would raise it at this point.

I only disagree with modifying column_view and subclassing device_uvector, even in the short term. The first makes me uneasy for the codebase because of its global nature. It likely will not hit all the desired code paths and may cause unnecessary prefetching in other cases (causing more workarounds, etc.). The subclassed device_uvector requires a wide change to the codebase on a scale similar to what I was proposing, so it does not save us much work.

I was hoping that we could add prefetch to a few APIs quickly using a targeted approach with a handful of utilities in the short term, and then roll out the rest in the long term.

vyasr commented on August 25, 2024

> I was hoping that we could add prefetch to a few APIs quickly using a targeted approach with a handful of utilities in the short term, and then roll out the rest in the long term.

The problem I see with that approach is that while we might be able to see good results on a particular set of benchmarks that way, we will not be able to enable a managed memory resource as default without substantially slowing down a wide range of APIs (anything that doesn't have prefetching enabled). We should at minimum test running the cudf microbenchmarks with a managed memory resource. I suspect that the results will not support using a managed memory resource by default in cudf.pandas without the more blanket approach for prefetching, unless we choose to wait for the longer term solution where we roll out your proposed changes to more APIs.

vyasr commented on August 25, 2024

Copying from Slack:

We came to the following compromise during the discussion:

  • We will merge the column_view/mutable_column_view changes from #16020 to allow prefetching to occur on the widest set of APIs possible in the short term.
  • We will not merge the device_uvector changes because that requires touching many places. Instead, we will find every place where such a change would be needed and insert manual prefetch calls there, as in #16265. That is the long-term solution we prefer anyway, and it touches the same number of places as changing device_uvector would. My hope is that in the short term these will all be prefetches on device_uvectors or device_buffers, the places where we know the column_view change above has no effect.
  • We will include the prefetch allocator.
  • We will keep the configuration options in place.
  • Over the course of the next couple of months, we will run libcudf benchmarks and cudf Python microbenchmarks using managed memory and identify hot spots that need manual prefetching added. As we do this, we will turn off the column_view prefetching to make sure the manual prefetches cover all of the same needs. Once we are satisfied, we will remove prefetching from column_view.
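
To make the role of the configuration options in this plan concrete, here is a minimal C++ sketch of a keyed on/off switch for prefetching. It is an assumption about how such options could look, not the actual libcudf interface: `prefetch_config`, `maybe_prefetch`, and the key strings are hypothetical, and the real prefetch call is left as a comment so the sketch runs without a GPU.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical registry of named prefetch switches. Keys that were never
// set are treated as disabled.
class prefetch_config {
 public:
  bool get(std::string const& key) const
  {
    auto const it = flags_.find(key);
    return it != flags_.end() && it->second;
  }

  void set(std::string const& key, bool value) { flags_[key] = value; }

 private:
  std::map<std::string, bool> flags_;
};

// Returns true if a prefetch would be issued for this key and range.
// A CUDA build would issue the actual prefetch where the comment indicates.
inline bool maybe_prefetch(prefetch_config const& cfg,
                           std::string const& key,
                           void const* ptr,
                           std::size_t bytes)
{
  if (!cfg.get(key) || ptr == nullptr || bytes == 0) { return false; }
  // ... issue the device prefetch for [ptr, ptr + bytes) here ...
  return true;
}
```

With a switch like this, the blanket column_view prefetch could be disabled under its own key while benchmarks run, which would expose the places that still depend on it and need a manual prefetch.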

I'm going to work on updating #16020 today to remove the undesirable changes, then David and I will aim to get his changes merged in tomorrow.
