cuda-clocking

Making it easy to add and analyze clockings in CUDA

example

First run pip install tabulate; then run ./run.sh and enjoy the pretty readout!

usage

The only header you need to include is clocker.cuh.

cuda-clocking currently consists of five kernel macros and one host struct. First, the macros:

  1. Add TIMINGDETAIL_ARGS() to the parameters of the kernel you want to profile.
  2. Run INITTHREADTIMER(); at the start of the kernel you want to profile. If you need more than 64 breakpoints, pass the number you need, for example INITTHREADTIMER(200);
  3. Run CLOCKRESET(); whenever you want to restart the running timer.
  4. Run CLOCKPOINT(ID, LABEL); whenever you want to record elapsed time and restart the timer. You can put it in a loop, and it will count both the number of calls and the total elapsed time. ID must be unique, at least 0, and less than the maximum number of breakpoints (64 by default). The LABEL field is completely ignored by the compiler but is used by the analysis script. A valid example would be CLOCKPOINT(3, "global memory write");
  5. Run FINISHTHREADTIMER(); at the end of the kernel to write results back to global memory.
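
To make the CLOCKRESET and CLOCKPOINT behavior described in the list concrete, here is a minimal host-side C++ model of a single thread's timer. It is only a sketch of the semantics as described above, not the code in clocker.cuh: the names ThreadTimerModel, reset, point, and demo are hypothetical, and the plain integer tick is an assumed stand-in for a hardware cycle counter.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical host-side model of one thread's timer state.
// This is NOT the implementation in clocker.cuh, only an illustration.
template <std::size_t MaxPoints = 64>
struct ThreadTimerModel {
    std::array<std::uint64_t, MaxPoints> elapsed{};  // total ticks per breakpoint ID
    std::array<std::uint64_t, MaxPoints> calls{};    // number of hits per breakpoint ID
    std::uint64_t last = 0;                          // tick at the last reset or point

    // Analogue of CLOCKRESET(): restart the running timer, record nothing.
    void reset(std::uint64_t now) { last = now; }

    // Analogue of CLOCKPOINT(ID, LABEL): add the ticks since the last
    // reset/point to this ID, count the call, then restart the timer.
    void point(std::size_t id, std::uint64_t now) {
        assert(id < MaxPoints);  // ID must be below the breakpoint limit
        elapsed[id] += now - last;
        calls[id] += 1;
        last = now;
    }
};

// Walks through a small scenario with made-up tick values.
inline ThreadTimerModel<> demo() {
    ThreadTimerModel<> t;
    std::uint64_t tick = 100;  // assumed stand-in for a cycle counter
    t.reset(tick);
    for (int i = 0; i < 3; ++i) {
        tick += 10;            // pretend each loop iteration costs 10 ticks
        t.point(3, tick);      // same ID in a loop: calls and time accumulate
    }
    tick += 50;                // work we do not want to attribute to any ID
    t.reset(tick);             // discard it by restarting the timer
    tick += 7;
    t.point(5, tick);
    return t;
}
```

Calling demo() leaves ID 3 with three calls and 30 accumulated ticks, and ID 5 with one call and 7 ticks. The 50 ticks before the reset are attributed to no breakpoint, which is the reason to use a reset before a region you want to time in isolation.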

Initialize the host struct TimingData with the grid and block dimensions of the kernel, plus the requested number of breakpoints if you need more than 64 (otherwise that parameter can be omitted). When launching the kernel, pass timingdata_struct_name.data as the argument corresponding to TIMINGDETAIL_ARGS(). Finally, call timingdata_struct_name.write("path_to_cuda_file_containing_kernel"); to write an output profile.

You should be able to compile and run your code normally. The timing code uses a small number of registers, which is unlikely to interfere with most kernels.

Finally, run python analysis.py to see the results!

Contributors

benjaminfspector
