
nways_accelerated_programming's People

Contributors

aswinkumar1999, bdudleback, bharatk-parallel, mozhgan-kch, programmah


nways_accelerated_programming's Issues

Issue: EuroCC2 and DiRAC Bootcamps Issues

  • Regarding the RDF calculation: every pair of atoms has to be scanned, but as currently implemented it:

      1. overcounts pairs, since r(ij) = r(ji)
      2. counts the self-distance, which is 0. Is this done on purpose? I would write the loops as do i = 1, N-1; do j = i+1, N
  • Regarding the challenge, the OpenACC/C solution has:

    #pragma acc parallel loop collapse(2) default(present)
    for (i = 1; i <= m; i++)
    {
        for (j = 1; j <= n; j++)
        {
            tmp = newarr[i * (m + 2) + j] - oldarr[i * (m + 2) + j];
            dsq += tmp * tmp;
        }
    }

    Shouldn't dsq be declared as reduction(+:dsq)?

  • Challenge lab: the instructions for running the code are not clear. The block for running the serial code only works before you start editing the code, which causes confusion when trying to get the serial run timing.

  • Text in the notebooks should be updated concerning qdrep (nsys-rep). Text and execution cells should be updated concerning the -ta flag for OpenACC compilation.

  • The HPC SDK is old and should be updated to the latest version (NVIDIA HPC SDK Version 24.3)

  • Downloading the profiling report and viewing it in a locally installed Nsight Systems GUI disconnects participants from the notebooks. It is possible to install and run Nsight Systems within the Jupyter Notebook, as shown here: link

  • In the ISO notebook, section "Compile for TESLA GPU (ISO C++)", the cell only compiles the code but doesn't run it. Whilst the lab does say "Make sure to validate the output by running the executable", I think it would be easier if running it was part of the command in the cell, as in the earlier cells in the notebook.

  • One of the solutions (OpenACC, C++), collapse_rdf, has a "#pragma acc" on the inner loop which shouldn't be there. It won't compile with this in it.

  • The OpenACC notebook mentions avoiding multiple loops in a parallel region without context - would be nice to have a "why" associated with this.
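The loop structure suggested in the first item above (visit each unordered pair once and skip i = j) can be sketched in plain Python. This is a minimal illustration, not the bootcamp's RDF code: the function names and toy coordinates are hypothetical, and periodic boundaries and normalisation are left out.

```python
import math


def pair_histogram_full(coords, nbins, r_max):
    """Bin distances over all (i, j): includes i == j and both orderings."""
    hist = [0] * nbins
    dr = r_max / nbins
    for a in coords:
        for b in coords:
            r = math.dist(a, b)
            if r < r_max:
                hist[int(r / dr)] += 1
    return hist


def pair_histogram_unique(coords, nbins, r_max):
    """Bin distances over unordered pairs only (i < j), as the issue suggests."""
    hist = [0] * nbins
    dr = r_max / nbins
    n = len(coords)
    for i in range(n - 1):
        for j in range(i + 1, n):
            r = math.dist(coords[i], coords[j])
            if r < r_max:
                hist[int(r / dr)] += 1
    return hist


# Hypothetical toy configuration: four atoms in 3D.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
full = pair_histogram_full(coords, nbins=8, r_max=4.0)
unique = pair_histogram_unique(coords, nbins=8, r_max=4.0)
```

In this sketch the full loop puts N self-distances into bin 0 and doubles every other bin relative to the unique-pair loop, so a normalisation written for one loop form would not carry over to the other unchanged.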

Issue: Typo in Python Notebook

CuPy: Exercise 4: Is the expected output correct? It does not match the output of my solutions to Exercises 3 and 4.

Issue: Typo in Python Notebook

CuPy:
Example 6, Step 3 has a typo. It should be 'set reduction expression a + b' or 'set reduction expression for a and b'. The ampersand (&) is an operator in Python, so it is confusing to use it here to denote 'and'.
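As a quick illustration of why the ampersand is ambiguous here: in Python, `&` is the bitwise AND operator, so `a & b` is a valid expression that gives a different result from `a + b`. The values below are arbitrary and are not taken from the notebook.

```python
a, b = 6, 3

sum_ab = a + b  # arithmetic sum of a and b
and_ab = a & b  # bitwise AND: 0b110 & 0b011 == 0b010
```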

Feature Request: Modifications to Python Notebooks

The following is a list of changes to the Python Nways materials suggested by Robert Searles and Jonathan Dursi.

JIT kernels
• Can we move this before CUDA kernels?
• Maybe add Numba Vectorize as an introduction? The flow Vectorize -> JIT -> CuPy CUDA makes more sense than CuPy CUDA -> JIT
• In fact, is the order of CuPy then Numba the right way to go? Can we flip those sections?

Numba notebook:

Exercise 1
• Again, exercise is too easy; students will just copy and paste. Could we make them change it to float, and multiply? Or some slightly deeper change?

Thread re-use - this comes out of nowhere

Matrix multiply:
• Same idea, could we do a naïve matrix transpose instead?

Numba vectorize/ufuncs
• This seems out of place. It doesn't make sense to me to have this come before Numba CUDA kernels, where it interrupts the flow between Numba CUDA kernels and atomics

Atomic
• It would be nice if the atomic example for a reduction built on an earlier example, say calculating average matrix element after the multiplication or something

Feature Request - Extension to Nways content in CFD domain

Create a replica of the current Nways to GPU programming content with a CFD example based on the miniweather example. The Nways content is available on GitHub (https://github.com/openhackathons-org/nways_accelerated_programming). The miniweather example is available on GitHub (https://github.com/openhackathons-org/gpubootcamp/tree/master/hpc/miniprofiler). To complement the existing content, contribute to one of the following:

  • Nways – OpenACC, OpenMP offloading, CUDA, ISO languages in C/Fortran (Code + Notebook)
  • Nways – using CuPy and Numba in python (Code + Notebook)

Feature Request - Extension to Nways Python content

To extend the current Nways content, create a version using Python cuNumeric, Legate (https://github.com/nv-legate/legate.core), and OpenAI Triton (https://openai.com/research/triton). This will extend the current Nways Bootcamp with Python, which uses CuPy and Numba (available at https://github.com/openhackathons-org/nways_accelerated_programming/tree/main/_basic/python). Use the existing RDF code as the starting point. The application must be profiled and assessed at each step, as in the Numba and CuPy versions. You are welcome to choose one of these.

Issue: Typo in Python Notebook

CuPy:
Example 1 has a typo. It should be cuda.Device(0). Since we only have one MIG instance, using Device(1) will throw an error.
