
miniweather's People

Contributors

jack-morrison, jeffhammond, mark-petersen, mrnorman, normanmr


miniweather's Issues

Is there any performance information available for this program?

Hello, I'm porting YAKL to a new platform, and I want to use this project to measure the performance of my port. Due to limited hardware resources, we cannot get proper performance numbers from running this program on a cluster. Could you provide some performance information, such as the total runtime of this program on some of the supported platforms listed in the build scripts (e.g., summit or thatchroof), or on other platforms?
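
No reference timings are attached to this issue. As a rough starting point on whatever hardware is available, the total runtime can be measured by wrapping the region of interest in a wall-clock timer. The sketch below is a minimal, generic illustration, not code from this repo; run_time_loop is a hypothetical stand-in for the solver's time-stepping loop.

  // Minimal sketch: measuring total wall-clock time for a region of interest.
  // run_time_loop() is a hypothetical stand-in for the real time-stepping loop.
  #include <chrono>
  #include <cstdio>

  static double run_time_loop() {
    // Placeholder work so the sketch compiles and runs on its own.
    double x = 0;
    for (long i = 0; i < 200000000L; i++) x += 1.0e-9;
    return x;
  }

  int main() {
    auto t0 = std::chrono::steady_clock::now();
    double result = run_time_loop();
    auto t1 = std::chrono::steady_clock::now();
    std::printf("result %g, total time: %.3f s\n", result,
                std::chrono::duration<double>(t1 - t0).count());
    return 0;
  }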

2D MPI

Do you have an implementation of a 2D domain decomposition in MPI, i.e., with both X and Z decomposed across MPI ranks?
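
For reference, a minimal sketch of what a 2D (X and Z) decomposition could look like with MPI's Cartesian-topology routines is shown below. This is not code from the miniweather repo: the global grid sizes NX_GLOB and NZ_GLOB are hypothetical, and the choice of a periodic X boundary and non-periodic Z boundary is an assumption made for illustration.

  // Minimal sketch (not from miniweather): a 2D X-Z domain decomposition
  // built with MPI's Cartesian topology routines.
  #include <mpi.h>
  #include <cstdio>

  int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int nranks;
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    // Let MPI choose a balanced px x pz process grid.
    int dims[2] = {0, 0};
    MPI_Dims_create(nranks, 2, dims);

    // Assumption for illustration: periodic in X, non-periodic (walls) in Z.
    int periods[2] = {1, 0};
    MPI_Comm cart;
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &cart);

    int rank, coords[2];
    MPI_Comm_rank(cart, &rank);
    MPI_Cart_coords(cart, rank, 2, coords);

    // Neighbor ranks for halo exchanges in each direction.
    int xlo, xhi, zlo, zhi;
    MPI_Cart_shift(cart, 0, 1, &xlo, &xhi);
    MPI_Cart_shift(cart, 1, 1, &zlo, &zhi);

    // Hypothetical global grid; each rank owns one block of it.
    const int NX_GLOB = 400, NZ_GLOB = 200;
    int nx_local = NX_GLOB / dims[0];
    int nz_local = NZ_GLOB / dims[1];

    std::printf("rank %d at (%d,%d): %dx%d cells, X nbrs (%d,%d), Z nbrs (%d,%d)\n",
                rank, coords[0], coords[1], nx_local, nz_local, xlo, xhi, zlo, zhi);

    MPI_Comm_free(&cart);
    MPI_Finalize();
    return 0;
  }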

Question on a simple test running on GPUs

When running with YAKL arrays, how do I know that something is actually being computed on the GPU?

I wrote the simplest possible code in order to:

  1. initialize a YAKL array on the GPUs with parallel_for (link to code)
  2. deep_copy array from GPU to CPU (link to code)
  3. alter the CPU copy of the array (link to code)
  4. deep_copy array from CPU back to GPU (link to code)

I added print (std::cout) statements after each of these steps for both the CPU and GPU arrays, and I added the test to the cpp/CMakeLists.txt file so it is built with the make command.

After compiling, I can simply run ./simple_yakl_tests on a Summit login node, and the GPU array initializes correctly. That concerns me, because the parallel_for calls intended for the GPU must just be falling back to the CPU in this case. When I run on a compute node with

jsrun -n 1 -a 1 -c 1 -g 1 ./simple_yakl_tests

I get identical output. I assume this actually runs the parallel_for on the GPU, but I really have no way to tell, because the output is identical to the login-node test.

By the way, when I run on the compute node without a GPU:

jsrun -n 1 -a 1 -c 1 ./simple_yakl_tests

the parallel_for initializes the GPU array to zeros, so there is at least some difference.

Overall, I'm trying to develop a sense of what exactly is happening on the GPU, and I don't think I have the proper tools. Are there other functions or outputs that would help with this? Thanks.
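
One hedged suggestion, not something provided by this repo: when building against the CUDA backend, the CUDA runtime itself can report how many devices are visible and whether a given allocation is device-resident, and profiling the run with a tool such as Nsight Systems (nsys) will show whether any kernels were actually launched. The standalone sketch below uses a plain cudaMalloc rather than a YAKL array, so it only illustrates the runtime queries, not YAKL itself.

  // Standalone sketch (CUDA runtime only, not YAKL): checking that a GPU is
  // visible and that an allocation actually lives in device memory.
  #include <cuda_runtime.h>
  #include <cstdio>

  int main() {
    int ndev = 0;
    cudaGetDeviceCount(&ndev);
    std::printf("visible CUDA devices: %d\n", ndev);  // expect 0 on a login node

    float *d_ptr = nullptr;
    if (cudaMalloc(&d_ptr, 100 * sizeof(float)) != cudaSuccess) {
      std::printf("cudaMalloc failed -- no usable GPU in this job\n");
      return 1;
    }

    // Ask the runtime where the pointer lives (the attr.type field needs CUDA 10+).
    cudaPointerAttributes attr;
    cudaPointerGetAttributes(&attr, d_ptr);
    std::printf("allocation is device-resident: %s\n",
                attr.type == cudaMemoryTypeDevice ? "yes" : "no");

    cudaFree(d_ptr);
    return 0;
  }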

Minor typos

There are duplicated comments, mainly from copy-pasted code, where the //z-direction second comment is repeated: in the else branch the comments still read //z-direction second and //x-direction first, even though that branch sweeps Z first and X second.

  if (direction_switch) {
    //x-direction first
    semi_discrete_step( state , state     , state_tmp , dt / 3 , DIR_X , flux , tend );
    semi_discrete_step( state , state_tmp , state_tmp , dt / 2 , DIR_X , flux , tend );
    semi_discrete_step( state , state_tmp , state     , dt / 1 , DIR_X , flux , tend );
    //z-direction second
    semi_discrete_step( state , state     , state_tmp , dt / 3 , DIR_Z , flux , tend );
    semi_discrete_step( state , state_tmp , state_tmp , dt / 2 , DIR_Z , flux , tend );
    semi_discrete_step( state , state_tmp , state     , dt / 1 , DIR_Z , flux , tend );
  } else {
    //z-direction second
    semi_discrete_step( state , state     , state_tmp , dt / 3 , DIR_Z , flux , tend );
    semi_discrete_step( state , state_tmp , state_tmp , dt / 2 , DIR_Z , flux , tend );
    semi_discrete_step( state , state_tmp , state     , dt / 1 , DIR_Z , flux , tend );
    //x-direction first
    semi_discrete_step( state , state     , state_tmp , dt / 3 , DIR_X , flux , tend );
    semi_discrete_step( state , state_tmp , state_tmp , dt / 2 , DIR_X , flux , tend );
    semi_discrete_step( state , state_tmp , state     , dt / 1 , DIR_X , flux , tend );
  }
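
If the intent is simply that the comments track the sweep order (the else branch sweeps Z first and X second), the likely intended labels would be the following. This is a sketch of the presumed fix, not a confirmed patch from the maintainers.

  } else {
    //z-direction first
    semi_discrete_step( state , state     , state_tmp , dt / 3 , DIR_Z , flux , tend );
    semi_discrete_step( state , state_tmp , state_tmp , dt / 2 , DIR_Z , flux , tend );
    semi_discrete_step( state , state_tmp , state     , dt / 1 , DIR_Z , flux , tend );
    //x-direction second
    semi_discrete_step( state , state     , state_tmp , dt / 3 , DIR_X , flux , tend );
    semi_discrete_step( state , state_tmp , state_tmp , dt / 2 , DIR_X , flux , tend );
    semi_discrete_step( state , state_tmp , state     , dt / 1 , DIR_X , flux , tend );
  }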
