This seems like a great app (which, BTW, NVIDIA really should have released itself, independently of the source view in Nsight Compute). However, it is limited to a single, specific C++ IDE.
Would it be possible for you to break this dependency?
Specifically, if you could write a valid CMakeLists.txt for CudaPAD, that would do the trick. (And CMake can generate SLN files, so it shouldn't really hamper MSVS users who want to build it.)
This seems like a great app (which, BTW, NVIDIA really should have released itself, independently of the source view in Nsight Compute). However, it is limited to a single operating system (and requires a specific C++ IDE, see #4).
Would it be possible for you to break this dependency?
I realize that is a rather tall order for C#, which is rather Windows-centric, so this might be a disguised request to port this to another language, but it is what I would need in order to use CudaPAD.
Your source seems to run well against the 9.1 nvcc toolkit. I see that you mentioned in your readme that you'd like, one day, to be able to run the code against a card to get timing results. That would be helpful, but I can imagine it would take a bit of work.
I was also thinking that it might be useful to display the clock cycles required for each operation in the PTX or SASS. Are you aware of any reference or source for this type of information?