hip-examples's Introduction

HIP-Examples

Deprecation Notice

Please note that AMD will deprecate and archive the hip-examples repository. Please visit rocm-examples, the new home for ROCm examples.

Examples for HIP.

This depot should be extracted into the root directory of an existing HIP depot.

We managed to push the following benchmarks with HIP upstreamed on github:

mixbench and GPU-Stream have been added as submodules of this repository. To fetch the submodule data:

    git submodule init
    git submodule update

hip-examples's People

Contributors

aaronenyeshi, agunashe, avatat, bensander, dgaliffiamd, emankov, gargrahul, kasaurov, kentrussell, mangupta, pramenku, rburraamd, satyanveshd, scchan, sunway513, whchung

hip-examples's Issues

[Question]-CMakeLists.txt file to understand compile and Link in HIP environment

Is there a reference link or document that explains how to write a CMakeLists.txt that compiles and links in an HCC environment?

Example 1:
Compile and link with hipcc:
hipcc -c example.cpp -o main.o [compile]
hipcc main.o -o Main [link]

Example 2:

Compile with hipcc and link with g++:
hipcc -c example.cpp -o main.o [compile]
g++ -L/opt/rocm/lib -lhip_hcc main.o -o Main [link]

In the examples above, compiling and linking are done on the command line. I need to express the same steps in a CMakeLists.txt.

Matrix Multiplication

The MatrixMultiplication program prints all zeros, even though both input matrices contain values.

mini-nbody does not compile

rocm 6.0.2

this repo at ff81239

user@debian:~/HIP-Examples/mini-nbody$ ./shmoo-cpu-nbody.sh 
In file included from nbody.c:4:
timer.h: In function ‘GetTimer’:
timer.h:49:3: warning: implicit declaration of function ‘timersub’ [-Wimplicit-function-declaration]
   49 |   timersub(&timerStop, &timerStart, &timerElapsed);
      |   ^~~~~~~~
/usr/bin/ld: /tmp/ccxoVs01.o: in function `GetTimer':
nbody.c:(.text+0x1f4): undefined reference to `timersub'
/usr/bin/ld: /tmp/ccxoVs01.o: in function `main':
nbody.c:(.text.startup+0x117): undefined reference to `timersub'
collect2: error: ld returned 1 exit status
nbody
./shmoo-cpu-nbody.sh: line 10: ./nbody: No such file or directory
./shmoo-cpu-nbody.sh: line 10: ./nbody: No such file or directory
./shmoo-cpu-nbody.sh: line 10: ./nbody: No such file or directory
./shmoo-cpu-nbody.sh: line 10: ./nbody: No such file or directory
./shmoo-cpu-nbody.sh: line 10: ./nbody: No such file or directory
./shmoo-cpu-nbody.sh: line 10: ./nbody: No such file or directory
./shmoo-cpu-nbody.sh: line 10: ./nbody: No such file or directory
./shmoo-cpu-nbody.sh: line 10: ./nbody: No such file or directory
./shmoo-cpu-nbody.sh: line 10: ./nbody: No such file or directory
./shmoo-cpu-nbody.sh: line 10: ./nbody: No such file or directory

Core dumped when running in performance counter mode

I get a core dump when I run a simple HIP program under the profiler.

I am running the vector add example from the HIP-Examples repo. After building the executable, I ran /opt/rocm/bin/rocprof --stats vectoradd_hip.exe to profile the program. The program does not terminate properly, and I see this error:

/opt/rocm/bin/rocprof: line 275: 17919 Aborted                 (core dumped) "vectoradd_hip.exe"

Here is my machine configuration:

"uname -a":

Linux lowfreq 4.15.0-106-generic #107-Ubuntu SMP Thu Jun 4 11:27:52 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

"lsb_release -a":

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.4 LTS
Release:        18.04
Codename:       bionic

"rocminfo":

ROCk module is loaded
Able to open /dev/kfd read-write
=====================    
HSA System Attributes    
=====================    
Runtime Version:         1.1
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE                              
System Endianness:       LITTLE                             

==========               
HSA Agents               
==========               
*******                  
Agent 1                  
*******                  
  Name:                    Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
  Uuid:                    CPU-XX                             
  Marketing Name:          Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
  Vendor Name:             CPU                                
  Feature:                 None specified                     
  Profile:                 FULL_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        0(0x0)                             
  Queue Min Size:          0(0x0)                             
  Queue Max Size:          0(0x0)                             
  Queue Type:              MULTI                              
  Node:                    0                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      32768(0x8000) KB                   
  Chip ID:                 0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   3100                               
  BDFID:                   0                                  
  Internal Node ID:        0                                  
  Compute Unit:            20                                 
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Features:                None
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    65878732(0x3ed3acc) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    65878732(0x3ed3acc) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
  ISA Info:                
    N/A                      
*******                  
Agent 2                  
*******                  
  Name:                    Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
  Uuid:                    CPU-XX                             
  Marketing Name:          Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
  Vendor Name:             CPU                                
  Feature:                 None specified                     
  Profile:                 FULL_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        0(0x0)                             
  Queue Min Size:          0(0x0)                             
  Queue Max Size:          0(0x0)                             
  Queue Type:              MULTI                              
  Node:                    1                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      32768(0x8000) KB                   
  Chip ID:                 0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   3100                               
  BDFID:                   0                                  
  Internal Node ID:        1                                  
  Compute Unit:            20                                 
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Features:                None
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    66033596(0x3ef97bc) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    66033596(0x3ef97bc) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
  ISA Info:                
    N/A                      
*******                  
Agent 3                  
*******                  
  Name:                    gfx803                             
  Uuid:                    GPU-XX                             
  Marketing Name:          Fiji [Radeon R9 FURY / NANO Series]
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          4096(0x1000)                       
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    2                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB                        
  Chip ID:                 29440(0x7300)                      
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   1000                               
  BDFID:                   33280                              
  Internal Node ID:        2                                  
  Compute Unit:            64                                 
  SIMDs per CU:            4                                  
  Shader Engines:          4                                  
  Shader Arrs. per Eng.:   1                                  
  WatchPts on Addr. Ranges:4                                  
  Features:                KERNEL_DISPATCH 
  Fast F16 Operation:      FALSE                              
  Wavefront Size:          64(0x40)                           
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        40(0x28)                           
  Max Work-item Per CU:    2560(0xa00)                        
  Grid Max Size:           4294967295(0xffffffff)             
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)             
    y                        4294967295(0xffffffff)             
    z                        4294967295(0xffffffff)             
  Max fbarriers/Workgrp:   32                                 
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    4194304(0x400000) KB               
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                   
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx803          
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                   
      Default Rounding Mode:   NEAR                               
      Default Rounding Mode:   NEAR                               
      Fast f16:                TRUE                               
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      FBarrier Max Size:       32                                 
*******                  
Agent 4                  
*******                  
  Name:                    gfx803                             
  Uuid:                    GPU-XX                             
  Marketing Name:          Fiji [Radeon R9 FURY / NANO Series]
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          4096(0x1000)                       
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    3                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB                        
  Chip ID:                 29440(0x7300)                      
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   1000                               
  BDFID:                   33536                              
  Internal Node ID:        3                                  
  Compute Unit:            64                                 
  SIMDs per CU:            4                                  
  Shader Engines:          4                                  
  Shader Arrs. per Eng.:   1                                  
  WatchPts on Addr. Ranges:4                                  
  Features:                KERNEL_DISPATCH 
  Fast F16 Operation:      FALSE                              
  Wavefront Size:          64(0x40)                           
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        40(0x28)                           
  Max Work-item Per CU:    2560(0xa00)                        
  Grid Max Size:           4294967295(0xffffffff)             
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)             
    y                        4294967295(0xffffffff)             
    z                        4294967295(0xffffffff)             
  Max fbarriers/Workgrp:   32                                 
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    4194304(0x400000) KB               
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                   
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx803          
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                   
      Default Rounding Mode:   NEAR                               
      Default Rounding Mode:   NEAR                               
      Fast f16:                TRUE                               
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      FBarrier Max Size:       32                                 
*** Done ***             

Compilation failure with nvcc backend

I'm getting these errors when I try to compile an example with the HIP nvcc backend. What is the problem?

$ cd add4
$ make
/opt/rocm/hip/bin/hipcc -std=c++11 -O3 -c hip-stream.cpp -o hip-stream.o
In file included from /opt/rocm/hip/include/hip/hip_runtime_api.h:368,
                 from /opt/rocm/hip/include/hip/nvcc_detail/hip_runtime.h:28,
                 from /opt/rocm/hip/include/hip/hip_runtime.h:58,
                 from hip-stream.cpp:1:
/opt/rocm/hip/include/hip/nvcc_detail/hip_runtime_api.h: In function ‘hipError_t hipPointerGetAttributes(hipPointerAttribute_t*, const void*)’:
/opt/rocm/hip/include/hip/nvcc_detail/hip_runtime_api.h:1355:21: error: ‘struct cudaPointerAttributes’ has no member named ‘memoryType’
 1355 |         switch (cPA.memoryType) {
      |                     ^~~~~~~~~~
hip-stream.cpp: In function ‘void copy_looper(const T*, T*, int)’:
hip-stream.cpp:74:19: error: ‘hipBlockIdx_x’ was not declared in this scope
   74 |     int offset = (hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x)*CLUMP_SIZE;
      |                   ^~~~~~~~~~~~~
hip-stream.cpp:74:35: error: ‘hipBlockDim_x’ was not declared in this scope
   74 |     int offset = (hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x)*CLUMP_SIZE;
      |                                   ^~~~~~~~~~~~~
hip-stream.cpp:74:51: error: ‘hipThreadIdx_x’ was not declared in this scope
   74 |     int offset = (hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x)*CLUMP_SIZE;
      |                                                   ^~~~~~~~~~~~~~
hip-stream.cpp:75:34: error: ‘hipGridDim_x’ was not declared in this scope
   75 |     int stride = hipBlockDim_x * hipGridDim_x * CLUMP_SIZE;
      |                                  ^~~~~~~~~~~~
hip-stream.cpp: In function ‘void mul_looper(T*, const T*, int)’:
hip-stream.cpp:86:18: error: ‘hipBlockIdx_x’ was not declared in this scope
   86 |     int offset = hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x;
      |                  ^~~~~~~~~~~~~
hip-stream.cpp:86:34: error: ‘hipBlockDim_x’ was not declared in this scope
   86 |     int offset = hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x;
      |                                  ^~~~~~~~~~~~~
hip-stream.cpp:86:50: error: ‘hipThreadIdx_x’ was not declared in this scope
   86 |     int offset = hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x;
      |                                                  ^~~~~~~~~~~~~~
hip-stream.cpp:87:34: error: ‘hipGridDim_x’ was not declared in this scope
   87 |     int stride = hipBlockDim_x * hipGridDim_x;
      |                                  ^~~~~~~~~~~~
hip-stream.cpp: In function ‘void add_looper(const T*, const T*, const T*, const T*, T*, int)’:
hip-stream.cpp:99:18: error: ‘hipBlockIdx_x’ was not declared in this scope
   99 |     int offset = hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x;
      |                  ^~~~~~~~~~~~~
hip-stream.cpp:99:34: error: ‘hipBlockDim_x’ was not declared in this scope
   99 |     int offset = hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x;
      |                                  ^~~~~~~~~~~~~
hip-stream.cpp:99:50: error: ‘hipThreadIdx_x’ was not declared in this scope
   99 |     int offset = hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x;
      |                                                  ^~~~~~~~~~~~~~
hip-stream.cpp:100:34: error: ‘hipGridDim_x’ was not declared in this scope
  100 |     int stride = hipBlockDim_x * hipGridDim_x;
      |                                  ^~~~~~~~~~~~
hip-stream.cpp: In function ‘void triad_looper(T*, const T*, const T*, int)’:
hip-stream.cpp:111:18: error: ‘hipBlockIdx_x’ was not declared in this scope
  111 |     int offset = hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x;
      |                  ^~~~~~~~~~~~~
hip-stream.cpp:111:34: error: ‘hipBlockDim_x’ was not declared in this scope
  111 |     int offset = hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x;
      |                                  ^~~~~~~~~~~~~
hip-stream.cpp:111:50: error: ‘hipThreadIdx_x’ was not declared in this scope
  111 |     int offset = hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x;
      |                                                  ^~~~~~~~~~~~~~
hip-stream.cpp:112:34: error: ‘hipGridDim_x’ was not declared in this scope
  112 |     int stride = hipBlockDim_x * hipGridDim_x;
      |                                  ^~~~~~~~~~~~
hip-stream.cpp: In function ‘void copy(const T*, T*)’:
hip-stream.cpp:127:19: error: ‘hipBlockDim_x’ was not declared in this scope
  127 |     const int i = hipBlockDim_x * hipBlockIdx_x + hipThreadIdx_x;
      |                   ^~~~~~~~~~~~~
hip-stream.cpp:127:35: error: ‘hipBlockIdx_x’ was not declared in this scope
  127 |     const int i = hipBlockDim_x * hipBlockIdx_x + hipThreadIdx_x;
      |                                   ^~~~~~~~~~~~~
hip-stream.cpp:127:51: error: ‘hipThreadIdx_x’ was not declared in this scope
  127 |     const int i = hipBlockDim_x * hipBlockIdx_x + hipThreadIdx_x;
      |                                                   ^~~~~~~~~~~~~~
hip-stream.cpp: In function ‘void mul(T*, const T*)’:
hip-stream.cpp:137:19: error: ‘hipBlockDim_x’ was not declared in this scope
  137 |     const int i = hipBlockDim_x * hipBlockIdx_x + hipThreadIdx_x;
      |                   ^~~~~~~~~~~~~
hip-stream.cpp:137:35: error: ‘hipBlockIdx_x’ was not declared in this scope
  137 |     const int i = hipBlockDim_x * hipBlockIdx_x + hipThreadIdx_x;
      |                                   ^~~~~~~~~~~~~
hip-stream.cpp:137:51: error: ‘hipThreadIdx_x’ was not declared in this scope
  137 |     const int i = hipBlockDim_x * hipBlockIdx_x + hipThreadIdx_x;
      |                                                   ^~~~~~~~~~~~~~
hip-stream.cpp: In function ‘void add(const T*, const T*, const T*, const T*, T*)’:
hip-stream.cpp:145:19: error: ‘hipBlockDim_x’ was not declared in this scope
  145 |     const int i = hipBlockDim_x * hipBlockIdx_x + hipThreadIdx_x;
      |                   ^~~~~~~~~~~~~
hip-stream.cpp:145:35: error: ‘hipBlockIdx_x’ was not declared in this scope
  145 |     const int i = hipBlockDim_x * hipBlockIdx_x + hipThreadIdx_x;
      |                                   ^~~~~~~~~~~~~
hip-stream.cpp:145:51: error: ‘hipThreadIdx_x’ was not declared in this scope
  145 |     const int i = hipBlockDim_x * hipBlockIdx_x + hipThreadIdx_x;
      |                                                   ^~~~~~~~~~~~~~
hip-stream.cpp: In function ‘void triad(T*, const T*, const T*)’:
hip-stream.cpp:154:19: error: ‘hipBlockDim_x’ was not declared in this scope
  154 |     const int i = hipBlockDim_x * hipBlockIdx_x + hipThreadIdx_x;
      |                   ^~~~~~~~~~~~~
hip-stream.cpp:154:35: error: ‘hipBlockIdx_x’ was not declared in this scope
  154 |     const int i = hipBlockDim_x * hipBlockIdx_x + hipThreadIdx_x;
      |                                   ^~~~~~~~~~~~~
hip-stream.cpp:154:51: error: ‘hipThreadIdx_x’ was not declared in this scope
  154 |     const int i = hipBlockDim_x * hipBlockIdx_x + hipThreadIdx_x;
      |                                                   ^~~~~~~~~~~~~~
In file included from /opt/rocm/hip/include/hip/hip_runtime.h:58,
                 from hip-stream.cpp:1:
hip-stream.cpp: In function ‘int main(int, char**)’:
hip-stream.cpp:340:17: error: expected primary-expression before ‘<’ token
  340 |                 hipLaunchKernelGGL((copy_looper<float,1>), dim3(gridSize), dim3(groupSize), 0, 0, (float*)d_a, (float*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:340:17: error: expected primary-expression before ‘>’ token
  340 |                 hipLaunchKernelGGL((copy_looper<float,1>), dim3(gridSize), dim3(groupSize), 0, 0, (float*)d_a, (float*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:342:17: error: expected primary-expression before ‘<’ token
  342 |                 hipLaunchKernelGGL((copy_looper<double,1>), dim3(gridSize), dim3(groupSize), 0, 0, (double*)d_a, (double*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:342:17: error: expected primary-expression before ‘>’ token
  342 |                 hipLaunchKernelGGL((copy_looper<double,1>), dim3(gridSize), dim3(groupSize), 0, 0, (double*)d_a, (double*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:345:17: error: expected primary-expression before ‘<’ token
  345 |                 hipLaunchKernelGGL(copy<float>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (float*)d_a, (float*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:345:17: error: expected primary-expression before ‘>’ token
  345 |                 hipLaunchKernelGGL(copy<float>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (float*)d_a, (float*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:347:17: error: expected primary-expression before ‘<’ token
  347 |                 hipLaunchKernelGGL(copy<double>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (double*)d_a, (double*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:347:17: error: expected primary-expression before ‘>’ token
  347 |                 hipLaunchKernelGGL(copy<double>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (double*)d_a, (double*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:359:17: error: expected primary-expression before ‘<’ token
  359 |                 hipLaunchKernelGGL(mul_looper<float>, dim3(gridSize), dim3(groupSize), 0, 0, (float*)d_b, (float*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:359:17: error: expected primary-expression before ‘>’ token
  359 |                 hipLaunchKernelGGL(mul_looper<float>, dim3(gridSize), dim3(groupSize), 0, 0, (float*)d_b, (float*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:361:17: error: expected primary-expression before ‘<’ token
  361 |                 hipLaunchKernelGGL(mul_looper<double>, dim3(gridSize), dim3(groupSize), 0, 0, (double*)d_b, (double*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:361:17: error: expected primary-expression before ‘>’ token
  361 |                 hipLaunchKernelGGL(mul_looper<double>, dim3(gridSize), dim3(groupSize), 0, 0, (double*)d_b, (double*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:364:17: error: expected primary-expression before ‘<’ token
  364 |                 hipLaunchKernelGGL(mul<float>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (float*)d_b, (float*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:364:17: error: expected primary-expression before ‘>’ token
  364 |                 hipLaunchKernelGGL(mul<float>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (float*)d_b, (float*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:366:17: error: expected primary-expression before ‘<’ token
  366 |                 hipLaunchKernelGGL(mul<double>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (double*)d_b, (double*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:366:17: error: expected primary-expression before ‘>’ token
  366 |                 hipLaunchKernelGGL(mul<double>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (double*)d_b, (double*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:378:17: error: expected primary-expression before ‘<’ token
  378 |                 hipLaunchKernelGGL(add_looper<float>, dim3(gridSize), dim3(groupSize), 0, 0, (float*)d_a, (float*)d_b,(float*)d_d, (float*)d_e,  (float*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:378:17: error: expected primary-expression before ‘>’ token
  378 |                 hipLaunchKernelGGL(add_looper<float>, dim3(gridSize), dim3(groupSize), 0, 0, (float*)d_a, (float*)d_b,(float*)d_d, (float*)d_e,  (float*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:380:17: error: expected primary-expression before ‘<’ token
  380 |                 hipLaunchKernelGGL(add_looper<double>, dim3(gridSize), dim3(groupSize), 0, 0, (double*)d_a, (double*)d_b, (double*)d_d, (double*)d_e,(double*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:380:17: error: expected primary-expression before ‘>’ token
  380 |                 hipLaunchKernelGGL(add_looper<double>, dim3(gridSize), dim3(groupSize), 0, 0, (double*)d_a, (double*)d_b, (double*)d_d, (double*)d_e,(double*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:383:17: error: expected primary-expression before ‘<’ token
  383 |                 hipLaunchKernelGGL(add<float>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (float*)d_a, (float*)d_b, (float*)d_d,(float*)d_e,(float*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:383:17: error: expected primary-expression before ‘>’ token
  383 |                 hipLaunchKernelGGL(add<float>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (float*)d_a, (float*)d_b, (float*)d_d,(float*)d_e,(float*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:385:17: error: expected primary-expression before ‘<’ token
  385 |                 hipLaunchKernelGGL(add<double>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (double*)d_a, (double*)d_b, (double*)d_d,(double*)d_e,(double*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:385:17: error: expected primary-expression before ‘>’ token
  385 |                 hipLaunchKernelGGL(add<double>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (double*)d_a, (double*)d_b, (double*)d_d,(double*)d_e,(double*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:397:17: error: expected primary-expression before ‘<’ token
  397 |                 hipLaunchKernelGGL(triad_looper<float>, dim3(gridSize), dim3(groupSize), 0, 0, (float*)d_a, (float*)d_b, (float*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:397:17: error: expected primary-expression before ‘>’ token
  397 |                 hipLaunchKernelGGL(triad_looper<float>, dim3(gridSize), dim3(groupSize), 0, 0, (float*)d_a, (float*)d_b, (float*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:399:17: error: expected primary-expression before ‘<’ token
  399 |                 hipLaunchKernelGGL(triad_looper<double>, dim3(gridSize), dim3(groupSize), 0, 0, (double*)d_a, (double*)d_b, (double*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:399:17: error: expected primary-expression before ‘>’ token
  399 |                 hipLaunchKernelGGL(triad_looper<double>, dim3(gridSize), dim3(groupSize), 0, 0, (double*)d_a, (double*)d_b, (double*)d_c, ARRAY_SIZE);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:402:17: error: expected primary-expression before ‘<’ token
  402 |                 hipLaunchKernelGGL(triad<float>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (float*)d_a, (float*)d_b, (float*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:402:17: error: expected primary-expression before ‘>’ token
  402 |                 hipLaunchKernelGGL(triad<float>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (float*)d_a, (float*)d_b, (float*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:404:17: error: expected primary-expression before ‘<’ token
  404 |                 hipLaunchKernelGGL(triad<double>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (double*)d_a, (double*)d_b, (double*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
hip-stream.cpp:404:17: error: expected primary-expression before ‘>’ token
  404 |                 hipLaunchKernelGGL(triad<double>, dim3(ARRAY_SIZE/groupSize), dim3(groupSize), 0, 0, (double*)d_a, (double*)d_b, (double*)d_c);
      |                 ^~~~~~~~~~~~~~~~~~
make: *** [Makefile:14: hip-stream.o] Error 1

My hipconfig output is:

HIP version  : 3.7.

== hipconfig
HIP_PATH     : /opt/rocm/hip
ROCM_PATH    : /opt/rocm
HIP_COMPILER : clang
HIP_PLATFORM : nvcc
HIP_RUNTIME  : CUDA
CPP_CONFIG   :  -D__HIP_PLATFORM_NVCC__=  -I/opt/rocm/hip/include -I/opt/cuda/include

== nvcc
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0

=== Environment Variables
PATH=[censored]
CUDA_CACHE_PATH=[censored]
CUDA_PATH=/opt/cuda
LD_LIBRARY_PATH=[censored]

== Linux Kernel
Hostname     : [censored]
Linux [censored] 5.7.10-arch1-1 #1 SMP PREEMPT Wed, 22 Jul 2020 19:57:42 +0000 x86_64 GNU/Linux

Unknown argument '-hc' found while linking to hip::device

I was trying to build and run the openmp-helloworld example with HIP 3.9.1 and hit an error.

The compiler I am using is not in the default installation path /opt/rocm, so I commented out the compiler-setting statements in the original CMakeLists.txt. It seems that the hip::device target CMake finds still comes from the default ROCm installation. This target carries the flag -hc, which breaks the build.

$ cmake -S . -B build -DCMAKE_C_COMPILER=hipcc -DCMAKE_CXX_COMPILER=hipcc && cmake --build build -v

[ 50%] Building CXX object CMakeFiles/test_openmp_helloworld.dir/openmp_helloworld.cpp.o                                                                    
/public/software/compiler/rocm/rocm-3.9.1/bin/hipcc   -isystem /opt/rocm/hip/include -isystem /opt/rocm/hsa/include  -O3 -DNDEBUG   -fopenmp=libomp -hc -fPIC -o CMakeFiles/test_openmp_helloworld.dir/openmp_helloworld.cpp.o -c /**/openmp-helloworld/openmp_helloworld.cpp          
clang-12: error: unknown argument: '-hc'

Segfault with hipcc

We tried to build and run one of the samples with the hipcc compiler on an AWS instance.
The sample fails with a segmentation fault.

Any suggestions would be helpful.

Steps followed

1. Clone the HIP examples
git clone https://github.com/ROCm-Developer-Tools/HIP-Examples.git

2. Export Variables

export ROCM_PATH=/opt/rocm
export PATH=/opt/rocm/bin:$PATH 
export LD_LIBRARY_PATH=/opt/rocm-5.0.0/hsa/lib:$LD_LIBRARY_PATH
export HIPCC_COMPILE_FLAGS_APPEND="-save-temps"
export HIPCC_VERBOSE=7

3. Sample test Execution

ubuntu@ip-172-31-16-166:~/HIP/examples$ cd HIP-Examples/vectorAdd/
ubuntu@ip-172-31-16-166:~/HIP/examples/HIP-Examples/vectorAdd$
ubuntu@ip-172-31-16-166:~/HIP/hip-testsuite/HIP-Examples/vectorAdd$ make
./vectoradd_hip.exe
 System minor 0
 System major 20
 agent prop name
hip Device prop succeeded
Makefile:30: recipe for target 'test' failed
make: *** [test] Segmentation fault (core dumped)

4. Compile the sample directly from the command line

ubuntu@ip-172-31-16-166:~/HIP/hip-testsuite/HIP-Examples/vectorAdd$ hipcc  vectoradd_hip.cpp
HIP_PATH=/opt/rocm/hip
HIP_PLATFORM=amd
HIP_COMPILER=clang
HIP_RUNTIME=rocclr
ROCM_PATH=/opt/rocm
HIP_ROCCLR_HOME=/opt/rocm/hip
HIP_CLANG_PATH=/opt/rocm/llvm/bin
HIP_CLANG_INCLUDE_PATH=/opt/rocm-5.0.0/llvm/lib/clang/14.0.0/include
HIP_INCLUDE_PATH=/opt/rocm/hip/include
HIP_LIB_PATH=/opt/rocm/hip/lib
DEVICE_LIB_PATH=/opt/rocm/amdgcn/bitcode
HIP_CLANG_TARGET=1
hipcc-args: vectoradd_hip.cpp
hipcc-cmd: /opt/rocm/llvm/bin/clang++  -std=c++11 -isystem "/opt/rocm-5.0.0/llvm/lib/clang/14.0.0/include/.." -isystem /opt/rocm/hsa/include -isystem "/opt/rocm/hip/include" --offload-arch=gfx1011 -O3 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -fhip-new-launch-api -save-temps  -L"/opt/rocm/hip/lib" -O3 -lgcc_s -lgcc -lpthread -lm -lrt  -Wl,--enable-new-dtags -Wl,-rpath=/opt/rocm/hip/lib:/opt/rocm/lib -lamdhip64  -x hip vectoradd_hip.cpp -L/opt/rocm/llvm/bin/../lib/clang/14.0.0/lib/linux -lclang_rt.builtins-x86_64

5. Run the executable

ubuntu@ip-172-31-16-166:~/HIP/hip-testsuite/HIP-Examples/vectorAdd$ ls
Makefile                                         vectoradd_hip-hip-amdgcn-amd-amdhsa-gfx1011.out                 vectoradd_hip-host-x86_64-unknown-linux-gnu.s
README                                           vectoradd_hip-hip-amdgcn-amd-amdhsa-gfx1011.out.resolution.txt  vectoradd_hip.cpp
a.out                                            vectoradd_hip-hip-amdgcn-amd-amdhsa-gfx1011.s                   vectoradd_hip.cpp-hip-amdgcn-amd-amdhsa.hipfb
vectoradd_hip-hip-amdgcn-amd-amdhsa-gfx1011.bc   vectoradd_hip-host-x86_64-unknown-linux-gnu.bc                  vectoradd_hip.exe
vectoradd_hip-hip-amdgcn-amd-amdhsa-gfx1011.cui  vectoradd_hip-host-x86_64-unknown-linux-gnu.cui                 vectoradd_hip.o
vectoradd_hip-hip-amdgcn-amd-amdhsa-gfx1011.o    vectoradd_hip-host-x86_64-unknown-linux-gnu.o
                                          

ubuntu@ip-172-31-16-166:~/HIP/hip-testsuite/HIP-Examples/vectorAdd$ ./a.out
 System minor 0
 System major 20
 agent prop name  ▒▒
hip Device prop succeeded
Segmentation fault (core dumped)

Environment Details

ubuntu@ip-172-31-16-166:~$ /opt/rocm-5.0.0/bin/rocminfo | less

=====================
HSA System Attributes

Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE

==========
HSA Agents


Agent 1


Name: AMD EPYC 7R32
Uuid: CPU-XX
Marketing Name: AMD EPYC 7R32
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 0
BDFID: 0
Internal Node ID: 0
Compute Unit: 16
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0


Agent 2


Name: gfx1011
Uuid: GPU-XX
Marketing Name: AMD Radeon Pro V520 MxGPU
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 4096(0x1000)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
Chip ID: 29538(0x7362)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 1150
BDFID: 240
Internal Node ID: 1
Compute Unit: 36
SIMDs per CU: 4
Shader Engines: 4
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 80(0x50)
Max Work-item Per CU: 2560(0xa00)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)

OS Info
ubuntu@ip-172-31-16-166:~/Examples/HIP-Examples/vectorAdd$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.6 LTS
Release:        18.04
Codename:       bionic


How to run mixbench?

user@debian:~/HIP-Examples/mixbench$ make
g++ -c -O2 -I/usr/local/cuda/include -Wall main-cuda.cpp -o main-cuda.o
main-cuda.cpp:9:10: fatal error: cuda.h: No such file or directory
    9 | #include <cuda.h>
      |          ^~~~~~~~
compilation terminated.
make: *** [Makefile:72: main-cuda.o] Error 1

I am using ROCm 6.0.2.

Nothing in the README or in test_all.sh explains how to build it.

Presumably the include paths, library paths, and compiler are wrong, so some extra arguments need to be passed.

Rodinia backprop and b+tree build failures

ROCm 3.10.0

AMD Radeon R9 Fury X (FIJI, GFX8)

Linux 5.9.9, using the upstream amdgpu driver

user@debian:~/HIP-Examples/rodinia_3.0/hip$ HIP_PATH=/opt/rocm-3.10.0/hip make VERBOSE=1
--TESTING: backprop
make[1]: Entering directory '/home/user/HIP-Examples/rodinia_3.0/hip/backprop'
O2 backprop.c -c
make[1]: O2: No such file or directory
make[1]: [Makefile:20: backprop.o] Error 127 (ignored)
O2 facetrain.c -c
make[1]: O2: No such file or directory
make[1]: [Makefile:17: facetrain.o] Error 127 (ignored)
O2 imagenet.c -c
make[1]: O2: No such file or directory
make[1]: [Makefile:26: imagenet.o] Error 127 (ignored)
/opt/rocm-3.10.0/hip/bin/hipcc -I../../common -O2 -c backprop_cuda.cu
backprop_cuda.cu:115:72: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings]
        SimplePerfSerializer* serializeTime = new SimplePerfSerializer("./backprop" );
                                                                       ^
1 warning generated when compiling for gfx803.
backprop_cuda.cu:115:72: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings]
        SimplePerfSerializer* serializeTime = new SimplePerfSerializer("./backprop" );
                                                                       ^
1 warning generated when compiling for host.
/opt/rocm-3.10.0/hip/bin/hipcc backprop.o facetrain.o imagenet.o backprop_cuda.o -o backprop -lm
clang-12: error: no such file or directory: 'backprop.o'
clang-12: error: no such file or directory: 'facetrain.o'
clang-12: error: no such file or directory: 'imagenet.o'
make[1]: *** [Makefile:11: backprop] Error 1
make[1]: Leaving directory '/home/user/HIP-Examples/rodinia_3.0/hip/backprop'
BUILD FAILURE!!
...
...
--TESTING: b+tree
make[1]: Entering directory '/home/user/HIP-Examples/rodinia_3.0/hip/b+tree'
/opt/rocm-3.10.0/hip/bin/hipcc  -I../../common -O2	 ./main.cu \
		-c \
		-o ./main.o
/opt/rocm-3.10.0/hip/bin/hipcc -I../../common -O2  ./kernel/kernel_gpu_cuda_wrapper.cu \
			-c \
			-o ./kernel/kernel_gpu_cuda_wrapper.o \
			-arch sm_20
clang-12: warning: argument unused during compilation: '-arch sm_20' [-Wunused-command-line-argument]
./kernel/kernel_gpu_cuda_wrapper.cu:115:72: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings]
        SimplePerfSerializer* serializeTime = new SimplePerfSerializer("./b+tree_1" );
                                                                       ^
1 warning generated when compiling for gfx803.
./kernel/kernel_gpu_cuda_wrapper.cu:115:72: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings]
        SimplePerfSerializer* serializeTime = new SimplePerfSerializer("./b+tree_1" );
                                                                       ^
1 warning generated when compiling for host.
/opt/rocm-3.10.0/hip/bin/hipcc -I../../common -O2  ./kernel/kernel_gpu_cuda_wrapper_2.cu \
			-c \
			-o ./kernel/kernel_gpu_cuda_wrapper_2.o \
			-arch sm_20
clang-12: warning: argument unused during compilation: '-arch sm_20' [-Wunused-command-line-argument]
./kernel/kernel_gpu_cuda_wrapper_2.cu:114:72: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings]
        SimplePerfSerializer* serializeTime = new SimplePerfSerializer("./b+tree_2");
                                                                       ^
1 warning generated when compiling for gfx803.
./kernel/kernel_gpu_cuda_wrapper_2.cu:114:72: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings]
        SimplePerfSerializer* serializeTime = new SimplePerfSerializer("./b+tree_2");
                                                                       ^
1 warning generated when compiling for host.
gcc	./util/timer/timer.c \
		-c \
		-o ./util/timer/timer.o
./util/timer/timer.c: In function ‘get_time’:
./util/timer/timer.c:26:2: warning: implicit declaration of function ‘gettimeofday’ [-Wimplicit-function-declaration]
   26 |  gettimeofday(&tv, NULL);
      |  ^~~~~~~~~~~~
gcc	./util/num/num.c \
		-c \
		-o ./util/num/num.o
/opt/rocm-3.10.0/hip/bin/hipcc -I../../common -O2	./util/cuda/cuda.cu \
			-c \
			-o ./util/cuda/cuda.o \
			-arch sm_20
clang-12: warning: argument unused during compilation: '-arch sm_20' [-Wunused-command-line-argument]
/opt/rocm-3.10.0/hip/bin/hipcc	./main.o \
		./kernel/kernel_gpu_cuda_wrapper.o \
		./kernel/kernel_gpu_cuda_wrapper_2.o \
		./util/timer/timer.o \
		./util/num/num.o \
		./util/cuda/cuda.o \
		-lm \
                       -o b+tree
executing: ../../test/b+tree/run0.cmd...    Opened file ./b+tree.perf for performance log
Opened file ./b+tree_2.perf for performance log
Closed performance log 2
Opened file ./b+tree_1.perf for performance log
Closed performance log 2
Closed performance log 2
  FAILED!
...


All the other benchmarks do build. (However, `hybridsort` hangs my GPU with a gfx ring timeout, which causes the GPU to be reset and Xorg to be killed in the process.)

GPU-STREAM git repo is out of date.

The GPU-STREAM code uses a deprecated form of hipLaunchKernel:
error: no matching function for call to 'hipLaunchKernel'
hipLaunchKernel(HIP_KERNEL_NAME(triad_kernel), dim3(array_size/TBSIZE), dim3(TBSIZE), 0, 0, d_a, d_b, d_c);

Top level make option for linux distros

To be included in a Linux distro such as Fedora, a project is expected to have top-level build infrastructure, such as CMake, that builds against the distro's version of HIP and installs everything.

Can CMake infrastructure, consistent with the other ROCm projects, be added to this project?

GPU-STREAM submodule inconsistency

error: Server does not allow request for unadvertised object b3cf9992bb7784fc7219e2edf289ffeb991a234d
Fetched in submodule path 'GPU-STREAM', but it did not contain b3cf9992bb7784fc7219e2edf289ffeb991a234d. Direct fetching of that commit failed.

rtm8 example question

In the RTM8 example, is it right that dim3(gridSize) should be something like dim3(nx, ny, nz)?
In addition, is the function indexTo1D implemented incorrectly?

Thanks

int gridSize = 256 * 256;
int groupSize = 256;
for (int t = 0; t < nt; t++) {
    // Launch the HIP kernel
    hipLaunchKernelGGL(rtm8, dim3(gridSize), dim3(groupSize), 0, 0, (float*)vsq_d, (float*)current_s_d,
                       (float*)next_s_d, (float*)current_r_d, (float*)next_r_d, (float*)image_d, (float*)a_d, ArraySize);
}

inline int indexTo1D(int x, int y, int z){
    return x + y * ny + z * ny * nz; // should be x + y * nx + z * ny * nx
}
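
To make the indexing question concrete, here is a minimal host-side sketch, assuming x is the fastest-varying axis. The GridDims struct and the coversGridExactlyOnce helper are hypothetical names introduced only for this illustration; they are not part of the RTM8 example. The sketch compares the proposed formula with the quoted one by checking whether each (x, y, z) maps to its own slot.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the grid extents; the quoted snippet reads
// nx, ny and nz from the surrounding scope instead.
struct GridDims {
    std::size_t nx, ny, nz;
};

// Flattening with x as the fastest-varying axis: a step in y skips one row
// of nx elements, a step in z skips one plane of nx * ny elements.
std::size_t indexTo1D(const GridDims& d, std::size_t x, std::size_t y, std::size_t z) {
    assert(x < d.nx && y < d.ny && z < d.nz);
    return x + y * d.nx + z * d.nx * d.ny;
}

// The formula as quoted in the question, kept only for comparison.
std::size_t indexTo1DQuoted(const GridDims& d, std::size_t x, std::size_t y, std::size_t z) {
    return x + y * d.ny + z * d.ny * d.nz;
}

// Returns true when every (x, y, z) lands on its own slot in [0, nx*ny*nz).
bool coversGridExactlyOnce(const GridDims& d, bool useQuoted) {
    const std::size_t total = d.nx * d.ny * d.nz;
    std::vector<bool> seen(total, false);
    for (std::size_t z = 0; z < d.nz; ++z)
        for (std::size_t y = 0; y < d.ny; ++y)
            for (std::size_t x = 0; x < d.nx; ++x) {
                const std::size_t i = useQuoted ? indexTo1DQuoted(d, x, y, z)
                                                : indexTo1D(d, x, y, z);
                if (i >= total || seen[i]) return false;  // out of range or collision
                seen[i] = true;
            }
    return true;
}
```

On a 4 x 3 x 2 grid the quoted formula maps both (3, 0, 0) and (0, 1, 0) to index 3, so the check fails, while the x + y * nx + z * nx * ny form passes. When nx == ny == nz the two formulas give the same index, so the difference only shows up on non-cubic grids.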

VectorAdd example doesn't work on build type release using Cmake

When I use the code from the VectorAdd example inside a CMake project,
execution fails when the build type is set to Release.
This is because the NDEBUG flag is set in that configuration.

The problem can be worked around by not calling the HIP functions directly inside HIP_ASSERT, but by storing their return values first and then passing those to HIP_ASSERT.
I'm not sure exactly why this is the case; I would have expected the code inside an assert to at least be executed when NDEBUG is set.
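
A likely explanation, assuming HIP_ASSERT wraps the standard assert macro: when NDEBUG is defined, assert(expr) expands to ((void)0), so expr is never evaluated and any call placed inside it simply does not happen. The sketch below shows this with plain C++ and no HIP dependency. FakeStatus, fakeAlloc and CHECK_INLINE are hypothetical stand-ins for hipError_t, a HIP API call and HIP_ASSERT, and NDEBUG is defined by hand to mimic the -DNDEBUG that a CMake Release build typically passes.

```cpp
// Mimic a Release build: NDEBUG must be defined before <cassert> is included.
#define NDEBUG
#include <cassert>

// Hypothetical stand-ins for the HIP status type and an API call.
enum FakeStatus { kFakeSuccess = 0, kFakeError = 1 };

static int g_calls = 0;  // counts how often the fake API call actually ran

FakeStatus fakeAlloc() {
    ++g_calls;
    return kFakeSuccess;
}

// Assumed shape of the example's macro: the call sits inside assert().
#define CHECK_INLINE(x) (assert((x) == kFakeSuccess))

// Pattern 1: the call is the assert argument. With NDEBUG the preprocessor
// drops the whole expression, so fakeAlloc() never runs.
int callsAfterInlineCheck() {
    g_calls = 0;
    CHECK_INLINE(fakeAlloc());
    return g_calls;
}

// Pattern 2: call first, then check the stored status. The call always
// runs; only the check disappears under NDEBUG.
int callsAfterStoredCheck() {
    g_calls = 0;
    FakeStatus status = fakeAlloc();
    CHECK_INLINE(status);
    (void)status;  // silence the unused-variable warning when assert is compiled out
    return g_calls;
}

// Re-enable assert for any code that follows; <cassert> redefines the macro
// according to NDEBUG every time it is included.
#undef NDEBUG
#include <cassert>
```

Under these assumptions callsAfterInlineCheck() reports zero calls and callsAfterStoredCheck() reports one, which matches the observed workaround. A more robust option is an error-checking macro that always evaluates its argument and reports failures itself instead of relying on assert.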

missing '#include <cmath>'

Hi folks

I've got an Ubuntu 20.04.3 machine with rocm-5.1.1 installed and an MI25 card, and I'm attempting to build the example code in this repository. The test_all.sh script runs, but on most of the compilations I get something like this:

joe@hermes:~/build/HIP-Examples/cuda-stream$ make
/opt/rocm/hip/bin/hipcc -std=c++11 -O3 -o stream stream.cpp 
In file included from <built-in>:1:
/opt/rocm-5.1.1/llvm/lib/clang/14.0.0/include/__clang_hip_runtime_wrapper.h:50:10: fatal error: 'cmath' file not found
#include <cmath>
         ^~~~~~~
1 error generated when compiling for gfx900.
make: *** [Makefile:11: stream] Error 1

I'm seeing this on other systems as well with the same repo. I can get past that error by changing the compile command to the following, but it then fails on another header:

joe@hermes:~/build/HIP-Examples/cuda-stream$ /opt/rocm-5.1.1/hip/bin/hipcc -std=c++11 -O3 -D__HIPCC_RTC__ -o stream stream.cpp
stream.cpp:25:10: fatal error: 'string' file not found
#include <string>
         ^~~~~~~~
1 error generated when compiling for gfx900.

Is there a missing define in the test script/environment? hipconfig --check reports all good.

test_all.sh doesn't work by default

$ /opt/rocm-3.10.0/bin/hipconfig 
HIP version  : 3.10.20465-f9876b8d

== hipconfig
HIP_PATH     : /opt/rocm-3.10.0/hip
ROCM_PATH    : /opt/rocm-3.10.0
HIP_COMPILER : clang
HIP_PLATFORM : hcc
HIP_RUNTIME  : ROCclr
CPP_CONFIG   :  -D__HIP_PLATFORM_HCC__=  -I/opt/rocm-3.10.0/hip/include -I/opt/rocm-3.10.0/llvm/bin/../lib/clang/12.0.0 -I/opt/rocm-3.10.0/hsa/include -D__HIP_ROCclr__

== hip-clang
HSA_PATH         : /opt/rocm-3.10.0/hsa
HIP_CLANG_PATH   : /opt/rocm-3.10.0/llvm/bin
clang version 12.0.0 (/src/external/llvm-project/clang 60f39e2924d51c1e8606f2135f95e9047fb1da5d)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm-3.10.0/llvm/bin
LLVM (http://llvm.org/):
  LLVM version 12.0.0git
  Optimized build.
  Default target: x86_64-unknown-linux-gnu
  Host CPU: znver1

  Registered Targets:
    amdgcn - AMD GCN GPUs
    r600   - AMD GPUs HD2XXX-HD6XXX
    x86    - 32-bit X86: Pentium-Pro and above
    x86-64 - 64-bit X86: EM64T and AMD64
hip-clang-cxxflags : -D__HIP_ROCclr__ -std=c++11 -isystem /opt/rocm-3.10.0/llvm/lib/clang/12.0.0/include/.. -isystem /opt/rocm-3.10.0/hsa/include -D__HIP_ROCclr__ -isystem /opt/rocm-3.10.0/hip/include -O3
hip-clang-ldflags  :  -L/opt/rocm-3.10.0/hip/lib -O3 -lgcc_s -lgcc -lpthread -lm

=== Environment Variables
PATH=/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games

== Linux Kernel
Hostname     : debian
Linux debian 5.9.0-3-amd64 #1 SMP Debian 5.9.9-1 (2020-11-19) x86_64 GNU/Linux
No LSB modules are available.
Distributor ID:	Debian
Description:	Debian GNU/Linux bullseye/sid
Release:	unstable
Codename:	sid
$

But the script and all makefiles fail:

user@debian:~/HIP-Examples$ ./test_all.sh 

==== vectorAdd ====
rm -f ./vectoradd_hip.exe
rm -f vectoradd_hip.o
rm -f ../../../src/*.o
../../../bin/hipcc -g   -c -o vectoradd_hip.o vectoradd_hip.cpp
make: ../../../bin/hipcc: No such file or directory
make: *** [<builtin>: vectoradd_hip.o] Error 127

==== gpu-burn ====
make: /opt/rocm/hip/bin/hipconfig: No such file or directory
rm -rf build
make: /opt/rocm/hip/bin/hipconfig: No such file or directory
mkdir -p build
/opt/rocm/hip/bin/hipcc -I/opt/rocm/hip/include -I/opt/rocm/hcc/include -O3 -c -o build/AmdGpuMonitor.o AmdGpuMonitor.cpp  
make: /opt/rocm/hip/bin/hipcc: No such file or directory
make: *** [Makefile:28: build/AmdGpuMonitor.o] Error 127
./test_all.sh: line 19: ./build/gpuburn-hip: No such file or directory

==== strided-access ====
rm -f strided-access *.o
Makefile:13: *** "Cannot find ../../../bin/hipcc, please install HIP toolkit".  Stop.
./test_all.sh: line 28: ./strided-access: No such file or directory

==== rtm8 ====
Please install rocm package
Using HIP_PATH=
hipcc -std=c++11 -O3 -o rtm8_hip rtm8.cpp
./build_hip.sh: line 23: /bin/hipcc: No such file or directory
./test_all.sh: line 37: ./rtm8_hip: No such file or directory

==== reduction ====
rm -f reduction *.o
Makefile:13: *** "Cannot find ../../../bin/hipcc, please install HIP toolkit".  Stop.
./reduction 1024*1024*4
./run.sh: line 7: ./reduction: No such file or directory
./reduction 8388608
./run.sh: line 7: ./reduction: No such file or directory
./reduction 16777216
./run.sh: line 7: ./reduction: No such file or directory
./reduction 33554432
./run.sh: line 7: ./reduction: No such file or directory
./reduction 67108864
./run.sh: line 7: ./reduction: No such file or directory
./reduction 134217728
./run.sh: line 7: ./reduction: No such file or directory
./reduction 268435456
./run.sh: line 7: ./reduction: No such file or directory
./reduction 536870912
./run.sh: line 7: ./reduction: No such file or directory

==== mini-nbody ====
Please install rocm package
hipcc -I../ -DSHMOO nbody-orig.cpp -o nbody-orig
./HIP-nbody-orig.sh: line 23: /bin/hipcc: No such file or directory
./nbody-orig 1024
./HIP-nbody-orig.sh: line 35: ./nbody-orig: No such file or directory
./nbody-orig 2048
./HIP-nbody-orig.sh: line 35: ./nbody-orig: No such file or directory
./nbody-orig 4096
./HIP-nbody-orig.sh: line 35: ./nbody-orig: No such file or directory
./nbody-orig 8192
./HIP-nbody-orig.sh: line 35: ./nbody-orig: No such file or directory
./nbody-orig 16384
./HIP-nbody-orig.sh: line 35: ./nbody-orig: No such file or directory
./nbody-orig 32768
./HIP-nbody-orig.sh: line 35: ./nbody-orig: No such file or directory
./nbody-orig 65536
./HIP-nbody-orig.sh: line 35: ./nbody-orig: No such file or directory
./nbody-orig 131072
./HIP-nbody-orig.sh: line 35: ./nbody-orig: No such file or directory
./nbody-orig 262144
./HIP-nbody-orig.sh: line 35: ./nbody-orig: No such file or directory
./nbody-orig 524288
./HIP-nbody-orig.sh: line 35: ./nbody-orig: No such file or directory
Please install rocm package
hipcc -I../ -DSHMOO nbody-soa.cpp -o nbody-soa
./HIP-nbody-soa.sh: line 26: /bin/hipcc: No such file or directory
./nbody-soa 1024
./HIP-nbody-soa.sh: line 37: ./nbody-soa: No such file or directory
./nbody-soa 2048
./HIP-nbody-soa.sh: line 37: ./nbody-soa: No such file or directory
./nbody-soa 4096
./HIP-nbody-soa.sh: line 37: ./nbody-soa: No such file or directory
./nbody-soa 8192
./HIP-nbody-soa.sh: line 37: ./nbody-soa: No such file or directory
./nbody-soa 16384
./HIP-nbody-soa.sh: line 37: ./nbody-soa: No such file or directory
./nbody-soa 32768
./HIP-nbody-soa.sh: line 37: ./nbody-soa: No such file or directory
./nbody-soa 65536
./HIP-nbody-soa.sh: line 37: ./nbody-soa: No such file or directory
./nbody-soa 131072
./HIP-nbody-soa.sh: line 37: ./nbody-soa: No such file or directory
Please install rocm package
hipcc -I../ -DSHMOO nbody-block.cpp -o nbody-block
./HIP-nbody-block.sh: line 26: /bin/hipcc: No such file or directory
./nbody-block 1024
./HIP-nbody-block.sh: line 37: ./nbody-block: No such file or directory
./nbody-block 2048
./HIP-nbody-block.sh: line 37: ./nbody-block: No such file or directory
./nbody-block 4096
./HIP-nbody-block.sh: line 37: ./nbody-block: No such file or directory
./nbody-block 8192
./HIP-nbody-block.sh: line 37: ./nbody-block: No such file or directory
./nbody-block 16384
./HIP-nbody-block.sh: line 37: ./nbody-block: No such file or directory
./nbody-block 32768
./HIP-nbody-block.sh: line 37: ./nbody-block: No such file or directory
./nbody-block 65536
./HIP-nbody-block.sh: line 37: ./nbody-block: No such file or directory
./nbody-block 131072
./HIP-nbody-block.sh: line 37: ./nbody-block: No such file or directory

==== add4 ====
rm -f   gpu-stream-hip *.o
../../../bin/hipcc -std=c++11 -O3 -c hip-stream.cpp -o hip-stream.o
make: ../../../bin/hipcc: No such file or directory
make: *** [Makefile:14: hip-stream.o] Error 127
./gpu-stream-hip
./runhip.sh: line 2: ./gpu-stream-hip: No such file or directory
./gpu-stream-hip --groups 256 --groupSize 256
./runhip.sh: line 4: ./gpu-stream-hip: No such file or directory
./gpu-stream-hip --float
./runhip.sh: line 6: ./gpu-stream-hip: No such file or directory
./gpu-stream-hip --float --groups 256 --groupSize 256
./runhip.sh: line 8: ./gpu-stream-hip: No such file or directory

==== cuda-stream ====
rm -f stream *.o
Makefile:13: *** "Cannot find ../../../bin/hipcc, please install HIP toolkit".  Stop.
./test_all.sh: line 72: ./stream: No such file or directory

==== Rodinia ====
--CLEAN: backprop
--CLEAN: bfs
--CLEAN: b+tree
--CLEAN: cfd
--CLEAN: dwt2d
--CLEAN: gaussian
--CLEAN: heartwall
--CLEAN: hotspot
--CLEAN: hybridsort
--CLEAN: kmeans
--CLEAN: lavaMD
--CLEAN: lud
--CLEAN: myocyte
--CLEAN: nn
--CLEAN: nw
--CLEAN: pathfinder
--CLEAN: srad
--CLEAN: streamcluster
--TESTING: backprop
BUILD FAILURE!!
--TESTING: bfs
BUILD FAILURE!!
--TESTING: b+tree
BUILD FAILURE!!
--TESTING: cfd
BUILD FAILURE!!
--TESTING: dwt2d
BUILD FAILURE!!
--TESTING: gaussian
BUILD FAILURE!!
--TESTING: heartwall
BUILD FAILURE!!
--TESTING: hotspot
BUILD FAILURE!!
--TESTING: hybridsort
BUILD FAILURE!!
--TESTING: kmeans
BUILD FAILURE!!
--TESTING: lavaMD
BUILD FAILURE!!
--TESTING: lud
BUILD FAILURE!!
--TESTING: myocyte
BUILD FAILURE!!
--TESTING: nn
BUILD FAILURE!!
--TESTING: nw
BUILD FAILURE!!
--TESTING: pathfinder
BUILD FAILURE!!
--TESTING: srad
BUILD FAILURE!!
--TESTING: streamcluster
BUILD FAILURE!!

==== OpenMP Hello World ====
./test_all.sh: line 87: cd: openmp-helloworld: No such file or directory
CMake Error: The source directory "/home/user/HIP-Examples/rodinia_3.0" does not appear to contain CMakeLists.txt.
Specify --help for usage, or press the help button on the CMake GUI.
make: *** No targets specified and no makefile found.  Stop.
./test_all.sh: line 92: ./test_openmp_helloworld: No such file or directory
user@debian:~/HIP-Examples$ 

It would be nice if the script and makefiles were smarter.

Running $ HIP_PATH=/opt/rocm-3.10.0/hip ./test_all.sh appears to work, but I think this should at least be documented in the README. Alternatively, the script should do better detection when HIP_PATH is not set.

One option is simply to call hipconfig directly and rely on the user having PATH set up. I think that is more reliable.

gpu-burn/common.h missing `<memory>` header

Hello,

It looks like std::unique_ptr is used without including the <memory> header. This code worked fine with AMD HIP, but I think it still technically needs the header included. (I found this when trying to compile with HIPCL, which does need the <memory> header included.)

For example, adding something like:
https://github.com/colleeneb/HIP-Examples/blob/f00739a2dd6fe4dca41c1555105d4c332af236a5/gpu-burn/common.h#L9

I can submit a PR if you want, otherwise it's a one-line change so hopefully it's easy to do.

Thanks!
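
As a minimal sketch of the point above: code that names std::unique_ptr should include <memory> itself rather than rely on another header pulling it in, since which headers include which varies between toolchains. The Buffer type and makeBuffer function are hypothetical and are not taken from gpu-burn.

```cpp
#include <cassert>
#include <memory>  // declares std::unique_ptr; included explicitly on purpose

// Hypothetical payload type, for illustration only.
struct Buffer {
    explicit Buffer(int n) : size(n) {}
    int size;
};

// Builds a uniquely owned Buffer. Written without std::make_unique so it
// still compiles with -std=c++11, which the repository's makefiles use.
std::unique_ptr<Buffer> makeBuffer(int n) {
    assert(n >= 0);
    return std::unique_ptr<Buffer>(new Buffer(n));
}
```

With the explicit include in place, the same source should build whether or not the runtime headers happen to include <memory> transitively.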
