Giter VIP home page Giter VIP logo

pocl's Introduction

Portable Computing Language (PoCL) for Ventus GPGPU

See README-pocl.md for original README.

See https://github.com/THU-DSP-LAB/llvm-project for detailed build guide with Ventus llvm based OpenCL Compiler and icd loader.

Original PoCL build instruction doesn't work for Ventus GPGPU.

TODOs

TODOs are divided into 3 parts, first part contains jobs required to make sure OpenCL tests can be tested with pocl+spike, second part contains jobs required to enable real Ventus GPGPU work flow, third part contains generic TODOs.

TODOs (Part 1)

  • Correctly report Ventus GPGPU virtual device after pocl is statically/dynamally linked with spike(Currently ventus device is hardcoded).

  • Determine how we implement workitem builtins, if using workitem.S implementation from ventus-llvm, some extra work should to be done in ventus pocl driver, otherwise we need to redesign how workitem builtins are implemented.

    1). Functions such as insertPrologue from ParallelRegion.cc inserts instructions to get local_id_x/y/z which are preset by WorkitemReplication::ProcessFunction, this behavior is weird.

    2). Workitem variables local_size_x/y/z can have dynamic size which to be determined at runtime, this is done in pocl WorkitemHandler.

  • Make sure pocl ventus driver is correctly initialized, such as pocl_ventus_init is double checked, basically every items in struct _cl_device_id should be properly initialized.

  • Make sure there is no native(host) device related code in ventus pocl driver, such as kernel.so in function llvm_codegen should be renamed to kernel.elf or something else(a static elf file). Also the native kernel execution dlopen(kernel.so) should be replaced by an elf loader(Should already be done by spike fesrv elf loader).

  • Kernel metadata buffer and kernel args buffer should be prepared in pocl, then passed to spike by asking spike to store those buffer into the lowerest address of physical memory of Ventus GPGPU, related CSRs should be initialized with the kernel metadata buffer base address.

  • Check the behavior of all the OpenCL APIs for Ventus GPGPU are correctly coded(Some API implementations in common.c may not work for Ventus GPGPU), make sure all the examples and tests can be kicked off(may not run through) to spike(via spike elf loader).

  • Device side printf support. A buffer with size PRINTF_BUFFER_SIZE on device side should be reserved, and a CSR or event may be needed to notify OCL driver to copy device side printf buffer to host side.

TODOs (Part 2)

  • Ventus GPGPU kernel mode driver(kmd) should be provided and bridged with pocl.
  • Correctly report Ventus GPGPU device after pocl(umd) can locate ventus kmd and read necessory information from the kmd(such as supported extensions etc).

TODOs (Part 3)

  • Fix hardcoded library search path for libworkitem.a in ventus.c.

pocl's People

Contributors

franz avatar pjaaskel avatar eschnett avatar jrprice avatar vkorhonen avatar csanchezdll avatar topileppanen avatar oblomov avatar anbe42 avatar isuruf avatar elhigu avatar krrishnarraj avatar jansol avatar jules-kong avatar jwtowner avatar yangzexia avatar dsandersllvm avatar matthiasdiener avatar ardacoskunses avatar 0charleschen0 avatar yangkex avatar loganchien avatar rabijl avatar trixirt avatar vinsteri avatar dodohack avatar kolanich avatar koskinel avatar thekiterunner24 avatar victoroliv2 avatar

Stargazers

ZhouJing(周晶) avatar Hu He avatar

Watchers

 avatar

pocl's Issues

[bugfix] CTS中bufferreadwriterect case提示read_rect为NULL

该case是在image中读、写、拷贝及映射一块矩阵区域,在pocl_ventus.cc中对应的回调接口未实现,现添加对应的回调接口:read_rect()、write_rect()、copy_rect()、map_mem()、unmap_mem()、get_mapping_ptr()可通过测试。

[BUG] OpenCL-CTS test_conversions failed

执行测试build/test_conformance/conversions/test_conversions int_sat_rte_float失败,分析是测试代码在test_conformance/conversions/basic_test_conversions.cpp中使用了clEnqueueMapBufferclSetEventCallback等函数,导致pocl设置初始数据错误,程序在执行start段还未进入kernel时崩溃。执行上面的测试命令并查看log文件test_convert_int_sat_rte_float_0.log报错如下:
Screenshot from 2024-07-03 09-45-51

[bugfix] CTS中arraycopy case提示copy为NULL

该case是数组拷贝的测试,在pocl_ventus.cc中对应的回调接口copy未实现。case中创建buffer时采用了CL_MEM_USE_HOST_PTR flag,当前ventus对该flag的处理是在设备端开辟内存空间,将数组拷贝到设备端,所以copy接口需要先从设备端拷到host侧,再将数据拷贝到copy回调传入设备端的dst地址,添加对应的回调接口可通过测试。

[CTS] compiler issue

目前跟这个clCompileProgram接口相关的问题已经定位到,最终调用pocl的接口时候,opencl的kernel_metadata还没有设置正确,测试用例:

__kernel void
CopyBuffer(
    __global float* src,
    __global float* dst )
{
    int id = (int)get_global_id(0);
    dst[id] = src[id];
}

./test_compiler simple_compile_only

[CTS] Buffers目录下的测试问题

目前这个测试集下的所有测试都跑了一遍,目前

  • buffer_fill_uint
  • buffer_fill_int
  • buffer_fill_short
  • buffer_fill_ushort
  • buffer_fill_char
  • buffer_fill_uchar
  • buffer_fill_long
  • buffer_fill_ulong
  • buffer_fill_float
  • buffer_fill_struct

这几个子测试暴露出来的问题是一样的,pocl/lib/CL/devices/common.c:370: void pocl_exec_command(_cl_command_node *): Assertion 'dev->ops->memfill' failed', 定位到 pocl_ventus.cc 文件, ops->memfill = NULL

vector_swizzle的问题

这个问题还是出现在参数处理

log信息
dump信息

源码


__kernel void test_vector_swizzle_xyzw(char2 value, __global char2* dst) {
    int index = 0;

    // lvalue swizzles
    dst[index++].x = value.x;
    dst[index++].y = value.x;
    dst[index++].xy = value;
    dst[index++].yx = value;

    // rvalue swizzles
    dst[index++] = value.x;
    dst[index++] = value.y;
    dst[index++] = value.xy;
    dst[index++] = value.yx;
}


Question About POCL Cmake

Hi there,

when I reading a code of cmake file in pocl I saw that
option(ENABLE_VENTUS "Enable Ventus GPGPU device driver." OFF)
I want to ask why ventus is not enable.

By the way I wonder how did you designed and integrated pocl to ventus. Is there any paper or source about it. Thank you for great work. Have a nice day.

pocl 冗余log信息问题

结合log信息

### Triple: riscv32, CPU: ventus-gpgpu
Warning: the memory at  [0x90000000, 0x90000017] has been realigned
to the 4 KiB page size: [0x90000000, 0x90000FFF]
to allocate at 0x90000000 with 4096 bytes 
to copy to 0x90000000 with 24 bytes
Warning: the memory at  [0x90001000, 0x90001017] has been realigned
to the 4 KiB page size: [0x90001000, 0x90001FFF]
to allocate at 0x90001000 with 4096 bytes 
to copy to 0x90001000 with 24 bytes
Warning: the memory at  [0x90002000, 0x90002017] has been realigned
to the 4 KiB page size: [0x90002000, 0x90002FFF]
to allocate at 0x90002000 with 4096 bytes 
to copy to 0x90000000 with 24 bytes
to copy to 0x90001000 with 24 bytes
to copy to 0x90000000 with 24 bytes
to copy to 0x90001000 with 24 bytes
notice that ventus hasn't support local buffer as argument yet.
Warning: the memory at  [0x90003000, 0x9000300B] has been realigned
to the 4 KiB page size: [0x90003000, 0x90003FFF]
to allocate at 0x90003000 with 4096 bytes 
to copy to 0x90003000 with 12 bytes
to allocate at 0x90004000 with 131072 bytes 
Warning: the memory at  [0x90024000, 0x9002403F] has been realigned
to the 4 KiB page size: [0x90024000, 0x90024FFF]
to allocate at 0x90024000 with 4096 bytes 
to copy to 0x90024000 with 64 bytes

其实以上有很多打印信息是无用信息,是否可以设置一个开关,类似POCL_DEBUG=all这种,控制打印行为,就比如我跑测试时,其实只关注测试成功与否,并不想看这些信息,看信息我会去log文件看是吧,测试出错的时候,我才会想看到具体的程序报错,其他的控制台信息我们是不会关心的

[component] A brief description of the problem

Current results

Picture or word description.

Expected results

Picture or word description.

Reproduction method

Describe how to reproduce the current problem in as much detail as possible, including the reproduction environment and commands.

Description

Describe the problem in as much detail as possible.

CTS 测试相关的问题 - compiler test

1 : compiler_defines_for_extensions
generic_address_space在OpenCL中算feature支持,不是extension,相关修改在53b4d09 (#13)这个PR

2: core dump的问题
image_macro
simple_compile_only
simple_static_compile_only
simple_extern_compile_only
debug这些用例的时候,发现是kernel的函数名没有被识别,所以应该是pocl对应的实现机制没有完善?
Screenshot from 2023-08-11 13-44-02

[bug] pocl中private memory越界问题

pocl中固定为每个thread分配0x1000的private memory。
当cts中测试case 需要的private memory大于0x1000时候,spike会出现memory无法访问的问题。

临时解决方法:
分配更大的private memory,来保证case通过。

此解决办法需要进一步讨论

[CTS] over 4G

Now spike run workgroups one by one, to fix the over 4G.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.