ventus-gpgpu-verilog's Introduction

Ventus GPGPU (Verilog version)

A GPGPU processor supporting the RISC-V Vector (V) extension, developed in Verilog.

Copyright (c) 2023-2024 C*Core Technology Co.,Ltd,Suzhou.

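This is the Verilog version of Ventus ("乘影"). The original version (Chisel HDL) is linked here.

Ventus open-source GPGPU project website: opengpgpu.org.cn

The Ventus hardware design still has many shortcomings. If you would like to take part in its development, pull requests on GitHub are welcome.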

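Hardware architecture

The Ventus hardware architecture documentation is here.

Block diagram of the Ventus hardware architecture:

Block diagram of the SM core: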

Synthesis

We synthesized the GPGPU with DC (Design Compiler) on a TSMC 28 nm process. The key configuration parameters were:

  • NUM_THREAD = 32
  • NUM_SM = 2
  • NUM_WARP = 8
  • DCACHE_BLOCKWORDS = 2

Using only HVT and SVT cells, the GPGPU reaches a frequency of 620 MHz with a total area of 3.908 mm².
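As a rough sanity check on these figures, the sketch below derives two quantities from the configuration: the total number of threads across the design and the clock period implied by 620 MHz. It assumes NUM_WARP is a per-SM count and that the total is simply the product of the three parameters; neither assumption is stated in the synthesis notes. The function names are illustrative only.

```python
# Values quoted from the synthesis configuration and results.
NUM_THREAD = 32    # threads per warp
NUM_SM = 2         # number of SM cores
NUM_WARP = 8       # warps (assumed here to be per SM)
FREQ_MHZ = 620.0   # reported post-synthesis frequency


def total_threads(num_sm: int, num_warp: int, num_thread: int) -> int:
    """Thread count under the assumption that it is a plain product."""
    return num_sm * num_warp * num_thread


def clock_period_ns(freq_mhz: float) -> float:
    """Convert a frequency in MHz to a clock period in nanoseconds."""
    return 1e3 / freq_mhz


if __name__ == "__main__":
    print("threads:", total_threads(NUM_SM, NUM_WARP, NUM_THREAD))
    print("period (ns):", round(clock_period_ns(FREQ_MHZ), 3))
```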

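Getting started

Taking the gaussian test case as an example, go to testcase/test_gpgpu_axi_top/tc_gaussian:

  • Open tc.v and select the number of warps and threads for the case.

In src/define/define.v, modify NUM_THREAD to change the number of threads per warp.

  • Simulate with VCS:
make run-vcs-4w4t
  • The result is reported as PASSED or FAILED.

  • View the waveform with Verdi:
make verdi
  • If the external AXI interface is not needed, go to testcase/test_gpgpu_top/tc_gaussian instead; the steps are the same.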
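The steps above can be wrapped in a small helper when running several configurations. This is only a sketch: the target-name pattern `run-vcs-<W>w<T>t` is inferred from the single documented example `run-vcs-4w4t`, so other combinations may not exist in the Makefile, and tc.v and NUM_THREAD in define.v may still need to be edited by hand to match. The names `vcs_target` and `run_case` are hypothetical.

```python
import subprocess


def vcs_target(warps: int, threads: int) -> str:
    """Build a make target shaped like the documented `run-vcs-4w4t`.

    The pattern is an assumption inferred from that one example.
    """
    if warps <= 0 or threads <= 0:
        raise ValueError("warps and threads must be positive")
    return f"run-vcs-{warps}w{threads}t"


def run_case(case_dir: str, warps: int, threads: int, dry_run: bool = True):
    """Return the make command, and run it in case_dir unless dry_run is set."""
    cmd = ["make", vcs_target(warps, threads)]
    if not dry_run:
        # check=True raises CalledProcessError if make exits with an error.
        subprocess.run(cmd, cwd=case_dir, check=True)
    return cmd


if __name__ == "__main__":
    # Dry run: only prints the command that would be executed.
    print(run_case("testcase/test_gpgpu_axi_top/tc_gaussian", 4, 4))
```

With `dry_run=True` (the default) nothing is executed, which makes it easy to check the generated target names against the Makefile first.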

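Test cases

| Test suite | Warps/threads | Result | Current cycle count | Description |
|---|---|---|---|---|
| vecadd: vector addition | 4w16t | pass | 1800 | 64-element vector addition |
| | 4w8t | pass | 2696 | 64-element vector addition |
| | 4w32t | pass | 2164 | 64-element vector addition |
| | 8w4t | pass | 2899 | 64-element vector addition |
| matadd: matrix addition | 1w32t | pass | 2808 | 4*4 matrix addition |
| | 1w16t | pass | 2500 | 4*4 matrix addition |
| | 2w8t | pass | 2640 | 4*4 matrix addition |
| | 4w4t | pass | 4054 | 4*4 matrix addition |
| nn: nearest-neighbor interpolation | 2w16t | pass | 2031 | Find the 5 nearest of 19 points |
| | 4w8t | pass | 4033 | Find the 5 nearest of 28 points |
| | 4w16t | pass | 2269 | Find the 5 nearest of 53 points |
| | 8w4t | pass | 3382 | Find the 5 nearest of 19 points |
| | 8w8t | pass | 2038 | Find the 5 nearest of 53 points |
| gaussian: Gaussian elimination | 1w16t | pass | 10151 | Elimination on a linear system with four unknowns |
| | 2w8t | pass | 11670 | Elimination on a linear system with four unknowns |
| | 4w4t | pass | 11537 | Elimination on a linear system with four unknowns |
| | 4w8t | pass | 15940 | Elimination on a linear system with five unknowns |
| bfs: breadth-first search | 2w16t | pass | 20938 | |
| | 4w8t | pass | 22730 | |
| | 4w32t | pass | 36114 | |
| | 8w4t | pass | 40888 | |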

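Note: because DCACHE_BLOCKWORDS is currently small, the cycle counts are relatively high. They improve considerably when DCACHE_BLOCKWORDS is increased. The figures here are only meant to evaluate the execution efficiency of the GPGPU under different NUM_THREAD settings.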
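To compare configurations, the labels in the table can be decoded and the cycle counts converted to time. The sketch below uses the vecadd rows and assumes the simulated design would run at the 620 MHz reported for synthesis, which is an assumption because the simulated and synthesized configurations may differ. `parse_config` and `cycles_to_us` are illustrative names.

```python
import re


def parse_config(label: str) -> tuple[int, int]:
    """Split a label such as '4w16t' into (warps, threads per warp)."""
    match = re.fullmatch(r"(\d+)w(\d+)t", label)
    if match is None:
        raise ValueError(f"unrecognized configuration label: {label!r}")
    return int(match.group(1)), int(match.group(2))


def cycles_to_us(cycles: int, freq_mhz: float = 620.0) -> float:
    """Convert a cycle count to microseconds at the assumed clock frequency."""
    return cycles / freq_mhz


# Cycle counts copied from the vecadd rows of the table.
VECADD_CYCLES = {"4w16t": 1800, "4w8t": 2696, "4w32t": 2164, "8w4t": 2899}

if __name__ == "__main__":
    for label, cycles in VECADD_CYCLES.items():
        warps, threads = parse_config(label)
        print(label, "threads:", warps * threads,
              "time (us):", round(cycles_to_us(cycles), 2))
```

For vecadd, the 4w16t case launches 64 threads for 64 elements, which may be part of why it has the lowest cycle count of the four.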

Acknowledgements

We referred to several open-source designs while developing Ventus GPGPU.

| Sub module | Source | Detail |
|---|---|---|
| CTA scheduler | MIAOW | Our CTA scheduler module is based on the MIAOW ultra-threads dispatcher |
| L2Cache | block-inclusivecache-sifive | Our L2Cache design is inspired by SiFive's block-inclusivecache |
| FPU | XiangShan | We reused the Array Multiplier in XiangShan. The FPU design is also inspired by XiangShan |
| SFU | openhwgroup | Our SFU module is based on pulp-platform |
| Config, ... | rocket-chip | Some modules are sourced from RocketChip |

ventus-gpgpu-verilog's People

Contributors

jialaolian-oss, gaofanhui, jules-kong, ggggzzzzhhhh, llhe110
