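Loop classification and modeling
Loops are divided into several broad classes: fully parallel loops, loops with register dependences, loops with memory dependences, and loops with dynamic bounds. Based on this classification, two kinds of instructions are added.
So one contribution is the definition of two custom instructions driven by the loop classification. In essence this is not simply adding new instructions. Seen from the instruction side, it attaches a small amount of extra semantic information to instructions, a kind of semantic tag, so that the acceleration hardware downstream can recognize the few specially marked instructions inside a loop and schedule the loop for acceleration.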
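To make the four classes concrete, here is a minimal Python sketch with one loop per class. The class names come from the note above; the loop bodies and function names are illustrative assumptions, not examples taken from the paper.

```python
# Illustrative loops only: the four class names follow the note,
# the bodies and function names are assumptions made for this sketch.

def fully_parallel(a, b):
    # No iteration reads anything that another iteration writes.
    out = [0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out

def register_dependence(a):
    # `acc` is a scalar carried from one iteration to the next.
    acc = 0
    for i in range(len(a)):
        acc = acc + a[i]
    return acc

def memory_dependence(a, k):
    # Iteration i reads a[i - k], which iteration i - k may have written.
    a = list(a)
    for i in range(k, len(a)):
        a[i] = a[i] + a[i - k]
    return a

def dynamic_bound(work):
    # The loop bound grows while the loop runs (worklist style).
    work = list(work)
    visited = []
    i = 0
    while i < len(work):
        n = work[i]
        visited.append(n)
        if n > 1:
            work.append(n // 2)
        i += 1
    return visited
```

Only the first loop can hand its iterations to independent lanes with no coordination at all. The other three need some way to tell the hardware which value, address, or bound crosses iterations, which is presumably the role of the semantic tags described above.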
from papernotes.
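Acceleration architecture
- Exploit parallelism across iterations. Each lane executes one loop iteration, that is, each lane runs the loop-body instructions for the iteration it owns. This feels much like a multicore design, and each lane does in fact contain just a single ALU, like the one in a general-purpose CPU core.
- The speedup of this multicore-style acceleration is roughly bounded by the number of lanes, so it is not very high. The upside is that, because each lane owns a whole iteration, branches and other conditional statements inside the loop can still execute the way they would on an ordinary CPU.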
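The lane model above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the round-robin policy, the uniform iteration cost, and the names `assign_iterations`, `ideal_speedup`, and `run_on_lanes` are all hypothetical and are not taken from the paper.

```python
import math

def assign_iterations(num_iters, num_lanes):
    """Round-robin: iteration i goes to lane i % num_lanes (assumed policy)."""
    lanes = [[] for _ in range(num_lanes)]
    for i in range(num_iters):
        lanes[i % num_lanes].append(i)
    return lanes

def ideal_speedup(num_iters, num_lanes):
    """Speedup over a single lane, assuming equal-cost iterations and no stalls."""
    rounds = math.ceil(num_iters / num_lanes)
    return num_iters / rounds

def run_on_lanes(a, num_lanes):
    # Each lane runs the full loop body, branch included, for its own iterations.
    # Lanes are simulated one after another here; the iterations are independent,
    # so the result does not depend on the order.
    out = [0] * len(a)
    for lane in assign_iterations(len(a), num_lanes):
        for i in lane:
            if a[i] % 2 == 0:
                out[i] = a[i] // 2
            else:
                out[i] = 3 * a[i] + 1
    return out
```

Under these assumptions `ideal_speedup` never exceeds `num_lanes`, which matches the observation that the speedup is capped by the lane count. The branch in `run_on_lanes` needs no special handling because each lane follows its own control flow.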
XLOOP uses PyMTL and Pydgin for its hardware implementation and model simulation.
Execution example: no dependences, fully parallel
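Execution example: memory dependences
For example, A[i] = A[i] + A[i-k]
With a memory dependence, a lane may find that the value its iteration needs has not yet been computed by another lane and forwarded to it. In that case it must abort the current iteration and re-execute that iteration's instructions later.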
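The abort-and-retry behavior can be imitated with a small software model of the loop A[i] = A[i] + A[i-k]. This is a sketch, not the hardware mechanism: the round-based scheduling, the `done` flags standing in for the dependence check, and the name `run_with_squash` are assumptions made for illustration.

```python
def run_with_squash(a, k, num_lanes):
    """Toy model: lanes attempt iterations in rounds and squash on a missing input.

    Returns (result, rounds, squashes). Assumes k >= 1.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    a = list(a)
    done = [False] * len(a)             # done[i]: iteration i has written a[i]
    pending = list(range(k, len(a)))    # iterations still to run, oldest first
    rounds = 0
    squashes = 0
    while pending:
        rounds += 1
        batch = pending[:num_lanes]     # one iteration per lane
        ready_at_start = list(done)     # what is visible when the round begins
        retry = []
        for i in batch:
            producer = i - k
            # Indices below k are never written by the loop, so they are always ready.
            if producer < k or ready_at_start[producer]:
                a[i] = a[i] + a[producer]
                done[i] = True
            else:
                squashes += 1           # input not ready: abort, retry next round
                retry.append(i)
        pending = retry + pending[num_lanes:]
    return a, rounds, squashes
```

In this model the oldest pending iteration always finds its input ready, so every round makes progress and the final array equals the sequential result. A small `k` relative to `num_lanes` causes many squashes, while `k >= num_lanes` lets a full batch commit without any.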