Halutmatmul

Algorithmic CI

GPU Tests (Vast.ai) PyTest Linting MyPy C++ build

Hardware CI

HW Synth + PAR OpenROAD RTL Linting HW Design Verification

General Information

This repo is used for algorithmic exploration. I will try to update it with as much hardware information as I am allowed to publish.

Install

# install conda environment & activate
conda env create -f environment_gpu.yml
conda activate halutmatmul

# IIS prefixed env
conda env create -f environment_gpu.yml --prefix /scratch/janniss/conda/halutmatmul_gpu

# install CLI
./scripts/install-cli.sh

# now use CLI with
halut --help

# or without install
./halut --help

Hacker News mention (comments only) and discussion

Hardware OpenROAD flow results

| All Designs | ASAP7 | NanGate45 |
| --- | --- | --- |
| All Report | All | All |
| History | History | History |

Total Circuit (M=2)

| halut_matmul | ASAP7 | NanGate45 |
| --- | --- | --- |
| Area [μm^2] | 9643.6787 | 140647.7656 |
| Freq [MHz] | 666.7 | 333.3 |
| GE | 110.238 kGE | 176.25 kGE |
| Std Cell [#] | 68186 | 68994 |
| Voltage [V] | 0.77 | 1.1 |
| Util [%] | 45.0 | 59.2 |
| TNS | -1086.59 | -0.31 |
| Clock Net | Clock_net | Clock_net |
| Gallery | Gallery Viewer | Gallery Viewer |
| Metrics | Metrics Viewer | Metrics Viewer |
| Report | Report Viewer | Report Viewer |

Encoder

| halut_encoder_4 | ASAP7 | NanGate45 |
| --- | --- | --- |
| Area [μm^2] | 4844.5405 | 69711.9531 |
| Freq [MHz] | 666.7 | 333.3 |
| GE | 55.378 kGE | 87.358 kGE |
| Std Cell [#] | 34334 | 33746 |
| Voltage [V] | 0.77 | 1.1 |
| Util [%] | 45.0 | 58.7 |
| TNS | 0.0 | 0.0 |
| Clock Net | Clock_net | Clock_net |
| Gallery | Gallery Viewer | Gallery Viewer |
| Metrics | Metrics Viewer | Metrics Viewer |
| Report | Report Viewer | Report Viewer |

Decoder

| halut_decoder | ASAP7 | NanGate45 |
| --- | --- | --- |
| Area [μm^2] | 4749.8286 | 68923.7891 |
| Freq [MHz] | 666.7 | 333.3 |
| GE | 54.296 kGE | 86.37 kGE |
| Std Cell [#] | 33709 | 34395 |
| Voltage [V] | 0.77 | 1.1 |
| Util [%] | 44.4 | 58.9 |
| TNS | -11340.5098 | -0.66 |
| Clock Net | Clock_net | Clock_net |
| Gallery | Gallery Viewer | Gallery Viewer |
| Metrics | Metrics Viewer | Metrics Viewer |
| Report | Report Viewer | Report Viewer |
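
As a quick sanity check on the tables above, the sketch below divides each reported area by its gate-equivalent (GE) count. It assumes GE is defined as area divided by the area of one reference cell per PDK (commonly a NAND2 gate); the README does not state the definition, so treat this as an assumption. The helper name `area_per_ge` is hypothetical.

```python
# Area [um^2] and GE count copied from the tables above.
# Assumption: GE = area / (area of one reference cell of the PDK).
DESIGNS = {
    ("halut_matmul", "ASAP7"): (9643.6787, 110_238),
    ("halut_matmul", "NanGate45"): (140647.7656, 176_250),
    ("halut_encoder_4", "ASAP7"): (4844.5405, 55_378),
    ("halut_encoder_4", "NanGate45"): (69711.9531, 87_358),
    ("halut_decoder", "ASAP7"): (4749.8286, 54_296),
    ("halut_decoder", "NanGate45"): (68923.7891, 86_370),
}


def area_per_ge(area_um2, gate_equivalents):
    """Return the implied area of one gate equivalent in um^2."""
    return area_um2 / gate_equivalents


if __name__ == "__main__":
    for (design, pdk), (area, ge) in DESIGNS.items():
        print(f"{design:16s} {pdk:10s} {area_per_ge(area, ge):.4f} um^2 per GE")
```

If the assumption holds, the ratio should be nearly constant within each PDK (roughly 0.0875 μm^2 for ASAP7 and 0.798 μm^2 for NanGate45 with the numbers above), which makes the GE rows comparable across the two technologies.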

Progress Slides

Slides preview

CUDA kernels

I am aware that there is still a lot that could be optimized here (warp-level optimizations etc.), but the kernels were only developed for fast analysis.

Results

Caveat: no retraining or fine-tuning has been done yet!

Single Layer replacement with C=32 and K=16

LeViT (Source)

SOTA Vision Transformer on ImageNet 1K LeViT Results

ResNet-50 (only interesting layers in analysis)

Legacy Classifier on ImageNet 1K ResNet-50 Results

Depthwise separable CNN

on Google Speech v2 DS-CNN Results

C, K and encoding_algorithm parameter sweep for ResNet-50

Data visualizer

Offline learning convergence on ResNet-50

The goal was to find out how much offline training data is needed to get the maximum accuracy.

ResNet-50 Convergence Results

Formalism

Some definitions about the forward path.

Encode kernel

Read and accumulate LUTs kernel
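
The two kernels named above can be illustrated with a small pure-Python sketch. It assumes the usual product-quantization reading of the parameters: C is the number of codebooks (input subspaces) and K is the number of prototypes per codebook. The nearest-prototype search in `encode` is a stand-in chosen for brevity; the repo's actual `encoding_algorithm` options are not shown here and may differ. All function names are hypothetical and not taken from the repo.

```python
def split(vec, num_codebooks):
    """Split a length-D vector into C equally sized subvectors."""
    step = len(vec) // num_codebooks
    return [vec[c * step:(c + 1) * step] for c in range(num_codebooks)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def encode(row, prototypes):
    """Encode kernel (stand-in): one prototype index per codebook.

    prototypes[c][k] is prototype k of codebook c.
    Assumption: nearest prototype by squared Euclidean distance.
    """
    codes = []
    for sub, protos in zip(split(row, len(prototypes)), prototypes):
        dists = [sum((a - p) ** 2 for a, p in zip(sub, proto)) for proto in protos]
        codes.append(dists.index(min(dists)))
    return codes


def build_luts(prototypes, weights):
    """Offline step: luts[c][k][m] = prototype (c, k) dotted with the
    matching slice of column m of the D x M weight matrix."""
    num_codebooks = len(prototypes)
    step = len(weights) // num_codebooks
    num_cols = len(weights[0])
    luts = []
    for c, protos in enumerate(prototypes):
        cols = [[weights[c * step + d][m] for d in range(step)]
                for m in range(num_cols)]
        luts.append([[dot(proto, col) for col in cols] for proto in protos])
    return luts


def read_and_accumulate(codes, luts):
    """Read-and-accumulate kernel: sum the selected LUT rows over all codebooks."""
    out = [0.0] * len(luts[0][0])
    for c, k in enumerate(codes):
        for m, value in enumerate(luts[c][k]):
            out[m] += value
    return out


if __name__ == "__main__":
    # Toy sizes: C=2 codebooks, K=2 prototypes each, D=4 inputs, M=2 outputs.
    prototypes = [[[1, 0], [0, 1]], [[2, 2], [-1, 3]]]
    weights = [[1, 2], [3, 4], [5, 6], [7, 8]]
    row = [0.1, 0.9, -0.8, 2.9]
    codes = encode(row, prototypes)
    approx = read_and_accumulate(codes, build_luts(prototypes, weights))
    exact = [dot(row, [w[m] for w in weights]) for m in range(2)]
    print(codes, approx, exact)
```

At inference time only `encode` and `read_and_accumulate` run; the multiplications are moved into the offline `build_luts` step. The output is exact whenever every subvector coincides with one of its prototypes and is an approximation otherwise.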

Links

Contributors

joennlae
