marina-neseem / accera-high-perf-dl


Case studies for using Accera - the open-source cross-platform compiler from Microsoft Research - to create high-performance deep learning computations (e.g. GEMM, convolution).

License: MIT License


accera-high-perf-dl's Introduction

Accera High Perf DL

High Performance Deep Learning using Accera

Overview

This repo presents case studies that use Accera - the open-source cross-platform compiler from Microsoft Research - to create high-performance deep learning computations (e.g. GEMM, convolution).

Contents

Documentation

Refer to the original Accera Manual to get familiar with Accera concepts and its Domain-Specific Language (DSL).

Setup

Accera requires Python 3.7-3.10. For these case studies, you can simply install the required libraries using

pip install -r requirements.txt

To build Accera from source, refer to Accera's Install Instructions.

NOTE: These case studies were tested on Python 3.9.10 and Accera v1.2.9.

Run the Case Studies

Each case study lives in its own folder under the case_studies folder. To run a case study, change into its directory and run

python run.py

Moreover, each case study has a README file explaining in detail what the case study sets out to do.

NOTE: We will continuously add more case studies showing how to use Accera to build high-performance deep learning computations.

Contribute

To increase the impact of this repository, we welcome and encourage contributions of new case studies using Accera. Before you start on one, please follow the same structure as the MatMul Grid Search case study:

  1. Create `run.py`, which contains the main code for the case study (mainly the Accera DSL).
  2. Create `utils.py`, which can include any needed utility functions.
  3. Create `README.md`, which contains a detailed explanation of what the case study aims to achieve, plus any results.

Following this template ensures that your work can be merged into the master branch in a timely manner.

License

MIT License. See the LICENSE file.

NOTE: For any questions about the case studies, feel free to reach out to me at [email protected].

accera-high-perf-dl's People

Contributors

marina-neseem

accera-high-perf-dl's Issues

Minor Feedback on Unrolled Convolution Case Study

Looks good, some minor comments below:

There are two different methods to implement unrolled convolution. <ins>In the first method</ins>, we can leverage caching and tiling to achieve the same behaviour of unrolled convolution without explicitly using im2col and col2im operations. <ins>In the second method</ins>, we can split the unrolled convolution into different steps, and create a separate schedule for each step and finally fuse them together. <ins>In this case study, we will apply the first method.</ins>

Comment: [Editorial] probably won't need the underlining for "in the first method" and "in the second method" (I was expecting those to be clickable links). The emphasis is good enough for the "In this case study, we will apply the first method."
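For readers unfamiliar with the second method mentioned in the quoted passage, here is a minimal NumPy sketch of explicit im2col-based unrolled convolution. This is illustrative only (not the Accera DSL the case study uses, and the HWC shapes are an assumption for the example):

```python
import numpy as np

def im2col_conv2d(x, w):
    """Unrolled 2D convolution via explicit im2col + GEMM (stride 1, no padding).

    x: input of shape (H, W, C); w: filters of shape (KH, KW, C, F).
    """
    H, W, C = x.shape
    KH, KW, _, F = w.shape
    OH, OW = H - KH + 1, W - KW + 1
    # im2col: flatten every KH x KW x C input patch into one row of a matrix.
    cols = np.empty((OH * OW, KH * KW * C), dtype=x.dtype)
    for i in range(OH):
        for j in range(OW):
            cols[i * OW + j] = x[i:i + KH, j:j + KW, :].ravel()
    # The convolution is now a single matrix multiply (GEMM)...
    out = cols @ w.reshape(KH * KW * C, F)
    # ...and the "col2im" step here is just a reshape of the GEMM output.
    return out.reshape(OH, OW, F)
```

The first method described above achieves the same data movement implicitly, through caching and tiling, without materializing the `cols` matrix.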

First, we define the input and output arrays

Comment: input, output, and weights arrays

> **_Note_** `sample` can be used to specify the size of a random sample from the parameters grid (for testing purposes).

Comment: minor rendering issue with the Note (needs to be moved out of the code block probably)
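To illustrate what such a `sample` option does, here is a hedged sketch of drawing a random subset of a parameter grid (function and parameter names are hypothetical, not Accera's API):

```python
import itertools
import random

def enumerate_grid(grid, sample=None, seed=0):
    """Enumerate a parameter grid (dict mapping name -> list of values).

    If `sample` is given, return only that many randomly chosen
    combinations instead of the full cross product, which speeds up
    test runs of a grid search.
    """
    names = list(grid)
    combos = [dict(zip(names, values))
              for values in itertools.product(*grid.values())]
    if sample is not None and sample < len(combos):
        combos = random.Random(seed).sample(combos, sample)
    return combos
```

For example, a grid with two 4-value axes has 16 combinations; passing `sample=5` benchmarks only 5 of them.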

Finally, we can use the optimal parameters to create a Accera package with the best performant function. We can do ths by repeating [Step 1](#step-1---create-a-parameterized-accera-unrolled-convolution-function) and replace the values in `parameters_values` by the optimal values.

Comment: typo in "We can do ths by..."

Changes in Role Attribute

Dear Marina,

First and foremost, I would like to express my deep appreciation for providing the insightful case studies.

I have recently noticed a change in the latest version of accera that requires some adjustments in the way roles are defined. It appears that the definition of the role has been updated from acc.Array.Role to simply acc.Role, and it is no longer an attribute of the Array. This change necessitates a modification in the code for each case study to ensure compatibility with the current version of accera.

For instance, in the wrong version:
Input = acc.Array(role=acc.Array.Role.INPUT, element_type=acc.ScalarType.float32, shape=(input_rows, input_columns, input_channels))

The correct version should now be:
Input = acc.Array(role=acc.Role.INPUT, element_type=acc.ScalarType.float32, shape=(input_rows, input_columns, input_channels))

Best Regards,
Batuhan
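A version-tolerant lookup could smooth over the API change described in this issue. The following is a sketch only (attribute names taken from the issue above; stand-in objects are used so the snippet runs without Accera installed):

```python
from types import SimpleNamespace

def resolve_role(acc):
    """Return the Role enum for either API shape: the newer top-level
    `acc.Role` (per the issue above) or the legacy `acc.Array.Role`."""
    return getattr(acc, "Role", None) or acc.Array.Role

# Stand-ins for the two API shapes (illustrative only; a real script
# would pass the imported `accera` module itself).
new_api = SimpleNamespace(Role=SimpleNamespace(INPUT="input"))
old_api = SimpleNamespace(Array=SimpleNamespace(Role=SimpleNamespace(INPUT="input")))
```

With such a helper, `Input = acc.Array(role=resolve_role(acc).INPUT, ...)` would work on both old and new Accera versions.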

Feedback on NCHWC case study

Thanks for contributing the case study! Here are some comments to consider:

# Case Study - NCHWc 2D Convolution Grid Search

Comment: In the beginning of this case study, it'll be helpful to expand the acronym / symbols before diving into the details. For example, something like the bulletpoint list here (but feel free to explain it in a different way): https://stackoverflow.com/questions/37689423/convert-between-nhwc-and-nchw-in-tensorflow
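Concretely, NCHWc splits the channel axis C into C/c blocks of width c and makes the block of c channels the innermost (contiguous) dimension. A NumPy reshape/transpose sketch (the exact layout convention is an assumption for illustration):

```python
import numpy as np

def nchw_to_nchwc(x, c):
    """Re-order an NCHW tensor (N, C, H, W) into NCHWc (N, C // c, H, W, c).

    The channel axis is split into C // c blocks of width c; the block of
    c channels becomes the innermost (contiguous) dimension.
    """
    n, C, h, w = x.shape
    assert C % c == 0, "channel count must be divisible by the block size"
    return x.reshape(n, C // c, c, h, w).transpose(0, 1, 3, 4, 2)

def nchwc_to_nchw(x):
    """Inverse re-ordering back to NCHW."""
    n, cb, h, w, c = x.shape
    return x.transpose(0, 1, 4, 2, 3).reshape(n, cb * c, h, w)
```

This is the pre/post re-ordering the case study assumes is done by external functions.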

[//]: # (Project: Accera)
[//]: # (Version: 1.2.3)

[//]: # (Project: Accera)
[//]: # (Version: 1.2.3)

Comment: can remove these headings, because this is a community case study and not part of the Accera repo. Once you've done that, the git SHA for the other file also needs to be updated in the TOC (you can piggyback on the PR you already have)

p_s_split_2_size, p_n_split_2_size, p_n_split_3_size = acc.create_parameters(6)

Comment: starting with v1.2.4, we've removed the need to specify the number of parameters as an argument. You can upgrade by doing this:

pip install --upgrade accera-llvm
pip install --upgrade accera-compilers
pip install --upgrade accera

And then replace create_parameters(num) with create_parameters()

For this case study, we choose the NCHWc input channels block size as well as NCHWc output filters block size to be 8 because this optimizes the usage of the SIMD registers in the target architecture (AVX2). However, for different architectures like AVX512, 16 would be a better choice. We will assume that the input and the output tensors are pre and post re-ordered using external functions, and we will focus on the convolution implementation itself.

Comment: I'm trying to find more background on the correlation between 8/16, and AVX2/AVX512. Is this recommendation based on the width of the vector registers? If so, it will be helpful to add the reasoning.
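The 8/16 choice does follow from vector register width: AVX2 vector registers are 256 bits wide and AVX-512 registers are 512 bits wide, so one register holds 8 or 16 float32 lanes respectively. A quick check:

```python
# float32 lanes per SIMD register = register width (bits) / 32 (bits per float32)
avx2_float32_lanes = 256 // 32    # AVX2 vector registers are 256 bits wide
avx512_float32_lanes = 512 // 32  # AVX-512 vector registers are 512 bits wide
print(avx2_float32_lanes, avx512_float32_lanes)  # 8 16
```

Matching the block size to the lane count lets the innermost c-block fill a vector register exactly.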

In this step, we define a function `get_optimal_parameters` to get the optimal parameters. We achieve this by loading the data from the HAT file to a pandas dataframe, and then the optimal parameters choice would be the one with minimum mean duration in seconds.
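The selection step described above might look like the following pandas sketch (the DataFrame column names are assumptions for illustration; the case study loads the data from a HAT file):

```python
import pandas as pd

def get_optimal_parameters(df, duration_col="mean_duration_sec"):
    """Given benchmark results (one row per parameter choice), return
    the row whose mean duration in seconds is smallest."""
    return df.loc[df[duration_col].idxmin()]
```

For example, on a DataFrame with one row per grid-search candidate, this returns the fastest candidate's parameter values.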

Comment: I missed this earlier. Should we add a requirements.txt file listing the dependencies for each case study? This will be self-documenting and tie each case study to a specific snapshot/version of Accera (so you don't have to keep on updating if we change the DSL).

For example, I think this will suffice:

accera==1.2.4
pandas

You can either do one requirements.txt in each folder (to track dependencies separately for each case study), or just one top level requirements.txt. Either is fine.
