gen-dnn's People

Contributors

aaraujom, akharito, andersanton, ankalinin, densamoilov, dyoussif, dzarukin, echeresh, envsp, espetrov, h-sadia, igorsafo, irinasok, itaraban, kealan-barbieri, kruusnec, kwiersch, mgouicem, mistler, msotoflo, nastafie, nivas-x86, nshustrov, petercad, qyi1, rjoursler, shelleygoel, skazakov1, tprimak, vpirogov

gen-dnn's Issues

FYI: upcoming changes in MKL-DNN reorders

Just a heads up.

MKL-DNN developers are going to replace all our reorders with jitted versions, because we now have too many formats (and hence reorders) and a bunch of new ones are coming. Handling them by hand has become really hard, so we decided to go with an auto-generated version that would support arbitrary reorders (at least in theory).

Since gen-dnn doesn't rely on JIT, the only version that will remain is the reference any<->any reorder that is used here.

Please let me know if you have any concerns -- we can try to make the change in a way that would not affect you that much. For instance, we can consider leaving C code for some popular plain format transforms like nchw <-> nhwc. The only problem I see with this approach is that we are thinking of switching to another memory layout representation (one that would cover 2+-level blocking, which currently cannot be expressed in blocking_desc). If we do so, it would be really hard to keep specialized plain-C versions of these reorders, but much simpler to generate the jitted code.
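For reference, a plain (non-JIT) nchw -> nhwc transform of the kind mentioned above could look roughly like the sketch below. It assumes a dense f32 tensor with no padding or blocking; the function name and signature are hypothetical and not part of MKL-DNN.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an MKL-DNN function): copy a dense f32 tensor
// from nchw layout into nhwc layout.
std::vector<float> reorder_nchw_to_nhwc(const std::vector<float> &src,
        std::size_t N, std::size_t C, std::size_t H, std::size_t W) {
    assert(src.size() == N * C * H * W);
    std::vector<float> dst(src.size());
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t c = 0; c < C; ++c)
            for (std::size_t h = 0; h < H; ++h)
                for (std::size_t w = 0; w < W; ++w) {
                    // Offset of element (n, c, h, w) in each layout.
                    const std::size_t s = ((n * C + c) * H + h) * W + w; // nchw
                    const std::size_t d = ((n * H + h) * W + w) * C + c; // nhwc
                    dst[d] = src[s];
                }
    return dst;
}
```

The two offset formulas are the whole transform; a blocked layout would need extra index arithmetic per blocking level, which is where hand-written versions start to multiply.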

NEC Aurora Optimization bugs

Low Priority TODO

  • Clean up code: code unreadable with #if..#pragma..#else..#pragma..#endif clutter
    • Try __Pragma macro approach
    • Try awk transforms to produce __ve/_SX/LINUX system-specific .cpp files
  • -O2 ? remove -ffast-math ? (dumb down the "safe" compiler flags).
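A rough sketch of the pragma-macro idea, assuming it refers to the standard `_Pragma` operator: the per-compiler `#if` lives in one place and loops carry a single macro. The macro names are hypothetical, and the `_NEC ivdep` spelling for ncc is an assumption that may not hold for every ncc version.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro names; only the _Pragma operator itself is standard.
#define GEN_DNN_PRAGMA(x) _Pragma(#x)

#if defined(__ve)
// Assumed ncc spelling of the "ignore vector dependencies" hint.
#define GEN_DNN_IVDEP GEN_DNN_PRAGMA(_NEC ivdep)
#elif defined(__GNUC__) && !defined(__clang__)
#define GEN_DNN_IVDEP GEN_DNN_PRAGMA(GCC ivdep)
#else
// Unknown compiler: expand to nothing.
#define GEN_DNN_IVDEP
#endif

// The loop body stays free of #if/#pragma/#else/#endif clutter.
void scale(float *x, std::size_t n, float alpha) {
    assert(x != nullptr || n == 0);
    GEN_DNN_IVDEP
    for (std::size_t i = 0; i < n; ++i)
        x[i] *= alpha;
}
```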

Description

Some Benchdnn impls run correctly under gcc-5, gcc-7 and sxcc, but produce wrong
results for ncc at default Release flags (-O3 -ffast-math, ...)

Environment

ncc version 0.21 or 0.23

Steps to reproduce

./build.sh -qaT # on zoro or aurora (possibly different ncc versions).
./regr.sh ALL # on aurora, to ve_exec the cross-compiled binaries

Actual behavior

Test failures: ref_conv4 was disabled for __ve, and now sx_conv3 BWD_D does not
run correctly. Checked many git versions on x86 and ncc, but did not find
any that ran correctly on Aurora.

Expected behavior

Zero test failures (and still no failures if ref_conv4 tests are reintroduced).

mkl-dnn v1.0 merge

The ongoing v1.0 upstream merge has some issues:

  • v1.0 has a gemm_driver, which JITs the gemm calls.
    1. The USE_CBLAS, ref_gemm and USE_MKL [deprecated!] options now all have various problems -- see oneapi-src#415 for a description of this and the next issue.
    2. rnn code emits some calls with a leading dimension of zero (MKL cblas_sgemm spews ERROR messages, but tests still pass).
  • v1.0 now has an OpenCL engine. We should consider creating a separate vanilla_engine, and view an eventual ve_engine as adding ve-jit support to a vanilla_engine. (How can chaining of engines be done nicely?) ve-jit needs a few more ve-specifics, which could perhaps be harmonized with the higher-level calls in xbyak.
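One possible way to quiet the zero-leading-dimension messages would be to sanitize the value before it reaches cblas_sgemm. The sketch below assumes the usual CBLAS rule that a row-major, non-transposed rows x cols matrix needs ld >= max(1, cols); the helper name is hypothetical and this is not what the rnn code currently does.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>

// Hypothetical helper, not an MKL-DNN or CBLAS function.
// Returns a leading dimension that is legal for a row-major,
// non-transposed rows x cols matrix.
int legal_row_major_ld(int ld, int rows, int cols) {
    assert(rows >= 0 && cols >= 0);
    const int min_ld = std::max(1, cols);
    if (ld >= min_ld) return ld; // already legal, pass through
    // With at most one row, ld is never used to step between rows,
    // so substituting the minimum legal value cannot change the result.
    if (rows <= 1) return min_ld;
    // Otherwise the caller really did pass a wrong stride.
    throw std::invalid_argument("leading dimension too small for matrix");
}
```

Throwing in the multi-row case keeps genuine stride bugs visible instead of silently papering over them.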

Similar progress to be made for porting v1.1 and v1.2

v0.12 upstream merge

Upstream merge New Features:

  • x86 compile OK

  • benchdnn TEST mode support for post_ops

  • benchdnn ALL support for attributes (post_ops)
    - "v2" iterator: attr is copied, but hint and op_desc need long lifetime

  • ncc Internal Compiler Error ... constructor arg with 'auto &' gives
    nc++: Internal Error: Unknown type kind.
    for benchdnn (bnorm.cpp, conv.cpp, maybe others)

  • gtests failure (Aurora only):
    31/33 Test #31: test_convolution_backward_data_f32 .........***Failed

Add a decent non-JIT ngchw convolution to vanilla engine

Description

Generic convolution is really slow, because the generic mkl-dnn implementation is
impossible to optimize (it calls virtual functions to calculate offsets within inner loops).

Benchdnn has identified some simple mods to speed up convolutions for ngchw data layout,
so some decent version should be added to the vanilla engine. (The impl might differ when
compiling for x86 / SX / Aurora chipsets). Maybe we can add it to the jit version too, to compare
speeds with jit impls. (?)
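To illustrate the kind of modification meant here, the sketch below is a direct forward convolution that hoists all offset arithmetic out of the inner loops and uses plain pointer indexing instead of per-element virtual offset calls. It assumes dense f32 tensors laid out as src [N][G][IC][IH][IW], weights [G][OC][IC][KH][KW] and dst [N][G][OC][OH][OW], with no bias or dilation; the struct and function names are hypothetical and are not the benchdnn implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical problem descriptor; IC and OC are per-group channel counts.
struct ConvShape {
    int N, G, IC, IH, IW;
    int OC, KH, KW;
    int SH, SW, PH, PW; // strides and zero-padding
    int OH() const { return (IH + 2 * PH - KH) / SH + 1; }
    int OW() const { return (IW + 2 * PW - KW) / SW + 1; }
};

std::vector<float> conv_fwd_plain(const ConvShape &p,
        const std::vector<float> &src, const std::vector<float> &wei) {
    const int OH = p.OH(), OW = p.OW();
    const std::size_t src_plane = std::size_t(p.IH) * p.IW;
    const std::size_t wei_plane = std::size_t(p.KH) * p.KW;
    const std::size_t dst_plane = std::size_t(OH) * OW;
    assert(src.size() == std::size_t(p.N) * p.G * p.IC * src_plane);
    assert(wei.size() == std::size_t(p.G) * p.OC * p.IC * wei_plane);
    std::vector<float> dst(std::size_t(p.N) * p.G * p.OC * dst_plane, 0.f);

    for (int n = 0; n < p.N; ++n)
    for (int g = 0; g < p.G; ++g)
    for (int oc = 0; oc < p.OC; ++oc) {
        // Base pointers are computed once per plane, not per element.
        float *d = dst.data()
                + ((std::size_t(n) * p.G + g) * p.OC + oc) * dst_plane;
        for (int ic = 0; ic < p.IC; ++ic) {
            const float *s = src.data()
                    + ((std::size_t(n) * p.G + g) * p.IC + ic) * src_plane;
            const float *w = wei.data()
                    + ((std::size_t(g) * p.OC + oc) * p.IC + ic) * wei_plane;
            for (int oh = 0; oh < OH; ++oh)
            for (int kh = 0; kh < p.KH; ++kh) {
                const int ih = oh * p.SH - p.PH + kh;
                if (ih < 0 || ih >= p.IH) continue; // row falls in padding
                for (int kw = 0; kw < p.KW; ++kw) {
                    const float wv = w[kh * p.KW + kw];
                    // Innermost loop runs over ow so it is the one a
                    // vectorizing compiler gets to work on.
                    for (int ow = 0; ow < OW; ++ow) {
                        const int iw = ow * p.SW - p.PW + kw;
                        if (iw < 0 || iw >= p.IW) continue;
                        d[oh * OW + ow] += s[ih * p.IW + iw] * wv;
                    }
                }
            }
        }
    }
    return dst;
}
```

Which loop ends up innermost is a per-platform choice; the ordering above is only one option and would need measuring on x86, SX and Aurora separately.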

At the very least, test times should be much improved - for SX, the full suite of gtests took
so long to run that I rarely ran it.
