Comments (5)
Interestingly, this fuses:
testRewrite3 :: Matrix -> Double
testRewrite3 (Matrix r c v) = U.sum . flip (\(r,c,v) j -> U.generate r (\i -> v `U.unsafeIndex` (j + i * c))) 0 $ (r,c,v)
from dh-core.
It seems that pattern matching is the cause?
This also fuses:
testRewrite4 :: Matrix -> Double
testRewrite4 m = U.sum . flip (\m1 j -> U.generate (rows m1) (\i -> (_vector m1) `U.unsafeIndex` (j + i * (cols m1)))) 0 $ m
Bottom line:
column :: Matrix -> Int -> Vector Double
column (Matrix r c v) j = U.generate r (\i -> v `U.unsafeIndex` (j + i * c))
{-# INLINE column #-}
column2 :: Matrix -> Int -> Vector Double
column2 m j = U.generate (rows m) (\i -> (_vector m) `U.unsafeIndex` (j + i * (cols m)))
{-# INLINE column2 #-}
testRewrite0 :: Matrix -> Double
testRewrite0 a = (U.sum . flip column2 0) a
testRewrite1 :: Matrix -> Double
testRewrite1 a = (U.sum . flip column 0) a
testRewrite0 fuses, testRewrite1 doesn't.
However, the fusion depends on the INLINE pragma, and I can't get M.column2 to inline across modules (that is, after moving the definition of column2 from the main file to Matrix.hs).
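For anyone who wants to reproduce the fusing variant in isolation, here is a minimal sketch. The `Matrix` record is an assumption (a row-major layout whose field names match the snippets above; the library's actual definition may differ), and `sumColumn` is a hypothetical name standing in for testRewrite0 with the column index as a parameter.

```haskell
import qualified Data.Vector.Unboxed as U

-- Assumed row-major matrix record. The field names mirror the snippets
-- above, but the library's real definition may differ.
data Matrix = Matrix
  { rows    :: !Int
  , cols    :: !Int
  , _vector :: !(U.Vector Double)
  }

-- Column j built with field accessors (the variant reported to fuse).
column2 :: Matrix -> Int -> U.Vector Double
column2 m j = U.generate (rows m) (\i -> _vector m `U.unsafeIndex` (j + i * cols m))
{-# INLINE column2 #-}

-- Sum of column j (hypothetical helper). When column2 is inlined and
-- fusion succeeds, no intermediate column vector should be allocated.
sumColumn :: Matrix -> Int -> Double
sumColumn m j = U.sum (column2 m j)
```

Whether fusion actually happened can be checked by adding a `main`, compiling with `ghc -O2 -ddump-simpl -dsuppress-all`, and looking at the Core for `sumColumn`: a fused version should be a single loop over the index with no new vector being built.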
The results are actually a bit disappointing; I'll keep digging.
Back to it! It turns out GHC works wonders with rewrite rules and inlining. For example, throwing in:
{-# INLINE [1] row #-}
{-# RULES
"row/fuse" forall c v r i. row (M.Matrix r c v) i = U.slice (c*i) c v
#-}
works amazingly well.
It gives us some nice things:
Say I have matrices A and B, and C = A*B.
I can now take the sum of a row of C without allocating the whole matrix, with a ~10x speedup on my machine.
I would expect to hit a point of diminishing returns though, so that remains to be explored.
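To make the row-of-C idea concrete, here is a rough sketch of summing one row of A*B without building C. It is not the library's multiplication code: the `Matrix` record is an assumed row-major layout, `row` is written directly as the slice that the rule above rewrites to, and `productEntry`, `rowOfProduct` and `sumRowOfProduct` are hypothetical helper names.

```haskell
import qualified Data.Vector.Unboxed as U

-- Assumed row-major matrix record; the library's definition may differ.
data Matrix = Matrix
  { rows    :: !Int
  , cols    :: !Int
  , _vector :: !(U.Vector Double)
  }

-- Row i as a slice of the underlying vector (no copy). This is the
-- right-hand side of the "row/fuse" rule, written out by hand.
row :: Matrix -> Int -> U.Vector Double
row m i = U.slice (cols m * i) (cols m) (_vector m)
{-# INLINE row #-}

-- Column j, generated element by element from the row-major storage.
column :: Matrix -> Int -> U.Vector Double
column m j = U.generate (rows m) (\i -> _vector m `U.unsafeIndex` (j + i * cols m))
{-# INLINE column #-}

-- Entry (i, j) of A*B: dot product of row i of A with column j of B.
productEntry :: Matrix -> Matrix -> Int -> Int -> Double
productEntry a b i j = U.sum (U.zipWith (*) (row a i) (column b j))
{-# INLINE productEntry #-}

-- Row i of A*B, computed on its own; the full product is never built.
rowOfProduct :: Matrix -> Matrix -> Int -> U.Vector Double
rowOfProduct a b i = U.generate (cols b) (productEntry a b i)
{-# INLINE rowOfProduct #-}

-- Sum of row i of A*B. With inlining and fusion, the row vector itself
-- may also be eliminated, but that depends on the optimiser.
sumRowOfProduct :: Matrix -> Matrix -> Int -> Double
sumRowOfProduct a b i = U.sum (rowOfProduct a b i)
```

Only the entries of the requested row are computed, so the work is proportional to one row of C rather than the whole matrix. How much of the remaining intermediate structure GHC removes is something to confirm in the Core output, not something this sketch guarantees.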