Comments (9)
I like Streamly! And it's easier to work with (IMO) than Streaming. I've used both and compared (a little) for my map-reduce-folds library. It often seems faster than Vector as well.
But Vectors have advantages for storage, right? At least in Frames, though that uses Storable vectors for columns, which is different from what you are talking about here.
from dh-core.
Closed by #54. Thanks to @Magalame!
Streamly seems to perform well compared to the other streaming libraries.
I think some operations that would constitute good benchmarks would be (==), update, and takeRows. They could be a starting point.
I wrote the following benchmark. So far two things are tested:
- merely comparing two frames, where Streamly seems to have a marginal advantage;
- takeRows, where Streamly seems to show some weakness.
```haskell
{-# LANGUAGE GADTs #-}

module Main where

import           Streamly
import qualified Streamly.Prelude as S
import           Control.Monad.Catch (MonadThrow (..))
import           Data.Function ((&))
import           Data.Traversable (sequence)
import           Control.Monad
import           Analyze.Common (Data (..), DuplicateKeyError (..), RowSizeMismatch (..))
import qualified Analyze as A
import qualified Analyze.Common as AC
import qualified Data.Vector as V
import           Data.Vector (Vector)
import qualified Data.Text as T
import           Data.Text (Text)
import           Data.HashMap.Strict (HashMap)
import qualified Data.HashMap.Strict as HM
import qualified Data.HashSet as HS
import           System.Random
import qualified System.Random.MWC as M
import qualified Criterion.Main as C
import           Control.DeepSeq (deepseq)

-- implementation details required

data RFrameUpdateS k v m where
  RFrameUpdateS :: MonadThrow m =>
    { _rframeUpdateKeys :: !(Vector k)
    , _rframeUpdateData :: !(SerialT m (Vector v))
    } -> RFrameUpdateS k v m

data RFrameS k v m where
  RFrameS :: MonadThrow m =>
    { _rframeKeys   :: !(Vector k)
    , _rframeLookup :: !(HashMap k Int)
    , _rframeData   :: !(SerialT m (Vector v))
    } -> RFrameS k v m

fromUpdate :: (Data k, MonadThrow m) => RFrameUpdateS k v m -> m (RFrameS k v m)
fromUpdate (RFrameUpdateS ks vs) = AC.checkForDupes ks >> pure (RFrameS ks (AC.makeLookup ks) vs)

update :: (Data k, MonadThrow m) => RFrameUpdateS k v m -> RFrameS k v m -> m (RFrameS k v m)
update (RFrameUpdateS uks uvs) (RFrameS fks _ fvs) = do
  fSize <- S.length fvs
  uSize <- S.length uvs
  if fSize /= uSize
    then throwM (RowSizeMismatch fSize uSize)
    else do
      AC.checkForDupes uks
      let kis   = AC.mergeKeys fks uks
          ks'   = (\(k, _, _) -> k) <$> kis
          look' = AC.makeLookup ks'
          vs'   = S.zipWith (AC.runIndexedLookup kis) fvs uvs
      return (RFrameS ks' look' vs')

-- only concerned with generating data

n :: Int
n = 1000

testKeys :: IO (Vector Text)
testKeys = V.replicateM n $ liftM (T.pack . take 10 . randomRs ('a', 'z')) newStdGen

testData :: IO (Vector (Vector Double))
testData = do
  gen <- M.create
  V.replicateM n $ M.uniformVector gen n

testDataS :: IO (SerialT IO (Vector Double))
testDataS = do
  vec <- testData
  return $ S.fromFoldable vec

-- the actual benchmarks

cmprVec :: IO Bool
cmprVec = do
  keys <- testKeys
  dat  <- testData
  let update = A.RFrameUpdate keys dat
  frame1 <- A.fromUpdate update
  frame2 <- A.fromUpdate update
  return $ frame1 == frame2

cmprStream :: IO Bool
cmprStream = do
  keys <- testKeys
  dat  <- testDataS
  let update = RFrameUpdateS keys dat
  frame1 <- fromUpdate update
  frame2 <- fromUpdate update
  let sameKeys   = _rframeKeys frame1 == _rframeKeys frame2
      sameLookup = _rframeLookup frame1 == _rframeLookup frame2
      dat1       = _rframeData frame1
      dat2       = _rframeData frame2
  sameDat <- S.zipWith (==) dat1 dat2 & S.notElem False
  return $ sameKeys && sameLookup && sameDat

cmpr f1 f2 = do
  let sameKeys   = _rframeKeys f1 == _rframeKeys f2
      sameLookup = _rframeLookup f1 == _rframeLookup f2
      dat1       = _rframeData f1
      dat2       = _rframeData f2
  sameDat <- S.zipWith (==) dat1 dat2 & S.notElem False
  return $ sameKeys && sameLookup && sameDat

takeRowsVec :: IO [Bool]
takeRowsVec = do
  keys <- testKeys
  dat  <- testData
  let update = A.RFrameUpdate keys dat
  frame <- A.fromUpdate update
  let postTake = zipWith A.takeRows (div n <$> [5, 4, 3, 2, 1]) (repeat frame)
  return $ zipWith (==) (repeat frame) postTake

-- | Takes the first 'n' rows of an 'RFrameS'.
takeRows n (RFrameS ks look srm) = RFrameS ks look (S.take n srm)

takeRowsS :: IO [Bool]
takeRowsS = do
  keys <- testKeys
  dat  <- testDataS
  let update = RFrameUpdateS keys dat
  frame <- fromUpdate update
  let postTake = zipWith takeRows (div n <$> [5, 4, 3, 2, 1]) (repeat frame)
  sequence $ zipWith cmpr (repeat frame) postTake

main :: IO ()
main = C.defaultMain
  [ C.bgroup "Tests"
      [ C.bench "Vec"            $ C.whnfIO cmprVec
      , C.bench "Stream"         $ C.whnfIO cmprStream
      , C.bench "takeRowsVec"    $ C.whnfIO takeRowsVec
      , C.bench "takeRowsStream" $ C.whnfIO takeRowsS
      ]
  ]
```
I get this (at best a 5% improvement, in the worst case a 20% decrease in performance):
```
benchmarking Tests/Vec
time                 50.78 ms   (48.75 ms .. 52.79 ms)
                     0.997 R²   (0.994 R² .. 0.999 R²)
mean                 51.72 ms   (50.33 ms .. 53.85 ms)
std dev              3.417 ms   (1.287 ms .. 5.131 ms)
variance introduced by outliers: 22% (moderately inflated)

benchmarking Tests/Stream
time                 49.05 ms   (47.58 ms .. 50.81 ms)
                     0.997 R²   (0.994 R² .. 0.999 R²)
mean                 49.30 ms   (48.12 ms .. 50.91 ms)
std dev              2.704 ms   (1.691 ms .. 4.099 ms)
variance introduced by outliers: 14% (moderately inflated)

benchmarking Tests/takeRowsVec
time                 41.39 ms   (39.96 ms .. 42.73 ms)
                     0.995 R²   (0.986 R² .. 0.999 R²)
mean                 44.10 ms   (42.83 ms .. 46.00 ms)
std dev              3.021 ms   (1.645 ms .. 4.291 ms)
variance introduced by outliers: 20% (moderately inflated)

benchmarking Tests/takeRowsStream
time                 51.69 ms   (50.14 ms .. 53.24 ms)
                     0.997 R²   (0.993 R² .. 0.999 R²)
mean                 54.46 ms   (53.17 ms .. 56.48 ms)
std dev              3.000 ms   (2.079 ms .. 4.349 ms)
variance introduced by outliers: 14% (moderately inflated)
```
It is very likely that I'm doing something wrong, though; I'm just not sure where.
It also seems that Streamly makes the interface inherently more complicated. For example, when comparing two RFrames, (==) becomes useless, and we can't just derive (Eq) our way out of it.
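To illustrate that last point, here is a minimal base-only model (a hypothetical stand-in, not the actual RFrameS type): once the row data lives inside a monad, equality can only be decided by running effects, so it cannot have the pure type a derived Eq instance requires.

```haskell
module Main where

-- Hypothetical stand-in for a stream-backed frame: the rows live inside m,
-- so there is no pure (==) to derive; comparison must itself run in m.
newtype FrameS m = FrameS { frameRows :: m [Double] }

eqFrameS :: Monad m => FrameS m -> FrameS m -> m Bool
eqFrameS a b = (==) <$> frameRows a <*> frameRows b

main :: IO ()
main = do
  let f1 = FrameS (pure [1, 2, 3]) :: FrameS IO
      f2 = FrameS (pure [1, 2, 3])
  eqFrameS f1 f2 >>= print  -- prints True
```

This is exactly the shape of the `cmpr` helper in the benchmark above: a monadic comparison replacing a derived instance.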
Thank you @Magalame! Well, a 10% average speed improvement is already pretty cool, but I think the real selling point of streaming would be if the resulting program used constant (with respect to the stream length) memory. Could you try adding a weigh benchmark as well?
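For context, the number weigh reports is bytes allocated. A base-only sketch of that measurement (not weigh's API; the `allocatedBy` helper is my own name, and GHC's RTS only populates these stats when the program runs with `+RTS -T`):

```haskell
module Main where

import Data.Word (Word64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Measure the bytes allocated by an IO action when RTS stats are enabled
-- (i.e. the program was started with +RTS -T); Nothing otherwise.
allocatedBy :: IO a -> IO (Maybe Word64)
allocatedBy act = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then act >> pure Nothing
    else do
      before <- allocated_bytes <$> getRTSStats
      _      <- act
      after  <- allocated_bytes <$> getRTSStats
      pure (Just (after - before))

main :: IO ()
main = allocatedBy (pure (sum [1 .. 1000 :: Int])) >>= print
```

The weigh library does the same kind of before/after bookkeeping for you and formats the results as a table.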
You're very welcome @ocramz !
Eventually I rethought my benchmarking process, and unsurprisingly it was flawed. I modified it so that the test data is created only once at runtime (using unsafePerformIO), and I removed the cost of creating that data from the benchmark. We actually end up with some nice surprises:
```
benchmarking Tests/Vec                 -- compares two identical frames
time                 8.207 ms   (8.065 ms .. 8.324 ms)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 8.297 ms   (8.170 ms .. 8.524 ms)
std dev              471.1 μs   (178.4 μs .. 761.0 μs)
variance introduced by outliers: 29% (moderately inflated)

benchmarking Tests/Stream              -- same here, but with Streamly as backend
time                 5.637 ms   (5.598 ms .. 5.692 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 5.643 ms   (5.619 ms .. 5.672 ms)
std dev              84.95 μs   (63.52 μs .. 115.0 μs)

benchmarking Tests/takeRowsVec 1000    -- taking 1000 rows out of a frame with 1000 rows
time                 7.628 ms   (7.567 ms .. 7.683 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 7.752 ms   (7.698 ms .. 7.840 ms)
std dev              196.6 μs   (117.4 μs .. 287.3 μs)

benchmarking Tests/takeRowsVec 500
time                 4.044 ms   (4.020 ms .. 4.068 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 4.066 ms   (4.047 ms .. 4.089 ms)
std dev              65.60 μs   (48.28 μs .. 96.86 μs)

benchmarking Tests/takeRowsVec 200
time                 1.981 ms   (1.851 ms .. 2.081 ms)
                     0.989 R²   (0.986 R² .. 0.998 R²)
mean                 1.886 ms   (1.861 ms .. 1.915 ms)
std dev              96.98 μs   (73.04 μs .. 131.7 μs)
variance introduced by outliers: 36% (moderately inflated)

benchmarking Tests/takeRowsVec 100
time                 1.337 ms   (1.328 ms .. 1.349 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 1.351 ms   (1.342 ms .. 1.362 ms)
std dev              34.67 μs   (27.39 μs .. 47.55 μs)
variance introduced by outliers: 15% (moderately inflated)

benchmarking Tests/takeRowsStream 1000 -- same as above but with Streamly
time                 5.912 ms   (5.885 ms .. 5.933 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 6.005 ms   (5.974 ms .. 6.056 ms)
std dev              121.0 μs   (89.25 μs .. 185.5 μs)

benchmarking Tests/takeRowsStream 500
time                 3.208 ms   (3.192 ms .. 3.221 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 3.213 ms   (3.201 ms .. 3.226 ms)
std dev              39.64 μs   (28.16 μs .. 55.11 μs)

benchmarking Tests/takeRowsStream 200
time                 1.525 ms   (1.510 ms .. 1.539 ms)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 1.651 ms   (1.618 ms .. 1.694 ms)
std dev              128.2 μs   (104.1 μs .. 168.8 μs)
variance introduced by outliers: 58% (severely inflated)

benchmarking Tests/takeRowsStream 100
time                 996.8 μs   (990.1 μs .. 1.005 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 1.005 ms   (998.2 μs .. 1.013 ms)
std dev              24.47 μs   (20.43 μs .. 31.05 μs)
variance introduced by outliers: 13% (moderately inflated)
```
So Streamly seems to be on average 20% quicker. However, it seems to consume more memory:
With a 500-row frame:

```
Case                 Allocated  GCs
Vec                    900,920    0
Stream                 971,784    0
takeRowsVec 500        501,968    0
takeRowsVec 200        504,712    0
takeRowsVec 100        506,240    0
takeRowsStream 500     605,368    0
takeRowsStream 200     546,632    0
takeRowsStream 100     519,648    0
```

With a 1000-row frame:

```
Case                 Allocated  GCs
Vec                  1,873,304    1
Stream               2,004,552    1
takeRowsVec 500      1,040,648    1
takeRowsVec 200      1,046,944    1
takeRowsVec 100      1,046,832    1
takeRowsStream 500   1,151,240    1
takeRowsStream 200   1,083,864    1
takeRowsStream 100   1,062,608    1
```

With a 1500-row frame:

```
Case                 Allocated  GCs
Vec                  2,864,264    2
Stream               3,046,216    2
takeRowsVec 500      1,582,736    1
takeRowsVec 200      1,587,816    1
takeRowsVec 100      1,589,776    1
takeRowsStream 500   1,684,936    1
takeRowsStream 200   1,631,704    1
takeRowsStream 100   1,608,144    1
```
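Incidentally, the "create the test data only once" change mentioned above can be sketched like this (base only; `sharedTestData` and `genCount` are illustrative names, and the NOINLINE pragma is what keeps GHC from inlining and re-running the unsafePerformIO):

```haskell
module Main where

import Data.IORef (IORef, newIORef, readIORef, atomicModifyIORef')
import System.IO.Unsafe (unsafePerformIO)

-- Counts how many times generation actually runs (for demonstration only).
{-# NOINLINE genCount #-}
genCount :: IORef Int
genCount = unsafePerformIO (newIORef 0)

-- A top-level CAF: the IO action runs at most once, on first use,
-- and every benchmark then shares the same cached value.
{-# NOINLINE sharedTestData #-}
sharedTestData :: [Double]
sharedTestData = unsafePerformIO $ do
  atomicModifyIORef' genCount (\c -> (c + 1, ()))
  pure (map fromIntegral [1 .. 1000 :: Int])  -- imagine expensive random generation here

main :: IO ()
main = do
  print (length sharedTestData)  -- forces generation
  print (length sharedTestData)  -- reuses the cached value
  readIORef genCount >>= print   -- generation ran exactly once
```

This way criterion times only the frame operations, not the data generation.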
I took a look at the original repo and stumbled upon ejconlon/analyze#3, in case it's of any relevance here.
> Streamly seems to be on average 20% quicker. However, it seems to consume more memory

This is where it gets tricky, and possibly more eyes on the problem would be beneficial. @Magalame, could you send in your benchmark code as a PR? Thanks!