fpco / weigh

Measure allocations of Haskell functions/values

License: BSD 3-Clause "New" or "Revised" License

Haskell 100.00%

weigh's Introduction

weigh

Measures the memory usage of a Haskell value or function

Limitations

⚠️ Turn off the -threaded flag, otherwise it will cause inconsistent results.

Example use

import Weigh

main :: IO ()
main =
  mainWith
    (do func "integers count 0" count 0
        func "integers count 1" count 1
        func "integers count 10" count 10
        func "integers count 100" count 100)
  where
    count :: Integer -> ()
    count 0 = ()
    count a = count (a - 1)

Output results:

Case	Allocated	GCs
integers count 0	16	0
integers count 1	88	0
integers count 10	736	0
integers count 100	7,216	0

By default the output is a plain-text table; pass --markdown to get Markdown output like the above.

weigh's People

Contributors

Stargazers

Watchers

Forkers

cocreature mitchellwrosen marcelinevq futtetennista acowley jean-lopes andrewthad lpaulmp agrue psibi andrewdmeier migamake lehins teofilc locallycompact input-output-hk

weigh's Issues

mainWith cannot be used more than once

This program

main :: IO ()
main = do
  mainWith someWeighAction1
  mainWith someWeighAction2

will crash with

No such case!

I know, it is rather obvious and the name kind of implies it should be used only once (main, duh). I am not sure what to do with it, perhaps the docs can be updated.

I just wanted to leave it here.

Can weigh be added to stackage?

Measured bytes differs from what is reported by a cost center

I'm working with the following:

module Main (main) where

import Weigh
import Data.IORef.Unboxed hiding (Counter)

main :: IO ()
main = do
  refu <- newIORefU (0 :: Int)
  mainWith $ do
    io "modifyIORefU" (\r -> {-# SCC modify #-} modifyIORefU r (+ 1)) refu

weigh reports that modifyIORefU allocates 48 bytes. However, if I actually run with --case modifyIORefU foo +RTS -T -Pa, GHC tells me that modify (the SCC) allocates 0 bytes. I'm not sure which to believe, but I'm more inclined to believe GHC. I'm working with unboxed IO refs (from unboxed-refs), so I don't believe I'm left with unevaluated thunks - being unboxed the IO ref is naturally strict in what it contains.

As an aside, interestingly just adding a {-# SCC #-} causes the reported allocations to go up, so I'm probably measuring the cost of profiling. Maybe a limitation of the library.

API and versioning

With the 0.0.4 release two weeks ago, the types of a few functions (e.g. weighFunc) changed, but only a minor version bump was made.

(If nothing else, this breaks the Stackage LTS release guarantees.)

Why a direct build-depends on TemplateHaskell?

Hi,

Why does weigh have a build dependency on TemplateHaskell? It already depends on it transitively through temporary, which depends on exceptions, which depends on TH, but I can't see any TH-related code in weigh...

Not reproducible

Trying to use the internals of weigh as a library, I have the following:

{-# LANGUAGE GADTs, RankNTypes #-}

import Weigh (weighFunc)
import Control.DeepSeq (NFData)
import Data.Int (Int64)

-- | The results from measuring memory usage.
data GetWeight where
  GetWeight :: forall a b. (NFData b) => (a -> b) -> a -> GetWeight

runGetWeight :: GetWeight -> IO Weight
runGetWeight (GetWeight f a) = (\(b,gc,_,_) -> Weight b gc) <$> weighFunc f a

data Weight = Weight { bytesAlloc :: !Int64
                     , numGC      :: !Int64
                     }
            deriving (Eq, Ord, Show, Read)

Trying to use this in ghci:

λ> import qualified Data.ByteString as SB
λ> let w = GetWeight (SB.length . SB.pack) (replicate 1000000 0)
λ> runGetWeight w
Weight {bytesAlloc = 57000096, numGC = 1}
λ> runGetWeight w
Weight {bytesAlloc = 1001280, numGC = 0}
λ> runGetWeight w
Weight {bytesAlloc = 1001280, numGC = 0}
λ> runGetWeight w
Weight {bytesAlloc = 1001472, numGC = 0}
λ> runGetWeight w
Weight {bytesAlloc = 1001280, numGC = 0}
λ> runGetWeight w
Weight {bytesAlloc = 1001280, numGC = 0}

(The 1001472 case is odd, but this exact chain of results is reproducible.)

My best guess is that this is due to caching or something similar. I actually see it when trying to use the snippet above after running tests or criterion benchmarks; they don't refer to the GetWeight type, but are constructed from the same functions and values that are passed to it.

Doesn't build with GHC head

Fails with:

src/Weigh/GHCStats.hs:19:22: error:
    • Not in scope: type constructor or class ‘GHC.Stats.GCStats’
      Perhaps you meant one of these:
        ‘GHC.Stats.RTSStats’ (imported from GHC.Stats),
        ‘GHC.Stats.GCDetails’ (imported from GHC.Stats)
      Module ‘GHC.Stats’ does not export ‘GCStats’.
    • In the Template Haskell quotation ''GHC.Stats.GCStats
      In the untyped splice:
        $(do info <- reify ''GHC.Stats.GCStats
             case info of
               TyConI (DataD _ _ _ _ [RecC _ fields] _)
                 -> do ...
                 where
                     headerSize = ...
                     ....
               _ -> fail
                      ("Unexpected shape of GCStats data type. "
                         ++ "Please report this as a bug, this function "
                         ++ "needs to be updated to the newer GCStats type."))
   |
19 |   $(do info <- reify ''GHC.Stats.GCStats
   |                      ^^^^^^^^^^^^^^^^^^^

descriptions of weighFunc and weighAction are the same

I suspect weighAction might actually be for weighing IO actions instead.

Organize Weigh measures

Hi,

I want to benchmark the size of graph libraries. I have a lot of benchmarks to run and I want to organize them. I cannot organize them "by hand" as in the README example, because I want my tests to live in different files (one file per library).

Is there a way to handle this case? I like the way Criterion handles it (with bgroup), because afterwards I can navigate through the groups and, for example, sort them by name. I don't know whether this is possible with the current implementation of Weigh...

Meaning of "Live" column different for GHC versions 8.2 and later

Great library, very useful! I was trying to use it to measure the size of a large data structure, but the value in the "Live" column was much larger than I expected. Browsing the repo history, I saw that when weigh was updated to support GHC 8.2, the source of this value was changed from currentBytesUsed in GCStats to cumulative_live_bytes in RTSStats. However, the former represents "Number of live bytes at the end of the last major GC" while the latter represents "Sum of live bytes across all major GCs". So newer versions of GHC will report larger values (depending on the number of major GCs). e.g. I have an example that produces this for GHC 8.0.2:

Case   Allocated  GCs        Live         Max
blah  80,000,000  153  39,999,688  39,999,688

And this for GHC 8.2.2:

Case   Allocated  GCs         Live         Max
blah  80,000,000   76  126,204,912  39,999,752

(I can share the source if it's helpful). Both interpretations could be useful, but to be consistent with older versions of both GHC and weigh it seems like the value should instead come from gcdetails_live_bytes for the most recent GCDetails (i.e. gcdetails_live_bytes . gc). When I make this change the "Live" and "Max" columns have the same value again in GHC 8.2.

If you agree with this change I can prepare a PR.
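
The difference between the three readings can be illustrated with a small sketch. It assumes a GHC that provides RTSStats (base 4.10 or later) and a program run with +RTS -T. LiveSummary, summarise and fromRTS are hypothetical names introduced here; they are not part of weigh.

```haskell
import GHC.Stats
  ( GCDetails (..)
  , RTSStats (..)
  , getRTSStats
  , getRTSStatsEnabled
  )
import System.Mem (performGC)

-- Hypothetical summary type; not part of weigh.
data LiveSummary = LiveSummary
  { cumulativeLive :: Integer -- sum of live bytes over all major GCs
  , lastLive       :: Integer -- live bytes at the most recent GC
  , maxLive        :: Integer -- largest live-bytes sample seen
  } deriving (Eq, Show)

-- | Pure model: one live-bytes sample per major GC, oldest first.
summarise :: [Integer] -> LiveSummary
summarise samples = LiveSummary
  { cumulativeLive = sum samples
  , lastLive       = if null samples then 0 else last samples
  , maxLive        = maximum (0 : samples)
  }

-- | The same three readings taken from the RTS statistics record.
fromRTS :: RTSStats -> LiveSummary
fromRTS s = LiveSummary
  { cumulativeLive = toInteger (cumulative_live_bytes s)
  , lastLive       = toInteger (gcdetails_live_bytes (gc s))
  , maxLive        = toInteger (max_live_bytes s)
  }

main :: IO ()
main = do
  print (summarise [10, 30, 20])
  enabled <- getRTSStatsEnabled
  if enabled
    then performGC >> getRTSStats >>= print . fromRTS
    else putStrLn "RTS stats disabled; rerun with +RTS -T"
```

In the pure model the cumulative figure grows with every major GC, while the last and maximum samples do not, which matches the gap between the "Live" and "Max" columns reported for GHC 8.2.2.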

Is it possible to add some sort of "max residency" statistics?

As I understand it, the library currently displays the total number of allocated bytes. With a streaming approach this number may be misleading: it can look as if memory consumption is significant while max residency is in fact still low. Displaying max residency alongside the information weigh already shows might help detect memory leaks.

Problem when running Actions with the same name in different groups

Hi,

Currently, if you try to run a Weigh like:

main :: IO ()
main = mainWith
    (wgroup "parent" $ do
      wgroup "fst" (func "id" id (1::Int) )
      wgroup "snd" (func "id" nub ([1,2,3,4,5,1]::[Int])))

The result will be the same for both groups:

parent

  fst

    Case  Allocated  GCs
    id            0    0
  
  snd
  
    Case  Allocated  GCs
    id            0    0

But if I comment out the first wgroup, I get:

parent

  snd
  
    Case  Allocated  GCs
    id          496    0

I investigated, and found that the WEIGH_CASE environment variable is set with the actionName field, which is the same (there is no prefix added to it), so I presume it is always the same benchmark returned by gelem.

It seems that there is a duplication of information between the actionName record and the name of a Singleton (and the Singleton field is set with a correct prefix :) )
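
One way to avoid the clash is to key each case by its full group path. The sketch below uses a simplified stand-in for the grouping structure (the Grouped type and qualifiedNames are hypothetical and do not mirror weigh's real definitions) to show how prefixed names stay unique:

```haskell
import Data.List (intercalate)

-- Simplified stand-in for a grouping structure; not weigh's real type.
data Grouped a
  = Grouping String [Grouped a]
  | Singleton String a

-- | Flatten the tree into (qualified name, payload) pairs,
-- joining the enclosing group names with '/'.
qualifiedNames :: Grouped a -> [(String, a)]
qualifiedNames = go []
  where
    -- 'path' holds the enclosing group names, innermost first.
    go path (Singleton name x) =
      [(intercalate "/" (reverse (name : path)), x)]
    go path (Grouping name kids) =
      concatMap (go (name : path)) kids

example :: Grouped Int
example =
  Grouping "parent"
    [ Grouping "fst" [Singleton "id" 1]
    , Grouping "snd" [Singleton "id" 2]
    ]

main :: IO ()
main = mapM_ print (qualifiedNames example)
```

With qualified names, the two "id" cases become "parent/fst/id" and "parent/snd/id", so a lookup by name can no longer return the wrong benchmark.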

Inaccurate GHCStats size for 32 bit arch

I see some hard-coded 8s in Weigh.GHCStats that assume a 64 bit architecture. I can think of two ways to fix this:

1. Replace 8 with SIZEOF_HSWORD
2. Replace the generic traversal with something like:

-- Get the size of an arbitrary closure
sizeOf :: a -> Int64
sizeOf x =
  case unpackClosure# x of
    (# _, xs, ys #) ->
      case SIZEOF_HSWORD of
        I# wordSize ->
          I64# (wordSize +# (wordSize *# (sizeofArray# xs)) +# sizeofByteArray# ys)

ghcStatsSizeInBytes :: Int64
ghcStatsSizeInBytes = $([| sizeOf $! GCStats 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |])
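
For the first option, the word size can also be obtained without CPP. The following is a minimal sketch using only base; closureBytes is a hypothetical helper that assumes one header word and exactly one word per field, which may not hold for every field type.

```haskell
import Data.Bits (finiteBitSize)
import Foreign.Storable (sizeOf)

-- | Size of a machine word in bytes on the platform this was built for
-- (4 on typical 32-bit targets, 8 on typical 64-bit targets).
wordSizeBytes :: Int
wordSizeBytes = sizeOf (0 :: Word)

-- | Hypothetical helper: bytes used by a closure with one header word
-- and the given number of word-sized fields.
closureBytes :: Int -> Int
closureBytes fields = wordSizeBytes * (1 + fields)

main :: IO ()
main = do
  putStrLn ("word size in bytes: " ++ show wordSizeBytes)
  putStrLn ("word size in bits:  " ++ show (finiteBitSize (0 :: Word)))
  putStrLn ("closure with 2 fields: " ++ show (closureBytes 2) ++ " bytes")
```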

Tests fail on 32bit machines

Tests fail on 32bit machines with the following error:

Running 1 test suites...
Test suite weigh-test: RUNNING...

Case                             Allocated  GCs
integers count 0                        72    0
integers count 1                        88    0
integers count 2                       104    0
integers count 3                       120    0
integers count 10                      232    0
integers count 100                   1,672    0
integers count IO CAF 0                 80    0
integers count IO func 0                84    0
integers count IO CAF 1                 96    0
integers count IO func 1               100    0
ints count 1                            72    0
ints count 10                           72    0
ints count 1000000                      72    0
\_ -> IntegerStruct 0 0                 72    0
\x -> IntegerStruct x 0                 84    0
\x -> IntegerStruct x x                 84    0
\x -> IntegerStruct (x+1) x             92    0
\x -> IntegerStruct (x+1) (x+1)         92    0
\x -> IntegerStruct (x+1) (x+2)        100    0
\x -> HasInt x                          80    0
\x -> HasUnpacked (HasInt x)            80    0
\x -> HasPacked (HasInt x)             100    0

Check problems:
  ints count 1
    Allocated bytes exceeds 0: 72
  ints count 10
    Allocated bytes exceeds 0: 72
  ints count 1000000
    Allocated bytes exceeds 0: 72
Test suite weigh-test: FAIL

This used to work, so I tried git bisect and ended up with commit cce0faa. Reverting that commit makes the tests pass again.

Export reportGroup

Hi,

I would love to use the results provided by weighResults in much the same way reportGroup does. Do you think it could be exported?
The only problem I see is that this would also expose the Config and Format data structures, which could expand the export list considerably.

(-Edit-) Or do something like the setColumns function, exposing only Format.

Or maybe use separate modules, to keep the formatting exports distinct from the rest?

Build failures in weigh-0.0.1 to weigh-0.0.5 with recent GHC versions

src/Weigh/GHCStats.hs:21:22: error:
    • Not in scope: type constructor or class ‘GHC.Stats.GCStats’
      Perhaps you meant one of these:
        ‘GHC.Stats.RTSStats’ (imported from GHC.Stats),
        ‘GHC.Stats.GCDetails’ (imported from GHC.Stats)
      Module ‘GHC.Stats’ does not export ‘GCStats’.
    • In the Template Haskell quotation ''GHC.Stats.GCStats
      In the untyped splice:
        $(do info <- reify ''GHC.Stats.GCStats
             case info of
               TyConI (DataD _ _ _ _ [RecC _ fields] _)
                 -> do ...
                 where
                     headerSize = ...
                     ....
               _ -> fail
                      ("Unexpected shape of GCStats data type. "
                         ++ "Please report this as a bug, this function "
                         ++ "needs to be updated to the newer GCStats type."))
   |
21 |   $(do info <- reify ''GHC.Stats.GCStats
   |                      ^^^^^^^^^^^^^^^^^^^

As a Hackage trustee, I have revised these versions to add bounds base < 4.11. See e.g. https://hackage.haskell.org/package/weigh-0.0.5/revisions/.

0.0.6 Fails to compile with GHC-7.8

src/Weigh/GHCStats.hs:62:3: Not in scope: ‘pure’

I made a revision on Hackage: http://hackage.haskell.org/package/weigh-0.0.6/revisions/

Discovered in well-typed/generics-sop#60
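
On base older than 4.8 (GHC 7.8 and earlier) the Prelude does not export pure, so it has to be imported explicitly. A minimal sketch of the kind of import that resolves this error; constantSize is a hypothetical example value, not code from weigh:

```haskell
-- Needed on base < 4.8, where the Prelude does not export 'pure';
-- still valid on later versions.
import Control.Applicative (pure)

-- Hypothetical value that only needs 'pure'.
constantSize :: Maybe Int
constantSize = pure 8

main :: IO ()
main = print constantSize
```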

Build failures with mtl-2.3

src/Weigh.hs:161:3: error:
    Variable not in scope: unless :: Bool -> IO () -> IO a0
    |
161 |   unless
    |   ^^^^^^

As a Hackage trustee I have revised the affected versions on Hackage. See e.g. https://hackage.haskell.org/package/weigh-0.0.16/revisions/.
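
This error is consistent with mtl-2.3 no longer re-exporting Control.Monad from its modules, so code that picked up unless through an mtl import needs to import it directly. A minimal sketch using only base; checkUnique is a hypothetical function, not part of weigh:

```haskell
import Control.Monad (unless)
import Data.List (nub)

-- | Hypothetical check: fail when two cases share a name.
checkUnique :: [String] -> Either String ()
checkUnique names =
  unless (length (nub names) == length names) (Left "duplicate case name")

main :: IO ()
main = do
  print (checkUnique ["a", "b"])
  print (checkUnique ["id", "id"])
```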

Can't build test suite with ghc 8.4 due to bytestring-trie

Adding weigh to skipped-tests on Stackage, due to this build failure with bytestring-trie. I can't find anywhere to report it so here's the error in case anyone knows how to pass it along to the right person.

[4 of 6] Compiling Data.Trie.Internal ( src/Data/Trie/Internal.hs, .stack-work/dist/x86_64-osx/Cabal-2.2.0.0/build/Data/Trie/Internal.o )

/Users/dan/scratch/bytestring-trie-0.2.4.1/src/Data/Trie/Internal.hs:306:10: error:
    • Could not deduce (Semigroup (Trie a))
        arising from the superclasses of an instance declaration
      from the context: Monoid a
        bound by the instance declaration
        at src/Data/Trie/Internal.hs:306:10-38
    • In the instance declaration for ‘Monoid (Trie a)’
    |
306 | instance (Monoid a) => Monoid (Trie a) where
    |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
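
Since GHC 8.4 (base 4.11), Semigroup is a superclass of Monoid, so every Monoid instance needs a matching Semigroup instance. The sketch below shows the general shape of the fix on a toy type; Bag is a hypothetical stand-in and is not the Trie type from bytestring-trie, whose instance also carries a Monoid a context.

```haskell
-- Toy container used only to illustrate the instance layout.
newtype Bag a = Bag [a]
  deriving (Eq, Show)

-- The combining operation now lives in Semigroup.
instance Semigroup (Bag a) where
  Bag xs <> Bag ys = Bag (xs ++ ys)

-- Monoid only adds the identity; mappend defaults to (<>).
instance Monoid (Bag a) where
  mempty = Bag []

main :: IO ()
main = print (mconcat [Bag [1, 2], Bag [3 :: Int]])
```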

Getting insanely high number of garbage collections

Here is a benchmark I wrote for my MMark markdown processor:

module Main (main) where

import Weigh
import qualified Data.Text.IO as T
import qualified Text.MMark   as MMark

main :: IO ()
main = mainWith $ do
  setColumns [Case, Allocated, GCs, Max]
  bparser "data/bench-paragraph.md"

----------------------------------------------------------------------------
-- Helpers

bparser
  :: FilePath          -- ^ File from which the input has been loaded
  -> Weigh ()
bparser path = action name (p <$> T.readFile path)
  where
    name = "with file: " ++ path
    p    = MMark.parse path

When I run it, I get the following result:

Case                                Allocated            GCs     Max
with file: data/bench-paragraph.md  4,102,096  4,294,967,299  46,160

The values under the "Allocated" and "Max" columns look realistic, but "GCs" is the number of times garbage collection has happened, right? And it says GC has been run more than 4 billion times? Looks like Int overflow or something to me.

I also benchmarked the same thing with Criterion, which shows that parsing that paragraph takes just 758.9 μs; the Haskell runtime cannot possibly have performed 4,294,967,299 garbage collections in that time.
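
A quick piece of arithmetic supports the overflow suspicion: the reported count is exactly 2^32 + 3. The sketch below splits a 64-bit value into its high and low 32-bit halves; splitWords is a hypothetical helper, and reading the low half as the real GC count is an assumption rather than a confirmed diagnosis.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word64)

-- | Split a 64-bit value into (high 32 bits, low 32 bits).
splitWords :: Word64 -> (Word64, Word64)
splitWords n = (n `shiftR` 32, n .&. 0xFFFFFFFF)

main :: IO ()
main = do
  -- The value from the "GCs" column above.
  print (splitWords 4294967299)
  -- For comparison: 2^32 itself.
  print (splitWords (2 ^ (32 :: Int)))
```

The low half of 4,294,967,299 is 3, a plausible number of collections for a run this short, with a single stray bit set in the high half.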

Settings file does not exist, and empty results

I'm not able to obtain any result using weigh 0.0.16 provided by nix.

I'm using the following nix file to setup ghc 8.6.5 and weigh 0.0.16:

weigh-bug-report.nix

with import (builtins.fetchTarball {
url = https://github.com/NixOS/nixpkgs/archive/3140fa89c51.tar.gz;
sha256 = "18p0d5lnfvzsyfah02mf6bi249990pfwnylwhqdh8qi70ncrk3f8";
}) {};
mkShell {
  buildInputs = [ (pkgs.haskellPackages.ghcWithPackages(p : [
    (
      p.callHackageDirect {
        pkg = "weigh";
        ver = "0.0.16";
        sha256 = "0icdyvxxi7493ch8xlpwn024plspbsdssxmcy5984yar298z8hcw";
      } {}
    )
  ]))];
}

Then, the following invocations:

$ nix-shell ./weigh-bug-report.nix

[nix-shell:~]$ ghc-pkg list | grep weigh
    weigh-0.0.16

[nix-shell:~]$ ghci
GHCi, version 8.6.5: http://www.haskell.org/ghc/  :? for help
Prelude> import Weigh

-- `mainWith` is failing with a surprising error

Prelude Weigh> mainWith $ do func "bar" sum [(1 :: Int)..10]
*** Exception: Error in case ("/bar"):
  /nix/store/jxrl0fcihi2vpxyf0makql4208ggx3fs-ghc-8.6.5/lib/ghc-8.6.5/lib/settings: openFile: does not exist (No such file or directory)

CallStack (from HasCallStack):
  error, called at src/Weigh.hs:384:12 in weigh-0.0.16-GXMBccy3Rx4880KgeLr7zE:Weigh

-- Note that other invocations do not fail, but return nothing:

Prelude Weigh> mainWith $ do func "bar" sum [(1 :: Int)..10]
Prelude Weigh> weighResults $ func "bar" sum [1..10]
([],Config {configColumns = [Case,Allocated,GCs], configPrefix = "", configFormat = Plain})

As you can see, it fails with a "file does not exist" error. It may be due to the non-standard file layout of nix.

Maximum residency isn't right

I did a benchmark like this:

strictLength :: FilePath -> IO Int
strictLength fname = do
  !bs <- BS.readFile fname
  return $ BS.length bs

conduitLength :: FilePath -> IO Int
conduitLength fname = runConduitRes $ sourceFile fname .| sumCo 0

sumCo :: Monad m => Int -> ConduitT BS.ByteString o m Int
sumCo !acc = do
  val :: Maybe ByteString <- await
  case val of
    Just v -> sumCo (acc + (BS.length v))
    Nothing -> pure acc

This was the result that was being shown:

IO Function

  Case              Allocated     Max    Live    GCs
  strict read   2,147,506,008   9,752   9,752      1
  conduit read  2,294,452,152  10,072  10,072  2,049

Now that is clearly wrong, as the max residency should differ between the two. With my PR the problem is gone and the result looks more plausible:

IO Function

  Case              Allocated            Max    Live    GCs
  strict read   2,147,506,008  2,148,532,224   9,752      1
  conduit read  2,294,452,152      1,048,576  10,072  2,049
Benchmark weigh-bench: FINISH

PR fixing it: #39

Error when using with criterion

import Criterion.Main (defaultMain)
import Weigh (func, mainWith)

main :: IO ()
main = do
  defaultMain []
  mainWith $ func "f" ((++) "a") "b"

fails with the error:

Error in case ("f"):
  Invalid option `--case'

Usage: containers [-I|--ci CI] [-L|--time-limit SECS] [--resamples COUNT]
                  [--regress RESP:PRED..] [--raw FILE] [-o|--output FILE]
                  [--csv FILE] [--json FILE] [--junit FILE]
                  [-v|--verbosity LEVEL] [-t|--template FILE]
                  ([-m|--match MATCH] [NAME...] | [-n|--iters ITERS]
                  [-m|--match MATCH] [NAME...] | [-l|--list] | [--version])

CallStack (from HasCallStack):
  error, called at src/Weigh.hs:340:12 in weigh-0.0.11-E2ZrXVWrFMK988oLKQqdc:Weigh

Note that swapping the order (calling defaultMain after mainWith) produces the same error, while commenting out either line makes the program run without problems.

Implementation of 'fork'

Hi, I'm using Weigh as part of a system I'm building for automated testing. The fact that Weigh spawns a copy of the executable per test case (related issue #22) is causing problems for my system because it performs subsequent analysis on results from multiple test cases simultaneously (e.g., regression analysis). I've just hacked together a new version of the fork function that uses the async package:

fork :: Action  -> IO Weight
fork (Action !run !arg !name _) = do
  sync <- async $ do
    (bytes, gcs, liveBytes, maxByte) <- case run of
      Right f -> weighFunc   f arg
      Left a  -> weighAction a arg
    return Weight { weightLabel          = name
                  , weightAllocatedBytes = bytes
                  , weightGCs            = gcs
                  , weightLiveBytes      = liveBytes
                  , weightMaxBytes       = maxByte
                  }
  wait sync

I did some tests with the new code, and the results seem to be pretty much the same as the existing code, but maybe I'm missing something. So, I was just wondering if there was a particular reason why the existing implementation spawns a copy of the executable, e.g., from an RTS/memory management perspective?

Many thanks

Writing output results in parse error

Parsing the output from forked processes fails when you do output yourself, e.g.:

main = do putStrLn "Hello!"
          mainWith (do ...

will fail with

program: Prelude.read: no parse

I don't think that this necessarily needs to be fixed, but it certainly should be documented.
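
The failure mode can be reproduced in isolation. The sketch below assumes, purely for illustration, that a child process reports a pair of numbers rendered with show; the real format used by weigh may differ, and parseChild is a hypothetical name.

```haskell
import Text.Read (readMaybe)

-- | Hypothetical wire format: (allocated bytes, GC count) printed with 'show'.
parseChild :: String -> Maybe (Int, Int)
parseChild = readMaybe

main :: IO ()
main = do
  -- Clean output parses.
  print (parseChild "(16,0)")
  -- Extra text in front of the value makes the whole parse fail,
  -- which is what 'read' reports as "no parse".
  print (parseChild "Hello!\n(16,0)")
```

Using readMaybe instead of read at least turns the crash into a value that can be reported with a clearer message.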

Inconsistent result, bytes in negative

Using weigh, I'm getting inconsistent results across reruns, and also a negative number of allocated bytes.

~/g/f/g/cryto-perf $ stack exec cryto-perf-exe

Case                          Allocated  GCs
\x -> HasInt x                       16    0
\x -> HasPacked (HasInt x)           56    0
\x -> HasUnpacked (HasInt x)      -,280    0
~/g/f/g/cryto-perf $ stack exec cryto-perf-exe

Case                          Allocated  GCs
\x -> HasInt x                       16    0
\x -> HasPacked (HasInt x)           56    0
\x -> HasUnpacked (HasInt x)         16    0
~/g/f/g/cryto-perf $ stack exec cryto-perf-exe

Case                          Allocated  GCs
\x -> HasInt x                    -,280    0
\x -> HasPacked (HasInt x)        -,264    0
\x -> HasUnpacked (HasInt x)         16    0

Code used to reproduce the above error:

{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE BangPatterns #-}

module Main where

import Control.DeepSeq
import GHC.Generics
import Weigh

data HasInt =
  HasInt !Int
  deriving (Generic)

instance NFData HasInt

data HasPacked =
  HasPacked HasInt
  deriving (Generic)

instance NFData HasPacked

data HasUnpacked =
  HasUnpacked {-# UNPACK #-}!HasInt
  deriving (Generic)

instance NFData HasUnpacked

packing :: Weigh ()
packing = do
  func "\\x -> HasInt x" (\x -> HasInt x) 5
  func "\\x -> HasPacked (HasInt x)" (\x -> HasPacked (HasInt x)) 5
  func "\\x -> HasUnpacked (HasInt x)" (\x -> HasUnpacked (HasInt x)) 5

main :: IO ()
main = mainWith packing

Other details:

GHC: 8.6.3
weigh version: weigh 0.0.13

Architecture / Distribution:

$ uname -a
Linux elric 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC 2018 x86_64 x86_64
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.1 LTS
Release:        18.04
Codename:       bionic

Test failure (due to GHC 8.2.2?)

Getting this while building nightly. I'm not sure if I should remove weigh from the 8.2.2 nightly or just disable the test suite, what do you think? I also can't guarantee it's a GHC-8.2.2 specific issue, but this didn't appear until the upgrade.

> /tmp/stackage-build12/weigh-0.0.5$ dist/build/weigh-test/weigh-test

Case                             Allocated            GCs
integers count 0                        16  4,294,967,296
integers count 1                        32  4,294,967,296
integers count 2                        48  4,294,967,296
integers count 3                        64  4,294,967,296
integers count 10                      176  4,294,967,296
integers count 100                   1,616  4,294,967,296
integers count IO CAF 0                 32  4,294,967,296
integers count IO func 0                40  4,294,967,296
integers count IO CAF 1                 48  4,294,967,296
integers count IO func 1                56  4,294,967,296
ints count 1                            16  4,294,967,296
ints count 10                           16  4,294,967,296
ints count 1000000                      16  4,294,967,296
\_ -> IntegerStruct 0 0                 16  4,294,967,296
\x -> IntegerStruct x 0                 40  4,294,967,296
\x -> IntegerStruct x x                 40  4,294,967,296
\x -> IntegerStruct (x+1) x             56  4,294,967,296
\x -> IntegerStruct (x+1) (x+1)         56  4,294,967,296
\x -> IntegerStruct (x+1) (x+2)         72  4,294,967,296
\x -> HasInt x                          32  4,294,967,296
\x -> HasUnpacked (HasInt x)            32  4,294,967,296
\x -> HasPacked (HasInt x)              72  4,294,967,296

Check problems:
  ints count 1
    Allocated bytes exceeds 0: 16
  ints count 10
    Allocated bytes exceeds 0: 16
  ints count 1000000
    Allocated bytes exceeds 0: 16

Results depending on function ordering

Hi!

First of all, this is an amazing library! There is no saying how useful it is for analyzing stream fusion.

Secondly, I'm having a pretty bad problem. Consider this:

module Main where

import qualified Statistics.Matrix as M
import           Statistics.Matrix (Matrix (..))

import qualified Data.Vector.Unboxed         as U
import           Data.Vector.Unboxed         (Vector)

import qualified System.Random.MWC as Mwc

import qualified Weigh         as W 

n :: Int
n = 100

testVector :: IO (Vector Double)
testVector = do 
    gen <- Mwc.create
    Mwc.uniformVector gen (n*n)


testMatrix :: IO Matrix
testMatrix = do
    vec <- testVector
    return $ Matrix n n vec


innerProduct :: Vector Double -> Vector Double -> Double
innerProduct u v = U.sum $ U.zipWith (*) u v

colm :: Int -> Matrix -> Vector Double
colm j m = U.generate n (\i -> v `U.unsafeIndex` (j + i * n))
  where v = _vector m 
{-# INLINE colm #-}

colv :: Int -> Vector Double -> Vector Double
colv j v = U.generate n (\i -> v `U.unsafeIndex` (j + i * n))
{-# INLINE colv #-}


weight :: Vector Double -> Matrix  -> IO ()
weight v a  = do
    W.mainWith (do 
        W.func "innerProduct vector" (innerProduct (colv 0 v)) (colv 0 v)
        W.func "innerProduct matrix" (innerProduct (colm 0 a)) (colm 0 a)
        )
main :: IO ()
main = do
    v <- testVector
    a <- testMatrix

    putStrLn "---Benchmarking memory consumption---"
    weight v a

gives:

---Benchmarking memory consumption---

Case                 Allocated  GCs
innerProduct vector        864    0
innerProduct matrix         16    0

However, if we swap the "innerProduct vector" line with the "innerProduct matrix" line, we have:

---Benchmarking memory consumption---

Case                 Allocated  GCs
innerProduct matrix        864    0
innerProduct vector         16    0

So in one ordering the vector case allocates 864 bytes, and in the other it allocates 16. I'm not sure whether this is due to weigh itself or to some arcane feature of GHC, but in either case it is a tad worrying.

Build failure with GHC-8.0.1

The current HEAD of weigh doesn't build with GHC-8.0.1:

~/s/weigh (master) $ stack init --force --resolver ghc-8.0.1
~/s/weigh (master) $ stack build
...
/home/simon/src/weigh/src/Weigh/GHCStats.hs:20:18: error:
    • The constructor ‘DataD’ should have 6 arguments, but has been given 5
    • In the pattern: DataD _ _ _ [RecC _ fields] _
      In the pattern: TyConI (DataD _ _ _ [RecC _ fields] _)
      In a case alternative:
          TyConI (DataD _ _ _ [RecC _ fields] _)
            -> do { total <- fmap
                               (foldl' (+) headerSize) (mapM fieldSize fields);
                    litE (IntegerL (fromIntegral total)) }
            where
                headerSize = 8
                fieldSize :: (name, strict, Type) -> Q Int64
                fieldSize (_, _, typ)
                  = case typ of {
                      ConT typeName -> ...
                      _ -> fail
                             ("Unexpected type shape: "
                              ++
                                show typ
                                ++ ". Please report this as a bug, the codebase needs updating.") }
                knownTypes :: [(Name, Int64)]
                ....

--  While building package weigh-0.0.0 using:
      /home/simon/.stack/setup-exe-cache/x86_64-linux/setup-Simple-Cabal-1.24.0.0-ghc-8.0.1 --builddir=.stack-work/dist/x86_64-linux/Cabal-1.24.0.0 build lib:weigh --ghc-options " -ddump-hi -ddump-to-file"
    Process exited with code: ExitFailure 1

Somewhat confusing results

I have a program that looks like this:

main = do
  mainWith $ do
    action "io1" io1
    action "io2" io2
    action "io3" io3

It returns

Case      Allocated    GCs
io1   3,409,611,320  2,062
io2       9,096,608      8
io3       8,399,944      7

However, when I do this

main = do
  io1
  io2
  io2
  mainWith $ do
    action "io1" io1
    action "io2" io2
    action "io3" io3

I get this

Case      Allocated    GCs
io1   2,154,904,000  2,062
io2       6,503,280      5
io3   2,777,026,064  2,657

The latter seems more realistic (because I know that io3 requires a lot of memory).

What is going on here? I guess I am missing some important detail and don't quite understand the ideas that power the library.

Export weighFunc/weighAction

I'd like to have access to these internal functions, would it be too much trouble to export them?