We measure as slower than mtl —apparently on average?—

It’s worth noting that the eff and <code class="notra

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

According to <a href="https://github.com/haskell-effectful/effectful/tree/master/bench

Why are we slower than mtl? about fused-effects HOT 7 OPEN

robrix commented on September 27, 2024 1

Why are we slower than mtl?

from fused-effects.

Comments (7)

re-xyr commented on September 27, 2024 2

I'm recently trying to locate performance problems by analyzing generated Core code, so I'll try to write up an explanation about why fused-effects is slower than mtl when the program is not specialized. We will consider this program:

import qualified Control.Carrier.State.Strict as F
import qualified Control.Monad.Identity       as M
import qualified Control.Monad.State.Strict   as M

programFused :: F.Has (F.State Int) sig m => m Int
programFused = do
  x <- F.get @Int
  if x == 0
    then pure x
    else do
      F.put (x - 1)
      programFused
{-# NOINLINE programFused #-}

countdownFused :: Int -> (Int, Int)
countdownFused n = F.run $ F.runState n programFused

programMtl :: M.MonadState Int m => m Int
programMtl = do
  x <- M.get
  if x == 0
    then pure x
    else do
      M.put (x - 1)
      programMtl
{-# NOINLINE programMtl #-}

countdownMtl :: Int -> (Int, Int)
countdownMtl n = M.runState programMtl n

First, this is the Core for programMtl and programFused:

programMtl
  = \ (@(m :: Type -> Type))
      (apDict :: Applicative m)
      ((>>=) :: forall a b. m a -> (a -> m b) -> m b)
      ((>>) :: forall a b. m a -> m b -> m b)
      (get :: m Int)
      (put :: Int -> m ()) ->
      let {
        recurse :: m Int
        recurse = programMtl dict (>>=) (>>) get put } in
        get >>=
        \ (x :: Int) ->
           case x of wild { I# x1 ->
           case x1 of wild1 {
             __DEFAULT -> put (I# (-# wild1 1#)) >> recurse;
             0# -> pure apDict wild
           }
           }

programFused
  = \ (@(sig :: (Type -> Type) -> Type -> Type))
      (@(m :: Type -> Type))
      (memDict :: Members (State Int) sig)
      (apDict :: Applicative m)
      ((>>=) :: forall a b. m a -> (a -> m b) -> m b)
      ((>>) :: forall a b. m a -> m b -> m b)
      (alg
         :: forall (ctx :: Type -> Type) (n :: Type -> Type) a.
            Functor ctx =>
            Handler ctx n m -> sig n a -> ctx () -> m (ctx a)) ->
      let {
        recurse :: m Int
        recurse = programFused memDict apDict (>>=) (>>) alg } in
      let {
        monadDict :: Monad m
        monadDict = C:Monad apDict (>>=) (>>) } in
      let {
        algDict :: Algebra sig m
        algDict = C:Algebra monadDict alg } in
      let {
        hasDict :: (Member (State Int) sig, Algebra sig m)
        hasDict = (memDict, algDict) } in
      get hasDict >>=
        \ (x :: Int) ->
           case x of wild { I# x1 ->
           case x1 of wild1 {
             __DEFAULT ->
               put hasDict (I# (-# wild1 1#)) >> recurse;
             0# -> pure apDict wild
           }
           }

Out of the gate we can see fused-effects has some extra overhead: we need to allocate for 3 instance dicts. But obviously that's not the whole story: what happens when we instantiate the effects?

countdownMtl
  = programMtl
        apStateT
        monadStateT_>>=
        monadStateT_>>
        getStateT
        putStateT

countdownFused
  = programFused
      injStateC
      apStateC
      monadStateC_>>=
      monadStateC_>>
      algStateC

StateT and StateC are virtually the same, so are their Applicative and Monad code. So these are not the reason for the performance difference. Then, it comes down to other definitions, namely getStateT and putStateT for mtl and inj and algStateC for fused-effects. For mtl:

getStateT = \ (eta :: Int) -> (eta, eta)
putStateT = \ (s1 :: Int) _ -> ((), s1)

Looks pretty clean. What about fused-effects?

injStateC
  = \ (@(m :: Type -> Type)) (@a) (op :: State Int m a) -> L op

algStateC
  = \ (@s)
      (@(ctx :: Type -> Type))
      (@(n :: Type -> Type))
      (@a)
      (functorCtxDict :: Functor ctx)
      (handle :: Handler ctx n (StateC s Identity))
      (sig :: (:+:) (State s) (Lift Identity) n a)
      (ctx :: ctx ()) ->
      let {
        functorSCtxDict :: Functor (Compose ((,) s) ctx)
        functorSCtxDict = functorComposeDict functor(,)Dict functorCtxDict } in
      let {
        thread
          :: forall {x}.
             Compose ((,) s) ctx (n x) -> Identity (Compose ((,) s) ctx x)
        thread
          = \ (@x) (x1 :: Compose ((,) s) ctx (n x)) ->
              case x1 of { (x2, y) -> handle y x2 } } in
      (\ (s :: s) ->
         case sig of {
           L stateOp ->
             case stateOp of {
               Get co -> (s, <$ functorCtxDict s ctx);
               Put co s1 -> (s1, ctx)
             };
           R other ->
             case other of { LiftWith with ->
             (with functorSCtxDict thread ((s, ctx) ))
             }
         })

...I guess you can see the problem here. The "algebra" that fused-effects used (instead of plain mtl-style typeclasses) does a lot of wiring for you that otherwise would need to be written manually, but the problem is this mechanism isn't as efficient as mtl when the user program is not specialized.

from fused-effects.

robrix commented on September 27, 2024 1

Sure, but that doesn't address why there's a delta between mtl and us, just why there's a delta between intra- and inter-module perf.

from fused-effects.

patrickt commented on September 27, 2024 1

For sure. Just fretting about being able to actually measure and quantify these differences when it comes to real code, hah.

from fused-effects.

patrickt commented on September 27, 2024

It’s worth noting that the eff and polysemy people have found that GHC sometimes fails to fuse/optimize across module boundaries. This could be a real problem for actually getting a grip on how we perform IRL.

from fused-effects.

robrix commented on September 27, 2024

@patrickt In that case we should be seeing the same factors with mtl, no?

from fused-effects.

patrickt commented on September 27, 2024

Yes, I believe it's common to all systems, and won't be remedied until Alexis's patch for delimited continuations in GHC. Don't quote me on that though.

from fused-effects.

arybczak commented on September 27, 2024

According to a benchmark fused-effects is between 50% and 100% slower than mtl.

If you take the code and profile it, you might be able to find out why that is as I didn't really investigate.

it's common to all systems

It's not. effectful and cleff are fast, but they use completely different implementation technique. freer-simple is quite fast even though it uses free monads underneath. It's just mtl, polysemy and fused-effects that are slow.

from fused-effects.

Why are we slower than mtl? about fused-effects HOT 7 OPEN

Comments (7)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent