Hi all! In robyn_allocator when mean carryover is calculated it uses

Robyn Allocator underestimates Carryover, about facebookexperimental/robyn

Comments (23)

gufengzhou commented on June 8, 2024 2

Our choice of solution has a reason: it's important to separate immediate & carryover as component of total due to the need for more accurate calibration, see here https://medium.com/@gufengzhou/more-precision-in-mmm-experiment-calibration-with-robyn-from-meta-marketing-science-f608841fc6d4

Thus having it consistently in robyn_response as well as in the allocator is important for the interpretation of the result.

There's definitely room for improvement for better interpretable results. Will communicate our solution when ready.

from robyn.

m4x3 commented on June 8, 2024 1

Hi! I share the view from @Crypto1993 that the Carryover Spend is currently undervalued in the budget allocator.

The following example should help illustrate the point:
Assume the budget allocator is run for the last 10 days. A partner has 100 € total spend which is equally spread across the 10 days (i.e. 10 € per day). The avg. carryover spend is 3 €.

Now what happens in the budget allocator?

https://github.com/facebookexperimental/Robyn/blob/2ad462fffae4191ee7a564182235a9f0e5decfb2/R/R/allocator.R#L254C1-L273

for (i in seq_along(mediaSpendSorted)) {
    resp <- robyn_response(
      json_file = json_file,
      # robyn_object = robyn_object,
      select_build = select_build,
      select_model = select_model,
      metric_name = mediaSpendSorted[i],
      metric_value = initSpendUnit[i],
      date_range = date_range,
      dt_hyppar = OutputCollect$resultHypParam,
      dt_coef = OutputCollect$xDecompAgg,
      InputCollect = InputCollect,
      OutputCollect = OutputCollect,
      quiet = TRUE,
      is_allocator = TRUE,
      ...
    )
    # val <- sort(resp$response_total)[round(length(resp$response_total) / 2)]
    # histSpendUnit[i] <- resp$input_immediate[which(resp$response_total == val)]
    hist_carryover[[i]] <- resp$input_carryover
    ...

mean(hist_carryover[[i]]) is later used in all of the optimization functions to account for carryover adstock.
robyn_response() is run for the selected date_range (10 days) with initSpendUnit[i] (10 €) as input for metric_value.
In this case, robyn_response simulates the expected response when spending a total amount of 10 € over 10 days, which amounts to 1 € spend per day. At a spend level of 1 € per day, the carryover spend and response are obviously much lower than when spending 10 € per day.

I think of the budget allocator as simulating different spend levels and finding the optimal spend level per partner. So, if we now simulate a spend level of e.g. 12 € per day, we should in my opinion not take the avg. carryover spend from spending 1 € per day, but rather the one from spending 10 € per day. And even that I don't find ideal, because the carryover spend would differ if you spend 50 % more or less than in the previous period.

In our case, we mainly run always-on marketing campaigns. We therefore now try to calculate/simulate the actual avg. carryover spend at a continuous spend level of X € which would be X € * (theta / (1-theta)) and use that logic across the whole budget allocator including the optimization functions. I would be interested in your thoughts on that @gufengzhou
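The convergence value X € * (theta / (1-theta)) can be checked with a small simulation. This is only a sketch under the assumption of a plain geometric adstock (adstocked_t = spend_t + theta * adstocked_{t-1}); it does not call any Robyn function. The helper name simulate_carryover and the value theta = 3/13 are hypothetical, the latter chosen so that 10 € per day gives the 3 € average carryover from the example above.

```r
# Sketch: carryover under geometric adstock with constant spend.
# Assumption: adstocked_t = spend_t + theta * adstocked_{t-1};
# carryover_t is the part inherited from earlier periods.
simulate_carryover <- function(spend, theta) {
  carryover <- numeric(length(spend))
  prev_adstocked <- 0
  for (t in seq_along(spend)) {
    carryover[t] <- theta * prev_adstocked
    prev_adstocked <- spend[t] + carryover[t]
  }
  carryover
}

theta <- 3 / 13   # hypothetical decay rate, gives theta / (1 - theta) = 0.3
x <- 10           # constant spend per period (10 € per day)

carry <- simulate_carryover(rep(x, 200), theta)
steady_state <- x * theta / (1 - theta)

# Both values should agree once the series has converged
print(c(simulated = tail(carry, 1), closed_form = steady_state))
```

The closed form follows from the fixed point c = theta * (x + c), so it only holds for spend that is roughly constant over many periods, which is the always-on case described here.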

from robyn.

m4x3 commented on June 8, 2024 1

@Crypto1993 I suppose you also changed the robyn_response() function because the inflation_total is not in the normal function output.

I just want to add how we changed it.:

fx_objective <- function(x, coeff, alpha, theta, inflexion, x_hist_carryover, get_sum = TRUE) {

  # xAdstocked <- x + mean(x_hist_carryover)
  
  xAdstocked <- x + x * (theta / (1-theta)) + 0.01
  
  # Hill transformation
  if (get_sum) {
    xOut <- coeff * sum((1 + inflexion**alpha / xAdstocked**alpha)**-1)
  } else {
    xOut <- coeff * ((1 + inflexion**alpha / xAdstocked**alpha)**-1)
  }
  return(xOut)
}

x * (theta / (1-theta)) is the convergence carryover spend that you would have every day, if you spend x € for a larger number of days. This way the value now only depends on x and theta and not on the historic data. (I added 0.01 because otherwise the optimization wouldn't start for channels with a lower constraint of 0 € spend.) I had to adjust the code in multiple places to also have the theta values available everywhere.

I guess it's quite similar to using the inflation_rate as suggested by @Crypto1993

from robyn.

gufengzhou commented on June 8, 2024 1

inflation_total[[i]] <- resp$inflation_total

You changed the response function too? Well, our implementation takes the historical carryover as a base value to add to any desired immediate spend as the adstocked total, while you want to use a multiplier. The intention is the same. There shouldn't be any difference if implemented correctly.

from robyn.

Crypto1993 commented on June 8, 2024 1

Yes, inflation_total is calculated in adstock_transform, and I've added it to the output of robyn_response. In your solution, did you also change fx_gradient? Thanks for sharing your solution! It's great that we independently found similar solutions! @m4x3

from robyn.

gufengzhou commented on June 8, 2024 1

Hey @m4x3, now I got it and you're right! Thank you very much for pointing it out! The "good news" is that the object resp is "only" used to simulate the average carryover within the date_range, and the simulated response is done using the fx_objective(). So the damage was limited, but still it was a bug! Thank you again. Just pushed a fix.

from robyn.

gufengzhou commented on June 8, 2024

Hi, thanks for your thoughts. We had a different view on this: we believe it's about having the "latest" carryover in the budget allocation, which should better reflect actionability in business. Having some huge historical carryover from a year ago doesn't feel right.

Besides that, technically speaking, the saturation curve always reflects diminishing behaviour on the unit of your data's time grain (weekly, daily or monthly, whatever you provide). That is why I need to use a mean spend, whereas histSpendWindow is the sum of spend. Does it make sense?

from robyn.

Crypto1993 commented on June 8, 2024

Hi,
Using histSpendWindow to calculate the average carryover in the time window used to reallocate budget better reflects where you are on the saturation curve, since it takes into account the frequency of investment in that timeframe. If you use just initSpendUnit, you are assuming a single investment of the average amount for the whole timeframe. If you plot saturation curves from the Robyn output CSV with AdstockedMedia or AdstockedSpend against their respective response, you can see that on average you are higher on the saturation curve for the same period the allocator is trying to optimize. This leads to a lower initial spend point in the allocator's one-pager curves.

For small time windows this is not a clear problem, but if you try to evaluate how much "space" for optimization there is in one's spend, you are overestimating channel effectiveness, since you have reduced the carryover.

One possible solution is to use histSpendWindow to reflect how much carryover you have on average every week, assuming you spent the average amount initSpendWindow in every period. This works for always-on channels or very frequent spend; for infrequent spend it's hard to apply, since you get a large distortion due to Jensen's inequality. This solution also has the problem of keeping the amount of carryover fixed in the budget allocator.
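The Jensen's inequality distortion for infrequent spend can be illustrated with a toy Hill curve. This is a sketch with hypothetical parameters (alpha = 0.8, inflexion = 20) chosen so that the curve is concave; for alpha above 1 the Hill curve is S-shaped and the direction of the gap can differ. The helper name hill is illustrative and not part of Robyn.

```r
# Sketch: response at the average spend vs. average of per-period responses.
# Hill saturation written as x^alpha / (x^alpha + inflexion^alpha),
# which is well defined at x = 0.
hill <- function(x, alpha, inflexion) {
  x^alpha / (x^alpha + inflexion^alpha)
}

alpha <- 0.8      # hypothetical shape, concave for alpha <= 1
inflexion <- 20   # hypothetical inflexion point

always_on <- rep(10, 10)        # 10 € in every period
sparse <- c(100, rep(0, 9))     # same total, spent in a single period

# Always-on: evaluating at the mean spend is exact
resp_always_on <- mean(hill(always_on, alpha, inflexion))
resp_at_mean <- hill(mean(sparse), alpha, inflexion)

# Sparse: the true average response is lower than the response at the mean
resp_sparse <- mean(hill(sparse, alpha, inflexion))

print(c(always_on = resp_always_on, at_mean = resp_at_mean, sparse = resp_sparse))
```

Under these assumptions, using the average spend as a single representative level overstates the response of the sparse channel, while it is exact for the always-on channel.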

A possibly better solution may be to use inflation_total as a multiplier of spend instead of the average carryover. In this case the carryover becomes dynamic with the solution, but you may have to adjust the gradient function to accomplish full long-term optimization.

I agree with you that the allocator is currently a short-term optimization and works fine for that use case, but even then it would be better to use histSpendWindow to calculate the average carryover, since the results won't be very different.
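To make the difference between a fixed average carryover and an inflation multiplier concrete, here is a minimal sketch. The numbers (10 € historical spend per period, 3 € average carryover) are taken from the example earlier in this thread; the function names are hypothetical and this is not Robyn's implementation.

```r
# Sketch: two ways of turning immediate spend into adstocked spend.
hist_spend <- 10       # historical spend per period
hist_carryover <- 3    # average historical carryover per period

# Ratio between adstocked spend and immediate spend at the historical level
inflation <- (hist_spend + hist_carryover) / hist_spend

# Fixed carryover: the same base amount is added at every spend level
adstock_additive <- function(x) x + hist_carryover

# Multiplier: the carryover scales with the simulated spend level
adstock_multiplier <- function(x) x * inflation

levels <- c(5, 10, 20)
print(data.frame(
  spend = levels,
  additive = adstock_additive(levels),
  multiplier = adstock_multiplier(levels)
))
```

Both variants agree at the historical spend level and diverge as the simulated spend moves away from it, which is the sense in which the multiplier makes carryover dynamic.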

from robyn.

gufengzhou commented on June 8, 2024

Currently initSpendUnit = histSpendWindow / nr of periods, which is within the date_range. We're aware of the always-on vs. sparse media issue. We came to the conclusion to rather reflect your recent spend level. If channel A has 0 spend in let's say the last 4 periods, we recommend to extend date_range to a period where this channel was on. Alternatively, you can also define specific dates to explicitly cover it when channel A was on.

from robyn.

Crypto1993 commented on June 8, 2024

Hi! We modified the budget allocator for the same reasons (we had a lot of always-on channels). You can find our solution in my repo Robyn-exposure in the branch called "allocator-experiment".
We modified the gradient and the evaluation function to take into account the inflation_factor (the ratio between adstocked media and media) for every unit spent.

from robyn.

gufengzhou commented on June 8, 2024

In this case, robyn_response simulates the expected response when spending a total amount of 10 € over 10 days which would amount to 1 € spend per day.

Have you tested that 10€ for 10 days is really the case? If yes, then it's definitely a bug. Please let me know and also please make sure you're using the latest version. Thanks! @m4x3

from robyn.

gufengzhou commented on June 8, 2024

We modified the gradient and the evaluation function to take into account the inflation_factor (the ratio between adstocked media and media) for every unit spent.

If this issue is true then it's a bug to be fixed. A while ago Robyn's allocator didn't account for adstock. But as shown above adstock is included already. If you're still using the inflation factor in the gradient you'll be "doubling" the adstocking. @Crypto1993

from robyn.

Crypto1993 commented on June 8, 2024

hi @gufengzhou

I've modified the fx_objective this way:

fx_objective <- function(x, coeff, alpha, inflexion, x_hist_carryover, get_sum = TRUE, mm_lm_coefs = NULL) {
  # Apply Michaelis Menten model to scale spend to exposure
  
  xScaled <- x * mm_lm_coefs

  # Adstock scales
  xAdstocked <- xScaled * x_hist_carryover  # + mean(x_hist_carryover)
  # Hill transformation
  if (get_sum) {
    xOut <- coeff * sum((1 + inflexion**alpha / xAdstocked**alpha)**-1)
  } else {
    xOut <- coeff * ((1 + inflexion**alpha / xAdstocked**alpha)**-1)
  }
  return(xOut)
}

and the gradient in this way:

fx_gradient <- function(x, coeff, alpha, inflexion, x_hist_carryover,
                         mm_lm_coefs = NULL
) {
  # Apply Michaelis Menten model to scale spend to exposure

  xScaled <- x * mm_lm_coefs

  # Adstock scales
  xAdstocked <- xScaled * x_hist_carryover  # + mean(x_hist_carryover)
  xOut <- -coeff * mm_lm_coefs * x_hist_carryover * sum((alpha * (inflexion**alpha) * (xAdstocked**(alpha - 1))) / (xAdstocked**alpha + inflexion**alpha)**2)
  return(xOut)
}

Here the inflation_factor is passed in as the variable named "x_hist_carryover" (I've changed what it holds in the code above).
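When the objective and gradient are changed together like this, a finite-difference check is a quick way to confirm they still match. The sketch below re-implements simplified scalar versions of the two functions above (the multiplier is called inflation here, and all parameter values are hypothetical); it is not the code from the repo. Note that the gradient carries a minus sign, so it is compared against the negative of the numerical slope of the response.

```r
# Simplified scalar versions of the modified objective and gradient.
# Assumption: adstocked exposure = spend * mm_lm_coefs * inflation.
fx_objective <- function(x, coeff, alpha, inflexion, inflation, mm_lm_coefs = 1) {
  x_adstocked <- x * mm_lm_coefs * inflation
  coeff / (1 + inflexion^alpha / x_adstocked^alpha)
}

fx_gradient <- function(x, coeff, alpha, inflexion, inflation, mm_lm_coefs = 1) {
  x_adstocked <- x * mm_lm_coefs * inflation
  # Chain rule: d(x_adstocked)/dx = mm_lm_coefs * inflation
  -coeff * mm_lm_coefs * inflation *
    (alpha * inflexion^alpha * x_adstocked^(alpha - 1)) /
    (x_adstocked^alpha + inflexion^alpha)^2
}

# Hypothetical parameter values for the check
args <- list(coeff = 1000, alpha = 1.5, inflexion = 5000, inflation = 1.3)
x0 <- 4000
h <- 1e-2

# Central finite difference of the response
numeric_grad <- (do.call(fx_objective, c(list(x = x0 + h), args)) -
                 do.call(fx_objective, c(list(x = x0 - h), args))) / (2 * h)
analytic_grad <- -do.call(fx_gradient, c(list(x = x0), args))

print(c(numeric = numeric_grad, analytic = analytic_grad))
```

If the two numbers disagree after a change to the adstock logic, the gradient most likely misses a factor from the chain rule.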

from robyn.

gufengzhou commented on June 8, 2024

Are you using exposure modelling? This is what we advised against, especially in the budget allocator. But you probably know what you're doing :)

x_hist_carryover is not a multiplier, it's a vector of historical carryover values. If you haven't changed the calculation of x_hist_carryover for your purpose, I'm afraid it doesn't work this way.

from robyn.

Crypto1993 commented on June 8, 2024

Yes, I've changed the historical carryover in this section of the code:

# Response values based on date range -> mean spend
  initResponseUnit <- NULL
  initResponseMargUnit <- NULL
  hist_carryover <- list()
  inflation_total <- list()

  for (i in seq_along(mediaVarsSorted)) {
    resp <- robyn_response(
      json_file = json_file,
      robyn_object = robyn_object,
      select_build = select_build,
      select_model = select_model,
      metric_name = mediaVarsSorted[i],
      metric_value = histMediaVarsWindow[i], 
      date_range = date_range,
      dt_hyppar = OutputCollect$resultHypParam,
      dt_coef = OutputCollect$xDecompAgg,
      InputCollect = InputCollect,
      OutputCollect = OutputCollect,
      quiet = TRUE,
      is_allocator = TRUE,
      ...
    )
    # val <- sort(resp$response_total)[round(length(resp$response_total) / 2)]
    # histSpendUnit[i] <- resp$input_immediate[which(resp$response_total == val)]
    
    hist_carryover[[i]] <- resp$input_carryover
    inflation_total[[i]] <- resp$inflation_total
    # get simulated response
    resp_simulate <- fx_objective(
      x = initSpendUnit[i],
      coeff = coefs_sorted[[mediaVarsSorted[i]]],
      alpha = alphas[[paste0(mediaVarsSorted[i], "_alphas")]],
      inflexion = inflexions[[paste0(mediaVarsSorted[i], "_gammas")]],
      x_hist_carryover = resp$inflation_total,
      mm_lm_coefs = mm_lm_coefs[i],
      get_sum = FALSE
    )
    resp_simulate_plus1 <- fx_objective(
      x = initSpendUnit[i] + 1,
      coeff = coefs_sorted[[mediaVarsSorted[i]]],
      alpha = alphas[[paste0(mediaVarsSorted[i], "_alphas")]],
      inflexion = inflexions[[paste0(mediaVarsSorted[i], "_gammas")]],
      x_hist_carryover = resp$inflation_total,
      mm_lm_coefs = mm_lm_coefs[i],
      get_sum = FALSE
    )
    names(hist_carryover[[i]]) <- resp$date
    initResponseUnit <- c(initResponseUnit, resp_simulate)
    initResponseMargUnit <- c(initResponseMargUnit, resp_simulate_plus1 - resp_simulate)
  }

  names(initResponseUnit) <- names(hist_carryover) <-  names(inflation_total) <- mediaVarsSorted
  if (length(zero_spend_channel) > 0 && !quiet) {
    message("Media variables with 0 spending during date range: ", v2t(zero_spend_channel))
    # hist_carryover[zero_spend_channel] <- 0
  }

Now it's the inflation total!

from robyn.

m4x3 commented on June 8, 2024

So, I just ran the Robyn default Script Robyn_facebook.r.

I selected model id: select_model <- "1_75_15"

I then ran this:

Spend3 <- 10
Response3 <- robyn_response(
  InputCollect = InputCollect,
  OutputCollect = OutputCollect,
  select_model = select_model,
  metric_name = "facebook_S",
  metric_value = Spend3, # total budget for date_range
  date_range = "last_10" # last 10 periods
)

This shows:
10 € spend over 10 days results in input_immediate of 1 € per day.

input_carryover is still mainly affected by the spend from before the selected 10-day period, but it converges towards a very low level driven by the low input_immediate values.

Here is my version:

from robyn.

gufengzhou commented on June 8, 2024

This shows:
10 € spend over 10 days results in input_immediate of 1 € per day.

If you're using the response function separately, you should input 100$ as total budget for the date range period, as the function describes.

The integration in the allocator will prescale the 100$ to the right level before running it.

from robyn.

Crypto1993 commented on June 8, 2024

@gufengzhou The difference shows when you have wider period ranges (we used last_52). With this implementation you get less dramatic differences when optimizing (fewer corner solutions).

from robyn.

m4x3 commented on June 8, 2024

If you're using the response function separately, you should input 100$ as total budget for the date range period, as the function describes.

The integration in the allocator will prescale the 100$ to the right level before running it.

@gufengzhou So, I just ran the budget_allocator in debug mode (maximize response for last 10 weeks) and set a browser() just before robyn_response() is called in the budget_allocator:

I then ran the robyn_response() part for i <- 1, which corresponds to facebook_S.

The total spend for the last 10 weeks is: 98195.66 €

initSpendUnit[1] is equal to 9819.566 €

The resp output looks like this:

Again, the input_immediate is now 981.95 € per week, just like in the example with 100 €, 10 € and 1 €. I don't see that the value is scaled to the right level anywhere. Am I missing something?

from robyn.

gufengzhou commented on June 8, 2024

initSpendUnit[1] is equal to 9819.566 €

Thanks for the check. I'll check later and fix it if it's a bug.

from robyn.

gufengzhou commented on June 8, 2024

@m4x3 hey, I just looked into it and it looks fine. initSpendUnit[1] is only the initial spend for 1 channel; if you do sum(initSpendUnit), you'll get the weekly total for all channels.

total spend for the last 10 weeks is: 98195.66 €

Also for this, the 98k is for all channels for 10 weeks. When looking at resp you only look at 1 media var at a time

from robyn.

gufengzhou commented on June 8, 2024

Feel free to reopen if there're more questions.

from robyn.

m4x3 commented on June 8, 2024

Hi @gufengzhou ! Thanks for checking this again.

Maybe my wording was not 100% precise: I meant that the total spend for facebook_S amounts to 98195.66 € for the last 10 weeks. --> see screenshot:

I am therefore still not convinced that this is not a bug:
In my example, initSpendUnit[1] is the initial spend for facebook_S and it amounts to 9819.566 €. This makes sense, it is the average spend for facebook_S across the last 10 weeks.
However, now you are calling robyn_response() with this argument.
robyn_response() now simulates the response and the avg. carryover effect from spending a total of 9819.566 € over 10 weeks (because there is also a date_range argument in robyn_response()). But this is not what happened in the last 10 weeks for facebook_S. In reality you spent 10 times that amount.
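The effect can be reproduced outside Robyn with a toy geometric adstock. This is only a sketch: the helper names, the 50 periods of history, and theta = 3/13 are hypothetical, and it uses the small 10 € per period example from earlier in the thread rather than the facebook_S figures. It compares the mean carryover inside a 10-period window when the window is simulated at the actual spend level versus at one tenth of it.

```r
# Sketch: mean carryover in the allocation window under geometric adstock.
geometric_adstock <- function(spend, theta) {
  adstocked <- numeric(length(spend))
  adstocked[1] <- spend[1]
  for (t in seq_along(spend)[-1]) {
    adstocked[t] <- spend[t] + theta * adstocked[t - 1]
  }
  adstocked
}

mean_window_carryover <- function(history, window, theta) {
  spend <- c(history, window)
  adstocked <- geometric_adstock(spend, theta)
  idx <- length(history) + seq_along(window)
  # Carryover is the adstocked amount minus the immediate spend
  mean(adstocked[idx] - spend[idx])
}

theta <- 3 / 13          # hypothetical decay, 10 € per period gives 3 € carryover
history <- rep(10, 50)   # spend before the window, at the actual level

# Window simulated at the actual level: 100 € over 10 periods
carry_actual <- mean_window_carryover(history, rep(10, 10), theta)

# Window simulated with the per-period mean passed as the total: 10 € over 10 periods
carry_scaled_down <- mean_window_carryover(history, rep(1, 10), theta)

print(c(actual = carry_actual, scaled_down = carry_scaled_down))
```

In this toy setup the scaled-down window still inherits some carryover from the history, but its mean is clearly below the one at the actual spend level, matching the pattern seen in input_carryover above.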

from robyn.

Robyn Allocator underestimates Carryover about robyn HOT 23 CLOSED
