Comments (23)
We chose this solution for a reason: separating immediate and carryover as components of the total is important because it allows more accurate calibration, see https://medium.com/@gufengzhou/more-precision-in-mmm-experiment-calibration-with-robyn-from-meta-marketing-science-f608841fc6d4
Having it consistently in robyn_response as well as in the allocator is therefore important for interpreting the result.
There's definitely room for improvement towards more interpretable results. We'll communicate our solution when ready.
from robyn.
Hi! I share the view from @Crypto1993 that the Carryover Spend is currently undervalued in the budget allocator.
The following example should help illustrate the point:
Assume the budget allocator is run for the last 10 days. A partner has 100 € total spend which is equally spread across the 10 days (i.e. 10 € per day). The avg. carryover spend is 3 €.
Now what happens in the budget allocator?
for (i in seq_along(mediaSpendSorted)) {
  resp <- robyn_response(
    json_file = json_file,
    # robyn_object = robyn_object,
    select_build = select_build,
    select_model = select_model,
    metric_name = mediaSpendSorted[i],
    metric_value = initSpendUnit[i],
    date_range = date_range,
    dt_hyppar = OutputCollect$resultHypParam,
    dt_coef = OutputCollect$xDecompAgg,
    InputCollect = InputCollect,
    OutputCollect = OutputCollect,
    quiet = TRUE,
    is_allocator = TRUE,
    ...
  )
  # val <- sort(resp$response_total)[round(length(resp$response_total) / 2)]
  # histSpendUnit[i] <- resp$input_immediate[which(resp$response_total == val)]
  hist_carryover[[i]] <- resp$input_carryover
  ...
mean(hist_carryover[[i]]) is later used in all of the optimization functions to account for carryover adstock.
robyn_response() is run for the selected date_range (10 days) with initSpendUnit[i] (10 €) as input for the metric_value.
In this case, robyn_response simulates the expected response when spending a total amount of 10 € over 10 days, which would amount to 1 € spend per day. At a spend level of 1 € per day, the carryover spend & response is obviously much lower than when spending 10 € per day.
I think of the budget allocator as simulating different spend levels and finding the optimal spend level per partner. So, if we now simulate a spend level of e.g. 12 € per day, we should in my opinion not take the avg. carryover spend from spending 1 € per day, but rather the one from spending 10 € per day. And even that I don't find ideal, because the carryover spend would differ if you spend 50 % more or less than in the previous period.
In our case, we mainly run always-on marketing campaigns. We therefore now try to calculate/simulate the actual avg. carryover spend at a continuous spend level of X €, which would be X € * (theta / (1-theta)), and use that logic across the whole budget allocator including the optimization functions. I would be interested in your thoughts on that @gufengzhou
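To make the example above concrete, here is a minimal sketch in plain R (it does not call Robyn). It assumes a geometric adstock where each period's carryover is theta times the previous period's adstocked spend. The helper names geometric_carryover and steady_state are hypothetical, and theta = 3/13 is chosen only so that the steady-state carryover at 10 € per day equals the 3 € used in the example.

```r
# Geometric adstock (assumed): carryover_t = theta * (spend_{t-1} + carryover_{t-1})
geometric_carryover <- function(spend, theta) {
  carry <- numeric(length(spend))
  for (t in seq_along(spend)[-1]) {
    carry[t] <- theta * (spend[t - 1] + carry[t - 1])
  }
  carry
}

# Closed-form limit of the recursion above for a constant spend x
steady_state <- function(x, theta) x * theta / (1 - theta)

theta <- 3 / 13   # hypothetical decay rate, not taken from any model
n_days <- 60      # long enough for the carryover to converge

carry_10 <- geometric_carryover(rep(10, n_days), theta)  # 10 EUR per day
carry_1 <- geometric_carryover(rep(1, n_days), theta)    # 1 EUR per day

# Simulated carryover on the last day next to the closed-form value
print(c(simulated = carry_10[n_days], closed_form = steady_state(10, theta)))
print(c(simulated = carry_1[n_days], closed_form = steady_state(1, theta)))
```

Under these assumptions the carryover scales linearly with the spend level, so a carryover simulated at 1 € per day is ten times smaller than the one that belongs to 10 € per day.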
from robyn.
@Crypto1993 I suppose you also changed the robyn_response() function, because inflation_total is not in the normal function output.
I just want to add how we changed it:
fx_objective <- function(x, coeff, alpha, theta, inflexion, x_hist_carryover, get_sum = TRUE) {
  # xAdstocked <- x + mean(x_hist_carryover)
  xAdstocked <- x + x * (theta / (1-theta)) + 0.01
  # Hill transformation
  if (get_sum) {
    xOut <- coeff * sum((1 + inflexion**alpha / xAdstocked**alpha)**-1)
  } else {
    xOut <- coeff * ((1 + inflexion**alpha / xAdstocked**alpha)**-1)
  }
  return(xOut)
}
x * (theta / (1-theta)) is the convergence carryover spend that you would have every day if you spend x € for a larger number of days. This way the value now only depends on x and theta and not on the historic data. (I added 0.01 because otherwise the optimization wouldn't start for channels with a lower constraint of 0 € spend.) I had to adjust the code in multiple places to also have the theta values available everywhere.
I guess it's quite similar to using the inflation_rate as suggested by @Crypto1993.
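Once the carryover depends on x, the derivative of the objective changes too. The following is only a sketch of what a matching gradient could look like, not Robyn's fx_gradient. The names hill_steady and hill_steady_gradient and all parameter values are hypothetical, and the 0.01 offset is left out for clarity. It uses x + x * theta / (1 - theta) = x / (1 - theta) and checks the analytic derivative against a central finite difference.

```r
# Response at a steady-state adstocked spend of x / (1 - theta)
hill_steady <- function(x, coeff, alpha, theta, inflexion) {
  x_adstocked <- x / (1 - theta)
  coeff * x_adstocked^alpha / (x_adstocked^alpha + inflexion^alpha)
}

# Derivative with respect to x; the chain rule contributes the factor 1 / (1 - theta)
hill_steady_gradient <- function(x, coeff, alpha, theta, inflexion) {
  x_adstocked <- x / (1 - theta)
  d_hill <- alpha * inflexion^alpha * x_adstocked^(alpha - 1) /
    (x_adstocked^alpha + inflexion^alpha)^2
  coeff * d_hill / (1 - theta)
}

# Hypothetical parameter values, for illustration only
pars <- list(coeff = 1000, alpha = 2, theta = 0.3, inflexion = 50)
x0 <- 20
h <- 1e-5

analytic <- do.call(hill_steady_gradient, c(list(x = x0), pars))
numeric_fd <- (do.call(hill_steady, c(list(x = x0 + h), pars)) -
  do.call(hill_steady, c(list(x = x0 - h), pars))) / (2 * h)

print(c(analytic = analytic, finite_difference = numeric_fd))
```

An optimizer that minimizes the negative response would use the negated value of this derivative.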
from robyn.
inflation_total[[i]] <- resp$inflation_total
You changed the response function too? Well, our implementation takes historical carryover as a base value to add to any desired immediate spend as the adstocked total, while you want to use a multiplier. The intention is the same. There shouldn't be any difference if implemented correctly.
from robyn.
Yes, inflation_total was calculated in adstock_transform; I've added it to the output of robyn_response. In your solution, did you also change fx_gradient? Thanks for sharing your solution! It's great that we independently found similar solutions! @m4x3
from robyn.
Hey @m4x3, now I got it and you're right! Thank you very much for pointing it out! The "good news" is that the object resp is "only" used to simulate the average carryover within the date_range, and the simulated response is done using fx_objective(). So the damage was limited, but still it was a bug! Thank you again. Just pushed a fix.
from robyn.
Hi, thanks for your thoughts. We had a different view on this, namely we believe it's about having the "latest" carryover in the budget allocation, which should better reflect actionability in business. Having some huge historical carryover from a year ago doesn't feel right.
Besides that, technically speaking, the saturation curve always reflects diminishing behaviour on the unit of your data's time grain (weekly, daily or monthly, whatever you provide). I need to use a mean spend. histSpendWindow is the sum of spend. Does it make sense?
from robyn.
Hi,
Using HistSpendWindow to calculate average carryover in the time window used to reallocate budget better reflects where you are on the saturation curve, since it takes into account the frequency of investment in that timeframe. If you use just initSpendUnit, you are assuming a single investment of the average amount for that whole timeframe. If you plot saturation curves from the Robyn output csv with AdstockedMedia or AdstockedSpend against their relative response, you can see that on average you are higher on the saturation curve for the same period the allocator is trying to optimize; this leads to a lower point of initial spend in the allocator's one-pager curves.
For small time windows this is not a clear problem, but if you try to evaluate how much "space" for optimization there is in one's spend, you are overestimating channel effectiveness since you reduced carryover.
One possible solution is to use HistSpendWindow to reflect how much carryover you have on average every week, assuming you spent the average amount initSpendWindow in every period. This works for always-on channels or very frequent spend; for infrequent spend it's hard to apply, since you have a large distortion due to Jensen's inequality. This solution also has the problem of keeping the amount of carryover fixed in the budget allocator.
A possibly better solution may be to use inflation_total as a multiplier of spend instead of average carryover. In this case the carryover becomes dynamic with the solution, but you may have to adjust the gradient function to accomplish full long-term optimization.
I agree with you that the allocator is currently a short-term optimization and works fine for this use case, but even then it would be better to use HistSpendWindow to calculate the average carryover, since the results won't be very different.
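The Jensen's inequality point can be illustrated with a small sketch in plain R. It assumes a Hill curve with alpha = 1, which is concave everywhere; the function name hill, the inflexion value and the spend vectors are hypothetical and not taken from any model.

```r
# Hill saturation curve; alpha = 1 keeps it concave over the whole range
hill <- function(x, alpha = 1, inflexion = 50) {
  x^alpha / (x^alpha + inflexion^alpha)
}

sparse_spend <- c(100, 0, 0, 0)   # one burst in four periods (hypothetical)
always_on_spend <- rep(25, 4)     # same total, spread evenly

# Sparse channel: averaging the spend first overstates the average response
mean_of_responses <- mean(hill(sparse_spend))
response_at_mean <- hill(mean(sparse_spend))
print(c(mean_of_responses = mean_of_responses, response_at_mean = response_at_mean))

# Always-on channel: both quantities coincide
print(all.equal(mean(hill(always_on_spend)), hill(mean(always_on_spend))))
```

For alpha > 1 the Hill curve is S-shaped, so in its convex lower part the direction of the distortion can flip; the sketch only covers the concave case.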
from robyn.
Currently initSpendUnit = histSpendWindow / nr of periods, which is within the date_range. We're aware of the always-on vs. sparse media issue. We came to the conclusion to rather reflect your recent spend level. If channel A has 0 spend in, let's say, the last 4 periods, we recommend extending date_range to a period where this channel was on. Alternatively, you can also define specific dates to explicitly cover the time when channel A was on.
from robyn.
Hi! We modified the budget allocator for the same reasons (we had a lot of always on channels). You can find our solution in my repo Robyn-exposure in the branch called "allocator-experiment".
We modified the gradient and the evaluation function to take into account the inflation_factor (the ratio between adstocked media and media) for every unit spent.
from robyn.
In this case, robyn_response simulates the expected response when spending a total amount of 10 € over 10 days which would amount to 1 € spend per day.
Have you tested that 10€ for 10 days is really the case? If yes, then it's definitely a bug. Please let me know and also please make sure you're using the latest version. Thanks! @m4x3
from robyn.
We modified the gradient and the evaluation function to take into account the inflation_factor (the ratio between adstocked media and media) for every unit spent.
If this issue is true then it's a bug to be fixed. A while ago Robyn's allocator didn't account for adstock. But as shown above adstock is included already. If you're still using the inflation factor in the gradient you'll be "doubling" the adstocking. @Crypto1993
from robyn.
hi @gufengzhou
I've modified the fx_objective this way:
fx_objective <- function(x, coeff, alpha, inflexion, x_hist_carryover, get_sum = TRUE, mm_lm_coefs = NULL) {
  # Apply Michaelis Menten model to scale spend to exposure
  xScaled <- x * mm_lm_coefs
  # Adstock scales
  xAdstocked <- xScaled * x_hist_carryover # + mean(x_hist_carryover)
  # Hill transformation
  if (get_sum) {
    xOut <- coeff * sum((1 + inflexion**alpha / xAdstocked**alpha)**-1)
  } else {
    xOut <- coeff * ((1 + inflexion**alpha / xAdstocked**alpha)**-1)
  }
  return(xOut)
}
and the gradient in this way:
fx_gradient <- function(x, coeff, alpha, inflexion, x_hist_carryover,
                        mm_lm_coefs = NULL) {
  # Apply Michaelis Menten model to scale spend to exposure
  xScaled <- x * mm_lm_coefs
  # Adstock scales
  xAdstocked <- xScaled * x_hist_carryover # + mean(x_hist_carryover)
  xOut <- -coeff * mm_lm_coefs * x_hist_carryover * sum((alpha * (inflexion**alpha) * (xAdstocked**(alpha - 1))) / (xAdstocked**alpha + inflexion**alpha)**2)
  return(xOut)
}
where the inflation_factor is the variable that you see as "x_hist_carryover" (I've changed it in the code above).
from robyn.
Are you using exposure modelling? This is what we advised against esp in the budget allocator. But you probably know what you're doing:)
x_hist_carryover is not a multiplier, it's a vector of historical carryover values. If you haven't changed the calculation of x_hist_carryover for your purpose, I'm afraid it doesn't work this way
from robyn.
Yes, I've changed the historical carryover in this section of code:
# Response values based on date range -> mean spend
initResponseUnit <- NULL
initResponseMargUnit <- NULL
hist_carryover <- list()
inflation_total <- list()
for (i in seq_along(mediaVarsSorted)) {
  resp <- robyn_response(
    json_file = json_file,
    robyn_object = robyn_object,
    select_build = select_build,
    select_model = select_model,
    metric_name = mediaVarsSorted[i],
    metric_value = histMediaVarsWindow[i],
    date_range = date_range,
    dt_hyppar = OutputCollect$resultHypParam,
    dt_coef = OutputCollect$xDecompAgg,
    InputCollect = InputCollect,
    OutputCollect = OutputCollect,
    quiet = TRUE,
    is_allocator = TRUE,
    ...
  )
  # val <- sort(resp$response_total)[round(length(resp$response_total) / 2)]
  # histSpendUnit[i] <- resp$input_immediate[which(resp$response_total == val)]
  hist_carryover[[i]] <- resp$input_carryover
  inflation_total[[i]] <- resp$inflation_total
  # get simulated response
  resp_simulate <- fx_objective(
    x = initSpendUnit[i],
    coeff = coefs_sorted[[mediaVarsSorted[i]]],
    alpha = alphas[[paste0(mediaVarsSorted[i], "_alphas")]],
    inflexion = inflexions[[paste0(mediaVarsSorted[i], "_gammas")]],
    x_hist_carryover = resp$inflation_total,
    mm_lm_coefs = mm_lm_coefs[i],
    get_sum = FALSE
  )
  resp_simulate_plus1 <- fx_objective(
    x = initSpendUnit[i] + 1,
    coeff = coefs_sorted[[mediaVarsSorted[i]]],
    alpha = alphas[[paste0(mediaVarsSorted[i], "_alphas")]],
    inflexion = inflexions[[paste0(mediaVarsSorted[i], "_gammas")]],
    x_hist_carryover = resp$inflation_total,
    mm_lm_coefs = mm_lm_coefs[i],
    get_sum = FALSE
  )
  names(hist_carryover[[i]]) <- resp$date
  initResponseUnit <- c(initResponseUnit, resp_simulate)
  initResponseMargUnit <- c(initResponseMargUnit, resp_simulate_plus1 - resp_simulate)
}
names(initResponseUnit) <- names(hist_carryover) <- names(inflation_total) <- mediaVarsSorted
if (length(zero_spend_channel) > 0 && !quiet) {
  message("Media variables with 0 spending during date range: ", v2t(zero_spend_channel))
  # hist_carryover[zero_spend_channel] <- 0
}
Now it's the inflation total!
from robyn.
So, I just ran the Robyn default script Robyn_facebook.r.
I selected model id: select_model <- "1_75_15"
I then ran this:
Spend3 <- 10
Response3 <- robyn_response(
  InputCollect = InputCollect,
  OutputCollect = OutputCollect,
  select_model = select_model,
  metric_name = "facebook_S",
  metric_value = Spend3, # total budget for date_range
  date_range = "last_10" # last 10 periods
)
This shows:
10 € spend over 10 days results in input_immediate of 1 € per day.
input_carryover is still mainly affected by the spend from before the selected 10-day period, but it converges towards a very low level driven by the low input_immediate values.
from robyn.
This shows:
10 € spend over 10 days results in input_immediate of 1 € per day.
If you're using the response function separately, you should input 100$ as total budget for the date range period, as the function describes.
The integration in the allocator will prescale the 100$ to the right level before running it.
from robyn.
@gufengzhou The difference shows when you have wider period ranges (we used last_52); with this implementation you achieve less dramatic differences when optimizing (fewer corner solutions).
from robyn.
If you're using the response function separately, you should input 100$ as total budget for the date range period, as the function describes.
The integration in the allocator will prescale the 100$ to the right level before running it.
@gufengzhou So, I just ran the budget_allocator in debug mode (maximize response for last 10 weeks) and set a browser() just before robyn_response() is called in the budget_allocator:
I then ran the robyn_response() part for i <- 1, which corresponds to facebook_S.
The total spend for the last 10 weeks is: 98195.66 €
initSpendUnit[1] is equal to 9819.566 €
The resp output looks like this:
Again, the input_immediate is now 981.95 € per week, just like in the example with 100 €, 10 € and 1 €. I don't see that the value is scaled to the right level anywhere. Am I missing something?
from robyn.
initSpendUnit[1] is equal to 9819.566 €
Thanks for the check. I'll check later and fix it if it's a bug.
from robyn.
@m4x3 hey, I just looked into it and it looks fine. initSpendUnit[1] is only the initial spend for 1 channel; if you do sum(initSpendUnit), you'll get the weekly total for all channels.
total spend for the last 10 weeks is: 98195.66 €
Also for this, the 98k is for all channels for 10 weeks. When looking at resp you only look at 1 media var at a time.
from robyn.
Feel free to reopen if there're more questions.
from robyn.
Hi @gufengzhou ! Thanks for checking this again.
Maybe my wording was not 100% precise: I meant that the total spend for facebook_S amounts to 98195.66 € for the last 10 weeks. --> see screenshot:
I am therefore still not convinced that this is not a bug:
In my example, initSpendUnit[1] is the initial spend for facebook_S and it amounts to 9819.566 €. This makes sense, it is the average spend for facebook_S across the last 10 weeks.
However, now you are calling robyn_response() with this argument.
robyn_response() now simulates the response and the avg. carryover effect from spending a total of 9819.566 € over 10 weeks (because there is also a date_range argument in robyn_response()). But this is not what happened in the last 10 weeks for facebook_S. In reality you spent 10 times the amount.
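The arithmetic behind this report can be written out as a small sketch in plain R. It assumes, as described earlier in the thread, that robyn_response() treats metric_value as the total for date_range and spreads it evenly over the periods. The variable names are hypothetical; the figures are the ones quoted above.

```r
n_periods <- 10
total_spend_window <- 98195.66   # facebook_S total over the last 10 weeks

# Average weekly spend, i.e. what the allocator stores as initSpendUnit[1]
init_spend_unit <- total_spend_window / n_periods

# If metric_value is read as a total for date_range, it is divided again
immediate_if_unit_passed <- init_spend_unit / n_periods
immediate_if_total_passed <- total_spend_window / n_periods

print(c(
  init_spend_unit = init_spend_unit,
  immediate_if_unit_passed = immediate_if_unit_passed,
  immediate_if_total_passed = immediate_if_total_passed
))
```

Under that assumption, passing the per-period average yields an immediate spend of about 981.96 € per week, in line with the reported input_immediate, while passing the window total would reproduce the actual weekly level of 9819.566 €.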
from robyn.