
Comments (3)

CJ2407 commented on June 2, 2024

A few more explanations would help to use the outputs properly:

  1. The report_alldecomp_matrix and pareto_alldecomp_matrix CSVs have different totals for the refresh solnID. Why? All 396 rows of my refresh data appear duplicated in the report_alldecomp_matrix CSV under the same refresh solnID.

  2. The % decomp in the report_aggregated CSV does not match the report_decomposition plot for the initial model. Why? The report_decomposition plot's percentages appear to match pareto_aggregated from the initial model build.

  3. The initial model's pareto_aggregated CSV shows different % attribution than the initial-model attribution generated in the refresh model's report_aggregated CSV. Why? Also, what time period do the 3_60_31 results cover in a refresh run? The refresh model 1_15_10 results cover a rolling 1185 days starting 3/3/2020. See below:
    [screenshot: % attribution comparison between initial and refresh models]

  4. I am seeing some very big swings between the initial model and the refresh model (see the screenshot above). When most of my hyperparameters are fixed, why would this happen? How do I explain or validate these swings?


gufengzhou commented on June 2, 2024

Sorry for the late reply. The team is short on resources at the moment, and refresh is quite a complicated feature to fix. Let me address the issues here:

  1. You can export your own plots by accessing the refresh object RobynRefresh$refresh$plots$pBarRF (see the export sketch after this list). We've also updated the package so that the plot background is no longer transparent. Please update and try again.
  2. Sorry to hear about the missing JSON. A refresh model iterates its hyperparameters based on the previous model, which is the whole trick of refreshing: keeping the balance between consistency and sensitivity in the decomposition. For example, assume the selected initial model covers weeks 1-100, and channel A has theta = 0.2 within an original range of c(0.1, 0.4). If you then refresh after 4 weeks, the refresh model is built on weeks 5-104 (refresh_steps = 4), and theta for channel A gets a new range centered on 0.2 +/- (0.4 - 0.1) * (4/100) / 2, which gives c(0.194, 0.206). As you can see, we allow only narrow ranges for each hyperparameter in the refresh to ensure consistency between refreshes. BUT when you refresh over larger periods, the refresh ranges become larger too, leading to more diverse results. It's a fine line between refresh and rebuild. Usually we encourage model refresh when the steps are small. For larger periods and/or major changes, such as new media channels or new variables, we recommend rebuilding the model. (A sketch of this narrowing arithmetic follows below.)
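On point 1, here is a minimal sketch of exporting the refresh plot yourself, assuming RobynRefresh is the object returned by robyn_refresh() and that pBarRF is a ggplot object; the output path is illustrative:

```r
library(ggplot2)

# Pull the refresh bar plot out of the refresh object (point 1 above)
p_refresh <- RobynRefresh$refresh$plots$pBarRF

# Save with an explicit white background rather than a transparent one;
# "refresh_bars.png" is just an example path
ggsave("refresh_bars.png", plot = p_refresh, width = 12, height = 8, bg = "white")
```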
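And to make the narrowing rule in point 2 concrete, a small base-R sketch of the arithmetic as stated above; it mirrors the described formula, not Robyn's internal implementation:

```r
# Narrow an initial hyperparameter range for a refresh, per the rule above:
# new half-width = (upper - lower) * (refresh_steps / total_weeks) / 2
refresh_range <- function(value, lower, upper, refresh_steps, total_weeks) {
  half_width <- (upper - lower) * (refresh_steps / total_weeks) / 2
  # Clamp to the original bounds as a sanity check
  c(max(lower, value - half_width), min(upper, value + half_width))
}

# Channel A example: theta = 0.2 in c(0.1, 0.4), refreshing 4 of 100 weeks
refresh_range(0.2, lower = 0.1, upper = 0.4, refresh_steps = 4, total_weeks = 100)
#> [1] 0.194 0.206
```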


gufengzhou commented on June 2, 2024

To your questions 3-6, which are all about inconsistency across the initial and refresh models: I just ran a quick 4-step refresh test with the sample data. In the first screenshot, the refresh pareto_aggregated.csv on the right side (the recreated model) has slightly different coefficients than the original/initial CSV.

[screenshot: pareto_aggregated.csv coefficients, original/initial model (left) vs. recreated model (right)]

I've looked into it, and this is actually due to rounding when exporting the JSON files. We can definitely fix that to allow more precise model recreation. But we're not off by much here, as you can see below.
[screenshot: initial vs. recreated model comparison showing small differences]
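For a sense of scale, here is a toy base-R example of the perturbation that 4-digit rounding introduces; the value is hypothetical, not taken from the test above:

```r
# A hypothetical full-precision hyperparameter vs. its 4-digit JSON export
theta_full    <- 0.20061234
theta_rounded <- round(theta_full, 4)  # 0.2006, as written to the JSON

# Relative perturbation introduced by the export rounding
abs(theta_rounded - theta_full) / theta_full
#> [1] 6.151167e-05   # roughly a 0.006% relative change
```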

In your case, the difference is of course very large, and I assume it's due to missing hyperparameter values in your model-recreation process: besides the "usual" media hyperparameters (theta/alpha/gamma/shape/scale), there's also lambda (and train_size if you use ts_validation). In the third screenshot below, you can see my tested model JSON 1_134_6, which has the session hyper_values, incl. lambda and train_size (you can also see the 4-digit rounding here). You can find the selected lambda value in pareto_hyperparameters.csv too. Both lambda and train_size have a strong impact on beta coefficient estimation, so it's not surprising that the results came out differently. This means that if you also fix lambda/train_size, you should get close-to-identical recreated models. But of course the best way is to always remember to export the JSON :) (A sketch of this workflow follows after the screenshot.)
[screenshot: model JSON 1_134_6 hyper_values, incl. lambda and train_size, showing 4-digit rounding]
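For completeness, a minimal sketch of the export-and-recreate workflow this implies, using the standard Robyn functions robyn_write() and robyn_recreate(); InputCollect/OutputCollect are assumed to come from an earlier robyn_inputs()/robyn_run(), and the data and paths shown are the package's demo ones, so adjust to your case:

```r
library(Robyn)

# Export the selected model to JSON right after selection; the JSON carries
# all hyper_values, including lambda and train_size
robyn_write(InputCollect, OutputCollect, select_model = "1_134_6")

# Later, recreate the model from that JSON plus the original input data
RobynRecreated <- robyn_recreate(
  json_file   = "./RobynModel-1_134_6.json",
  dt_input    = dt_simulated_weekly,
  dt_holidays = dt_prophet_holidays
)
```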

