
Comments (3)

CJ2407 commented on June 2, 2024

A few more explanations would help to use the outputs properly:

  1. The report_alldecomp_matrix and pareto_alldecomp_matrix CSVs have different totals for the refresh solnID. Why? All 396 rows of my refresh data appear duplicated in the report_alldecomp_matrix CSV under the same refresh solnID.

  2. The % decomp in the report_aggregated CSV does not match the report_decomposition plot for the initial model. Why? The report_decomposition plot's percentages appear to match pareto_aggregated from the initial model build.

  3. The initial model's pareto_aggregated CSV shows different % attribution than the initial-model attribution generated in the refresh model's report_aggregated CSV. Why? Also, what time period do the 3_60_31 results cover in a refresh run? The refresh model 1_15_10 results cover a rolling 1185 days starting 3/3/2020. See below:
    [screenshot: % attribution comparison between initial and refresh models]

  4. I am seeing some very big swings between the initial model and the refresh model (see the screenshot above). When most of my hyperparameters are fixed, why would this happen? How do I explain or validate these swings?


gufengzhou commented on June 2, 2024

Sorry for the late reply. The team is short on resources at the moment, and refresh is quite a complicated feature to fix. Let me address the issues here:

  1. You can export your own plots by accessing the refresh object RobynRefresh$refresh$plots$pBarRF (see the export sketch after this list). We've also updated the package so that the plot background is no longer transparent. Please update and try again.
  2. Sorry to hear about the missing JSON. A refresh model iterates its hyperparameters based on the previous model, which is the whole trick of refreshing: keeping the balance between consistency and sensitivity in the decomposition. For example, assume the selected initial model covers weeks 1-100, and channel A has theta = 0.2 within an original range of c(0.1, 0.4). If you then refresh after 4 weeks, the refresh model is built on weeks 5-104 (refresh_steps = 4), and theta for channel A gets a new range centered on 0.2 +/- (0.4 - 0.1) * (4/100) / 2, which gives c(0.194, 0.206). As you can see, we allow only narrow ranges for each hyperparameter in the refresh to ensure consistency between refreshes. BUT when you refresh over larger periods, the refresh ranges become larger too, leading to more diverse results. It's a fine line between refresh and rebuild. Usually we encourage model refresh when the steps are small. For larger periods and/or major changes, such as new media channels or new variables, we recommend rebuilding the model. (A sketch of this narrowing arithmetic follows below.)
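On point 1, here is a minimal sketch of exporting the refresh plot yourself, assuming RobynRefresh is the object returned by robyn_refresh() and that pBarRF is a ggplot object; the output path is illustrative:

```r
library(ggplot2)

# Pull the refresh bar plot out of the refresh object (point 1 above)
p_refresh <- RobynRefresh$refresh$plots$pBarRF

# Save with an explicit white background rather than a transparent one;
# "refresh_bars.png" is just an example path
ggsave("refresh_bars.png", plot = p_refresh, width = 12, height = 8, bg = "white")
```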
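And to make the narrowing rule in point 2 concrete, a small base-R sketch of the arithmetic as stated above; it mirrors the described formula, not Robyn's internal implementation:

```r
# Narrow an initial hyperparameter range for a refresh, per the rule above:
# new half-width = (upper - lower) * (refresh_steps / total_weeks) / 2
refresh_range <- function(value, lower, upper, refresh_steps, total_weeks) {
  half_width <- (upper - lower) * (refresh_steps / total_weeks) / 2
  # Clamp to the original bounds as a sanity check
  c(max(lower, value - half_width), min(upper, value + half_width))
}

# Channel A example: theta = 0.2 in c(0.1, 0.4), refreshing 4 of 100 weeks
refresh_range(0.2, lower = 0.1, upper = 0.4, refresh_steps = 4, total_weeks = 100)
#> [1] 0.194 0.206
```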


gufengzhou commented on June 2, 2024

To your questions 3-6, which are all about inconsistency across the initial and refresh models: I just ran a quick 4-step refresh test with the sample data. In the first screenshot, the refresh pareto_aggregated.csv on the right side (the recreated model) has slightly different coefficients than the original/initial CSV.

[screenshot: pareto_aggregated.csv coefficients, original/initial model (left) vs. recreated model (right)]

I've looked into it, and this is actually due to rounding when exporting the JSON files. We can definitely fix that to allow more precise model recreation. But we're not off by much here, as you can see below.
[screenshot: initial vs. recreated model comparison showing small differences]
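For a sense of scale, here is a toy base-R example of the perturbation that 4-digit rounding introduces; the value is hypothetical, not taken from the test above:

```r
# A hypothetical full-precision hyperparameter vs. its 4-digit JSON export
theta_full    <- 0.20061234
theta_rounded <- round(theta_full, 4)  # 0.2006, as written to the JSON

# Relative perturbation introduced by the export rounding
abs(theta_rounded - theta_full) / theta_full
#> [1] 6.151167e-05   # roughly a 0.006% relative change
```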

In your case, the difference is of course very large, and I assume it's due to missing hyperparameter values in your model-recreation process: besides the "usual" media hyperparameters (theta/alpha/gamma/shape/scale), there's also lambda (and train_size if you use ts_validation). In the third screenshot below, you can see my tested model JSON 1_134_6, which has the session hyper_values, incl. lambda and train_size (you can also see the 4-digit rounding here). You can find the selected lambda value in pareto_hyperparameters.csv too. Both lambda and train_size have a strong impact on beta coefficient estimation, so it's not surprising that the results came out differently. This means that if you also fix lambda/train_size, you should get close-to-identical recreated models. But of course the best way is to always remember to export the JSON :) (A sketch of this workflow follows after the screenshot.)
[screenshot: model JSON 1_134_6 hyper_values, incl. lambda and train_size, showing 4-digit rounding]
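For completeness, a minimal sketch of the export-and-recreate workflow this implies, using the standard Robyn functions robyn_write() and robyn_recreate(); InputCollect/OutputCollect are assumed to come from an earlier robyn_inputs()/robyn_run(), and the data and paths shown are the package's demo ones, so adjust to your case:

```r
library(Robyn)

# Export the selected model to JSON right after selection; the JSON carries
# all hyper_values, including lambda and train_size
robyn_write(InputCollect, OutputCollect, select_model = "1_134_6")

# Later, recreate the model from that JSON plus the original input data
RobynRecreated <- robyn_recreate(
  json_file   = "./RobynModel-1_134_6.json",
  dt_input    = dt_simulated_weekly,
  dt_holidays = dt_prophet_holidays
)
```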

