explain-ml-pricing's People

Contributors

daniellupton, kevinykuo

explain-ml-pricing's Issues

US/CAS-centricity in current writeup

While we're targeting Variance, it would help the broader community parse the paper if we clarified that the exams and ASOPs we reference are CAS/US-specific. Even better if we can cite some international societies' efforts.

Reconcile rate relativities vs. ML

From the ratemaking section

3. Perform analysis on the data, employing the desired method or methods to estimate
needed rate relativities
4. Select final rate relativities based on rate indications
5. Present rates to the regulator, including an explanation of the steps followed to derive
the rates
6. Answer questions from regulators regarding the method employed

The focus of this paper is on steps 5 and 6.

I don't think we can get at relativities with ML models, so maybe we need to revise this bit to say that steps 4-5 will be different in a world with ML models. It may also make sense to point out somewhere that we'll be using ML as a drop-in replacement for the GLM, rather than for feature engineering only.

Standardized Set of Questions

This is an open-ended question that I mentioned on our call. Is it realistic to assume that there could be a single standardized set of questions that, properly answered, could reasonably qualify any model?

For example, suppose the question were simply:
"Demonstrate that the rates produced by the model are not inadequate, excessive, or unfairly discriminatory."

That would technically address any concern, insofar as a successful answer would mean the model could be approved. Realistically, though, it would probably not produce good regulatory outcomes, since modelers wouldn't really know how to answer it.

What I mean to say is that, after reading the paper over, I prefer to think of the "question and answer" framework in terms of a set of idealized questions, and that a perfect set of questions covering every model may not exist. My sense is that no matter how detailed and varied your questions, you'll occasionally see a model that leaves you with additional questions.

That is, I see the question-and-answer thing more as a metaphor for how the actuary should conceptualize the requirement to communicate to intended users of a model than as a suggestion to actually come up with a list of specific questions.

If you agree (and I suppose that could be a big if), then we may want to consider rewording some of the language around the "standardized set of questions" accordingly.

ML models and "deterministic"

Because many machine learning models are deterministic, they may not admit of standard metrics for model comparison (e.g., it’s not straightforward to calculate an AIC over a neural network).

We'll want to reword this, e.g., to "does not assume an underlying stochastic process," since "deterministic" has a different meaning in ML.

Error in R code of "explain.R"

When running the following R code:

  fi <- ingredients::feature_importance(
    explainer_nn,
    loss_function = function(observed, predicted, weights) {
      sqrt(
        sum(((observed - predicted) ^ 2 * weights) / sum(weights))
      )
    },
    weights = testing_data$exposure,
    variables = predictors,
  )

the result is:

  Error in loss_function(observed, predict_function(x, sampled_data)) :
    argument "weights" is missing, with no default

The problem seems to be with loss_function, but I don't know why.
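A likely cause (an assumption based on the ingredients package's API, not confirmed in this thread): feature_importance calls the supplied loss_function with only two arguments, (observed, predicted), so the third weights parameter is never filled in. One possible sketch of a workaround is to capture the weights in a closure so the loss function has the expected two-argument signature:

```r
# Sketch of a possible fix, assuming `explainer_nn`, `testing_data`, and
# `predictors` are defined as in the snippet above.
# Capture the exposure weights in a closure rather than passing them as an
# argument, because feature_importance invokes loss_function(observed, predicted).
w <- testing_data$exposure

weighted_rmse <- function(observed, predicted) {
  sqrt(sum((observed - predicted) ^ 2 * w) / sum(w))
}

fi <- ingredients::feature_importance(
  explainer_nn,
  loss_function = weighted_rmse,
  variables = predictors
)
```

Note this assumes the permutation leaves rows in their original order, so w stays aligned with observed; if rows are resampled, the weights would need to be subset accordingly.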

Questions to pose and answer

Some candidates

  1. How important is each variable to the model?
  2. Given a policy, how do the different characteristics (age, location, make, etc.) contribute to its predicted loss cost?
  3. How does the predicted loss cost change if we change a variable a bit?
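As a hedged sketch (assuming the DALEX/ingredients tooling used elsewhere in the repo, with a hypothetical explainer object), each candidate question maps onto an established explanation technique:

```r
# Assumes `explainer_nn` (a DALEX explainer) and `testing_data` exist, as in
# the explain.R issue above; the object names are illustrative, not from the repo.
policy <- testing_data[1, ]

# 1. How important is each variable? -> permutation feature importance
fi <- ingredients::feature_importance(explainer_nn)

# 2. How do a policy's characteristics contribute to its predicted loss cost?
#    -> additive break-down of a single prediction
bd <- iBreakDown::break_down(explainer_nn, new_observation = policy)

# 3. How does the prediction change if we nudge one variable?
#    -> ceteris paribus (individual conditional expectation) profile
cp <- ingredients::ceteris_paribus(explainer_nn, new_observation = policy)
```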

Citation for GLM being SOTA?

Is it accurate to say GLM is state of the art for risk classification today? Are there sources we can/should cite or does it go without saying?

Risk classification for property & casualty (P&C) insurance rating has traditionally been done with one-way, or univariate, analysis techniques. In recent years, many insurers have moved towards using generalized linear models (GLM), a multivariate predictive modeling technique, which addresses many shortcomings of univariate approaches and is currently considered the state of the art in insurance risk classification. At the same time, machine learning (ML) techniques such as deep neural networks have gained popularity in many industries due to their superior predictive performance over linear models [@lecunDeepLearning2015]. In fact, there is a fast-growing body of literature on applying ML to P&C reserving [@kuoDeepTriangleDeep2018; @wuthrichMachineLearning2018; @gabrielliNeuralNetwork2019a; @gabrielliNeuralNetwork2019]. However, these ML techniques, often considered to be completely "black box", have been less successful in gaining adoption in pricing, which is a regulated discipline and requires a certain amount of transparency in models.

@daniellupton?
