
quests's Introduction

Quests

Repository for quests.kasa.ai, which maintains a listing of current projects.


Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

quests's People

Contributors

kevinykuo, ryanbthomas


quests's Issues

P&C pricing tutorial in R

There is quite a bit of information on P&C ratemaking available, e.g. the Werner and Modlin text and the CAS monograph on GLM; however, there doesn't seem to be a tutorial for doing a ratemaking project end-to-end with R. The goal of this proposed project is to produce a tutorial or series of blog posts that emulates what actuaries do (at least the technical bits) in a typical pricing project, from preparing raw data to filing.

This work will be conducted in a public GitHub repo and open to contributions throughout.

Currently recruiting

  • Actuaries with practical experience in ratemaking

Machine learning methods for shock lapse modeling

This project investigates using machine learning methods to predict shock lapses at the end of the level term in term life insurance for assumption setting and risk management. Methods from GLM to deep neural networks will be benchmarked. Recruiting is complete and work is underway. The repository will be made public when the first preprint is available in January.

Evaluation of outlets

This proposed task would evaluate the actuarial and insurance journals and make recommendations as to where to send papers. Things to consider may include

  • Rankings by tenure committees, as we want to encourage early-career researchers to collaborate,
  • Open access,
  • Turnaround time,
  • Biases toward theoretical vs. empirical studies, and
  • Reach.

The deliverable should be a document that lives in this repo and informs researchers making these tradeoffs. For example, anecdotally, Variance is read by more practitioners in the US but has a huge backlog of accepted but not yet published papers; MDPI's Risks is open access and has faster turnaround but less prestige than IME, which may take months for each review round; etc.

Looking for someone to take the lead on this.

Workers compensation rating

Workers comp rating plans vary by state. Many states use the NCCI rating plan, but others have their own bespoke plans. They all share some common features. My proposal is to create a unified R interface to these rating plans so that actuaries can

  • have an easy way to re-rate portfolios of policies given an updated rating plan, and
  • have an independent way to validate the implementation of the rating plan in their company's systems.

I started collecting information for this several years ago, but I didn't get very far: wcratr
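As a rough sketch of what a unified rating-plan interface could look like (in Python for illustration, although the repo's projects target R; every name, the class code's rate, and the simplified premium formula here are invented, not any state's actual plan):

```python
from dataclasses import dataclass

@dataclass
class Policy:
    class_code: str
    payroll: float
    experience_mod: float

class SimplifiedStatePlan:
    """Toy stand-in for one state's rating plan; real plans have many
    more components (schedule rating, premium discounts, etc.)."""
    def __init__(self, class_rates):
        self.class_rates = class_rates  # made-up rates per $100 of payroll

    def rate(self, policy):
        base = self.class_rates[policy.class_code] * policy.payroll / 100
        return base * policy.experience_mod

def rerate_portfolio(plan, policies):
    """Re-rate a portfolio of policies under a (possibly updated) plan."""
    return [plan.rate(p) for p in policies]

plan = SimplifiedStatePlan({"8810": 0.50})          # hypothetical clerical rate
portfolio = [Policy("8810", 1_000_000, 1.25)]
print(rerate_portfolio(plan, portfolio))            # [6250.0]
```

Swapping in a second `SimplifiedStatePlan` with updated rates and calling `rerate_portfolio` again is the "re-rate given an updated plan" use case; comparing against a carrier system's output is the validation use case.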

Revamp

The README is way out of date and needs a revamp. Maybe this is better off as a simple web page deployed from the repo. Will get something up in the next week.

Interpretability of ML models in pricing

It is widely accepted that modern machine learning (ML) techniques, such as gradient boosted trees and deep neural networks, have better predictive performance than traditional predictive modeling techniques such as generalized linear models (GLMs). However, one major obstacle to moving beyond GLMs in pricing is the perceived lack of interpretability of the ML techniques. This project will investigate techniques for model interpretability in the context of regulated environments in the US.

The deliverable of this project will include a research paper and potentially contributions to existing open source packages.
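One model-agnostic technique the project would likely cover is permutation importance: shuffle one feature's column and measure how much the model's error degrades. A minimal sketch, with a toy linear "model" and made-up data:

```python
import random

def mse(model, X, y):
    """Mean squared error of a callable model over rows of X."""
    return sum((model(row) - yi) ** 2 for row, yi in zip(X, y)) / len(y)

def permutation_importance(model, X, y, feature_idx, n_repeats=10, seed=0):
    """Average increase in MSE when one feature's column is shuffled.
    A large increase means the model leans on that feature."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    increases = []
    for _ in range(n_repeats):
        col = [row[feature_idx] for row in X]
        rng.shuffle(col)
        X_perm = [row[:feature_idx] + [v] + row[feature_idx + 1:]
                  for row, v in zip(X, col)]
        increases.append(mse(model, X_perm, y) - baseline)
    return sum(increases) / n_repeats

# Toy model: depends only on feature 0, ignores feature 1.
model = lambda row: 3 * row[0]
X = [[float(i), float(i % 5)] for i in range(50)]
y = [3 * row[0] for row in X]

print(permutation_importance(model, X, y, 0) > permutation_importance(model, X, y, 1))
# True: shuffling the used feature hurts far more than the ignored one
```

The open question for this project is less the mechanics than whether explanations like these satisfy US regulators in a rate filing.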

Currently recruiting

  • Pricing actuary with experience in predictive modeling and filing rates in different US jurisdictions.

Loss Triangles from pdf

Create a tool (website?) that leverages Google's and/or Microsoft's OCR APIs to extract loss triangle data out of PDFs. A tool to extract data from loss runs would also be useful, but the geometry of loss runs seen in the wild is much more varied.
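Setting the OCR call itself aside, the post-OCR step might look like the sketch below: turning OCR'd text lines into a triangle structure. The input format (year first, then cumulative losses, row order preserved) is an assumption; real OCR output would need far more cleanup.

```python
def parse_triangle(ocr_lines):
    """Parse OCR'd text lines of a cumulative loss triangle into
    {accident_year: [cumulative losses by development age]}."""
    triangle = {}
    for line in ocr_lines:
        tokens = line.replace(",", "").split()  # drop thousands separators
        if not tokens:
            continue
        year, values = int(tokens[0]), [float(t) for t in tokens[1:]]
        triangle[year] = values
    return triangle

# Hypothetical OCR output for a small paid-loss triangle.
ocr_lines = [
    "2016 1,000 1,500 1,650",
    "2017 1,100 1,700",
    "2018 1,200",
]
print(parse_triangle(ocr_lines)[2016])  # [1000.0, 1500.0, 1650.0]
```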

P&C reserving tutorial in R

I want to get all of my thoughts out here. This is probably multiple inter-related projects.

Translate typical actuarial reserving workflow to R

Steps:

  • Data cleaning, reconciliation
  • Data diagnostics (Large losses, claim reopens)
  • Data segmentation, aggregation into reserving cells
  • Creating triangles and selecting development factors
  • Standard actuarial methods and method weights to estimate ultimate loss
  • Creating actuarial report using rmarkdown (bookdown?)

Also want to highlight:

  • Benefits of data science workflow (source control, drake)
  • Value of separating the model from the view (contrast with spreadsheets)
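The "selecting development factors" and "estimate ultimate loss" steps above can be sketched with a basic volume-weighted chain ladder. This uses a toy cumulative triangle and plain Python for illustration; the R tutorial itself would more likely lean on an established package such as ChainLadder.

```python
triangle = {                      # accident year -> cumulative paid by age
    2016: [1000.0, 1500.0, 1650.0],
    2017: [1100.0, 1700.0],
    2018: [1200.0],
}

def age_to_age_factors(tri):
    """Volume-weighted age-to-age factors from a cumulative triangle."""
    n_ages = max(len(v) for v in tri.values())
    factors = []
    for age in range(n_ages - 1):
        num = sum(v[age + 1] for v in tri.values() if len(v) > age + 1)
        den = sum(v[age] for v in tri.values() if len(v) > age + 1)
        factors.append(num / den)
    return factors

def project_ultimates(tri):
    """Develop each accident year's latest diagonal to ultimate."""
    factors = age_to_age_factors(tri)
    ultimates = {}
    for year, vals in tri.items():
        ult = vals[-1]
        for f in factors[len(vals) - 1:]:
            ult *= f
        ultimates[year] = ult
    return ultimates

print(project_ultimates(triangle))
```

In the full workflow, factor selection would be judgmental rather than purely volume-weighted, which is exactly where separating the model from the view pays off.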

Extending the Typical reserving workflow

  • Show how easy it is to add new methods to this workflow (contrast with spreadsheets)
  • Triangle methods (Mack, bootstrapping)
  • Triangle-free methods (Parodi)
  • Stochastic reserving

Simulated data

  • Use the CAS Loss Simulator to create data for reserving analyses. It was clearly created for someone using an Excel front-end; an R-first version, possibly with extended functionality, would be worthwhile.

Reserving Game Research Project

  • Create a shiny app which presents the user with a reserving exercise.
  • Ask practitioners to complete the exercise of setting a point estimate and reserve range.
  • Analyze variation in methodologies and estimates, and compare to simulated results (assuming reserving is done on simulated data).
  • Could also use the results of the survey (or just the game itself) to provide learning and practice opportunities for junior actuaries. (Useful for actuarial exams?)

Model deployment

This is meant to be pretty general, but let's take an underwriting model for a personal auto carrier as an example. The starting point is a trained u/w score model and decision rules on whether to accept, deny, or defer to a human for each auto insurance application. We want to expose this model as a service so apps/websites can hit it.

This is impactful/important because we need to demystify the operationalization process, and data scientists (incl. actuaries) should be familiar with the entire stack even if their day-to-day isn't orchestrating containers.
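The decision layer sitting in front of the trained score model might look like the sketch below. The thresholds and the stand-in scoring function are invented for illustration; in production this logic would sit behind an HTTP endpoint (e.g. plumber for a native R model, or TF Serving for a saved TF model, as listed under the flavors below).

```python
ACCEPT_BELOW = 0.3   # hypothetical cutoffs on a 0-1 risk score
DENY_ABOVE = 0.7

def score(application):
    """Stand-in for the trained u/w model; returns a 0-1 risk score."""
    return min(1.0, application["prior_accidents"] * 0.25)

def decide(application):
    """Map a risk score to the accept / deny / defer-to-human decision."""
    s = score(application)
    if s < ACCEPT_BELOW:
        return "accept"
    if s > DENY_ABOVE:
        return "deny"
    return "defer_to_human"

print(decide({"prior_accidents": 0}))  # accept
print(decide({"prior_accidents": 2}))  # defer_to_human
```

Keeping the decision rules in a pure function like this also makes them trivially unit-testable, independent of whatever serving stack is chosen.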

Initial assumptions/requirements

  • A couple of flavors:
    • Saved TF model with TF Serving
    • Native R model with plumber
  • Ideally we'd have a solution that's cloud vendor independent
  • Docker + k8s; this should be scalable to the top insurers
  • Some basic CI/CD mechanism
  • Some basic authentication

Results

  • Code
  • Blog post
