
james-kuo / bayesian-robust-regression


Applied analysis of the Bayesian Student-t "robust" regression model with the Jeffreys prior. Compares its model performance and the robustness of its posterior distributions with those of the Gaussian model when outliers are present.

R 100.00%
bayesian-statistics bayesian bayesian-inference robust-regresssion monte-carlo-simulation markov-chain-monte-carlo gibbs-sampling student-t-regresssion outliers jeffreys-prior

bayesian-robust-regression's Introduction

I implemented a Gibbs sampler in R to fit Bayesian Student-t “robust” regression with the Jeffreys prior to a dataset and compared it with Gaussian linear regression under the same prior. Bayesian Student-t regression uses the heavier-tailed t-distribution as the sampling distribution to address the sensitivity of Gaussian regression to outliers. Artificial noise is added to the dataset under different settings, and the two models are compared on model performance and on how robust the posterior distributions of their parameters are.
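As a rough illustration of the idea, here is a minimal sketch of a Gibbs sampler for Student-t regression. It is not the code from Analysis.R. It assumes the usual scale-mixture-of-normals representation of the t-distribution, a fixed degrees-of-freedom value `nu`, and the prior p(beta, sigma2) ∝ 1/sigma2; the function name `gibbs_t_regression` and the simulated data are hypothetical.

```r
# Sketch only: Gibbs sampler for y = X beta + e, e_i ~ t_nu(0, sigma2),
# written as e_i | lambda_i ~ N(0, sigma2 / lambda_i), lambda_i ~ Gamma(nu/2, nu/2).
# Assumptions (not taken from the repository): nu is fixed, prior is 1 / sigma2.
gibbs_t_regression <- function(X, y, nu = 4, n_iter = 2000, burn_in = 500) {
  stopifnot(burn_in < n_iter)
  n <- nrow(X)
  p <- ncol(X)
  # Start from the least-squares fit
  beta <- as.vector(solve(crossprod(X), crossprod(X, y)))
  sigma2 <- sum((y - X %*% beta)^2) / (n - p)
  draws <- matrix(NA_real_, n_iter, p + 1,
                  dimnames = list(NULL, c(paste0("beta", seq_len(p)), "sigma2")))
  for (iter in seq_len(n_iter)) {
    resid <- as.vector(y - X %*% beta)
    # lambda_i | rest ~ Gamma((nu + 1) / 2, rate = (nu + resid_i^2 / sigma2) / 2)
    lambda <- rgamma(n, shape = (nu + 1) / 2, rate = (nu + resid^2 / sigma2) / 2)
    # beta | rest ~ N(beta_hat, sigma2 * (X' Lambda X)^{-1})
    V <- solve(crossprod(X, lambda * X))
    beta_hat <- V %*% crossprod(X, lambda * y)
    beta <- as.vector(beta_hat + t(chol(sigma2 * V)) %*% rnorm(p))
    # sigma2 | rest ~ Inverse-Gamma(n / 2, sum(lambda * resid^2) / 2)
    resid <- as.vector(y - X %*% beta)
    sigma2 <- 1 / rgamma(1, shape = n / 2, rate = sum(lambda * resid^2) / 2)
    draws[iter, ] <- c(beta, sigma2)
  }
  draws[(burn_in + 1):n_iter, , drop = FALSE]
}

# Simulated example with one large shock at the highest-leverage point
set.seed(1)
n <- 100
x <- runif(n, 0, 10)
y <- 1 + 2 * x + rnorm(n)
y[which.max(x)] <- y[which.max(x)] + 50
X <- cbind(1, x)
draws <- gibbs_t_regression(X, y)
colMeans(draws)    # posterior means under the Student-t model
coef(lm(y ~ x))    # least-squares fit for comparison
```

The latent weights `lambda` are what make the model robust in this sketch: an observation with a large residual tends to draw a small weight, so it contributes less to the conditional distribution of `beta`.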

Ting_Yuan_Kuo_Final_Project.pdf is the paper.

Analysis.R, Analysis2.R, and Analysis3.R are R scripts that support the main Final Project.rmd R Markdown file.

AnalysisMain.RData, Corrupt.RData, Corrupt2.RData, and Corrupt2alt.RData contain the processed datasets.

Some Pictures

I compared the robust and Gaussian models on a dataset with only one independent variable so that the fits can be visualized.

In the left panel, a quadratic term is included in addition to the linear term; in the right panel, only the linear term is present. A lower theta value corresponds to a larger artificial outlier added to the dataset [1]. In the robust model, the regression lines stay much more stable than in the Gaussian regression, no matter how large the outlier is.

Figure: Robust vs. Gaussian Model with Exponential noise

Posterior Distributions

Shown below are the posterior distributions of the regression coefficient when only the linear term is present. As the outlier becomes "larger", the posterior distribution of the robust model changes little, whereas the posterior distribution of the Gaussian model becomes very noisy.

Figure: Posterior Distributions of Robust vs. Gaussian Model

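  1. By "larger outlier" I mean that a larger exponentially distributed shock is added to high-leverage points in the data. theta is the rate parameter, so a lower theta gives a larger expected shock (the mean of an exponential distribution is 1/theta).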
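The corruption step described in the footnote could look roughly like the sketch below. The details are assumptions for illustration, not the procedure used to build Corrupt.RData: the helper name `corrupt_high_leverage`, the choice of the `k` highest-leverage points via `hatvalues()`, and the simulated data are all hypothetical.

```r
# Sketch only: add exponentially distributed shocks to the k observations
# with the highest leverage. k and the use of hatvalues() are assumptions.
corrupt_high_leverage <- function(x, y, theta, k = 3) {
  lev <- hatvalues(lm(y ~ x))                       # leverage of each observation
  idx <- order(lev, decreasing = TRUE)[seq_len(k)]  # k highest-leverage points
  y_corrupt <- y
  y_corrupt[idx] <- y[idx] + rexp(k, rate = theta)  # shock with mean 1 / theta
  list(y = y_corrupt, idx = idx)
}

set.seed(2)
x <- runif(50, 0, 10)
y <- 1 + 2 * x + rnorm(50)
small <- corrupt_high_leverage(x, y, theta = 1)     # expected shock of 1
large <- corrupt_high_leverage(x, y, theta = 0.01)  # expected shock of 100
```

Because leverage in a simple regression depends only on `x`, both calls shock the same observations; only the size of the shock changes with theta.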

