This project is forked from yitingtsaii/r-based-projects-2019-2021
Survival Analysis, Hidden Markov Model, Stochastic Models & Copula, Bayesian Variable Selection
r-based-projects's Introduction
1. Biomarker Identification (Survival Analysis)
Goal: Segment cervical cancer patients by biomarkers to promote precision medicine
Problem: Identify prognostic biomarkers for cervical cancer using survival analysis
Methods:
Screen out noisy covariates: correlation heatmap, univariate Cox regression, log-rank test
Subset the data by cell types: focusing on squamous cell carcinoma and adenocarcinoma
Find 2 to 3 optimal cut points for each biomarker: maximally selected rank statistics
Select biomarkers and covariates: stepwise Cox regression
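The cut point step can be illustrated with a minimal sketch. It is written in plain Python for illustration only (the project itself is in R) and is not the project's code. It scans candidate cut points of one biomarker and keeps the one with the largest absolute standardized log-rank statistic. The function names, the `min_group` parameter, and the toy data are all assumptions; the sketch finds a single cut point and does not apply the p-value adjustment that maximally selected rank statistics require.

```python
import math


def logrank_z(times, events, in_group):
    """Standardized log-rank statistic for in_group (True) versus the rest."""
    observed_minus_expected = 0.0
    variance = 0.0
    # Loop over the distinct times at which at least one event occurred.
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        n = sum(1 for u in times if u >= t)  # at risk, both groups
        n1 = sum(1 for u, g in zip(times, in_group) if u >= t and g)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        d1 = sum(1 for u, e, g in zip(times, events, in_group)
                 if u == t and e == 1 and g)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 given the margins.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0
    return observed_minus_expected / math.sqrt(variance)


def best_cutpoint(marker, times, events, min_group=2):
    """Return (cut, z) maximizing |z| over splits marker <= cut vs marker > cut."""
    best_cut, best_z = None, 0.0
    for cut in sorted(set(marker))[:-1]:
        low = [m <= cut for m in marker]
        if min(sum(low), len(low) - sum(low)) < min_group:
            continue  # skip splits that leave a group too small
        z = logrank_z(times, events, low)
        if abs(z) > abs(best_z):
            best_cut, best_z = cut, z
    return best_cut, best_z


# Made-up toy data: low marker values go with long survival times.
marker = [1, 2, 3, 4, 5, 6, 7, 8]
times = [20, 18, 22, 19, 5, 4, 6, 3]
events = [1, 1, 1, 1, 1, 1, 1, 1]  # 1 = event observed, 0 = censored
cut, z = best_cutpoint(marker, times, events)
print(cut, round(z, 2))
```

A negative `z` means the low-marker group had fewer events than expected under the null hypothesis, i.e. better survival.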
2. Material Price Prediction (Hidden Markov Model)
Goal: Anticipate future material price trends to improve inventory control planning
Problem: Predict future material prices by Hidden Markov Model (HMM)
Methods for HMM:
Build 3 hidden states: representing the low, medium, and high status
Assign a normal distribution to each state: the mean is either a fixed constant or given by a linear regression
Estimate parameters and decode hidden states: forward-backward algorithm, Viterbi algorithm (using a Bayesian approach with RStan)
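As a rough illustration of the decoding step, here is a minimal Viterbi sketch for a 3-state HMM with normal emissions. It is written in plain Python for illustration only (the project itself uses R and RStan). All parameter values are made up and held fixed, whereas the project estimates them with a Bayesian approach. Only the fixed-mean case is shown, and the function names are assumptions.

```python
import math


def normal_logpdf(x, mean, sd):
    """Log density of N(mean, sd^2) at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi(obs, log_init, log_trans, means, sds):
    """Most likely hidden state path for a Gaussian-emission HMM."""
    k = len(means)
    score = [log_init[s] + normal_logpdf(obs[0], means[s], sds[s]) for s in range(k)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for x in obs[1:]:
        prev = score
        pointers, score = [], []
        for s in range(k):
            best_prev = max(range(k), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best_prev)
            score.append(prev[best_prev] + log_trans[best_prev][s]
                         + normal_logpdf(x, means[s], sds[s]))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(k), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical parameters: states 0, 1, 2 stand for low, medium, high prices.
means = [10.0, 20.0, 30.0]
sds = [2.0, 2.0, 2.0]
log_init = [math.log(1 / 3)] * 3
stay, move = math.log(0.9), math.log(0.05)
log_trans = [[stay if i == j else move for j in range(3)] for i in range(3)]

prices = [9.5, 10.4, 11.0, 19.2, 20.5, 21.1, 29.0, 30.7]  # made-up series
print(viterbi(prices, log_init, log_trans, means, sds))
# -> [0, 0, 0, 1, 1, 1, 2, 2]
```

Working in log space avoids numerical underflow on longer price series.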
3. Bond/Stock Return Simulation (Stochastic Models & Copula)
Goal: Generate economic scenarios to help determine the optimal declared interest rate
Problem: Simulate future bond and stock returns using stochastic models and copulas
Methods:
Simulate future bond return: Hull-White Model (short rate -> bond price -> bond return)
Simulate future stock return: Geometric Brownian Motion
Capture the dependence between bond and stock returns: copulas (Gaussian, t, Archimedean)
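The way a copula links the two return series can be sketched as follows. This is plain Python for illustration only (the project itself is in R), and every parameter value is made up. A Gaussian copula produces dependent uniforms, which are then pushed through each marginal's inverse CDF. The stock marginal is the one-step log return implied by Geometric Brownian Motion. The bond marginal is a placeholder normal distribution standing in for the Hull-White pipeline, which is not implemented here.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def clamp(u, eps=1e-12):
    """Keep u strictly inside (0, 1) so inv_cdf is always defined."""
    return min(max(u, eps), 1.0 - eps)


def gaussian_copula_pair(rho, rng):
    """Draw (u1, u2) from a bivariate Gaussian copula with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return clamp(STD_NORMAL.cdf(z1)), clamp(STD_NORMAL.cdf(z2))


def simulate_returns(n, rho, mu, sigma, dt, bond_mean, bond_sd, seed=0):
    """Return n (bond_return, stock_log_return) pairs linked by the copula."""
    rng = random.Random(seed)
    # GBM: log(S_{t+dt} / S_t) ~ N((mu - sigma^2 / 2) dt, sigma^2 dt)
    stock_marginal = NormalDist((mu - 0.5 * sigma ** 2) * dt, sigma * math.sqrt(dt))
    # Placeholder marginal (assumption), not a Hull-White bond return.
    bond_marginal = NormalDist(bond_mean, bond_sd)
    pairs = []
    for _ in range(n):
        u_bond, u_stock = gaussian_copula_pair(rho, rng)
        pairs.append((bond_marginal.inv_cdf(u_bond), stock_marginal.inv_cdf(u_stock)))
    return pairs


def sample_correlation(pairs):
    """Pearson correlation of the simulated pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical monthly scenario set.
scenarios = simulate_returns(n=5000, rho=0.5, mu=0.06, sigma=0.2, dt=1 / 12,
                             bond_mean=0.002, bond_sd=0.01, seed=1)
print(round(sample_correlation(scenarios), 2))
```

Replacing `gaussian_copula_pair` with a t or Archimedean sampler changes the dependence structure, for example the tail dependence, while leaving both marginals untouched.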
4. Bayesian Variable Selection & GDP Forecast (EMVS & Regression)
Goal: Construct proper multiple linear regression models to forecast GDP
Problem: Select significant input variables by Expectation Maximization Variable Selection (EMVS)
Methods for EMVS:
Set the prior distribution for the regression coefficients: a hierarchical "spike-and-slab" Gaussian mixture prior, with a binary latent variable indicating whether each coefficient is drawn from the spike or the slab
Extract information from posterior distribution: EM algorithm
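To make the role of the binary latent variable concrete, the sketch below computes the conditional probability that a coefficient belongs to the slab, which is the quantity the E-step of the EM algorithm updates. It is plain Python for illustration only (the project itself is in R). The notation is an assumption, since the README does not fix one: spike variance `sigma2 * v0`, slab variance `sigma2 * v1` with `v0 < v1`, and prior inclusion probability `theta`. The M-step is not shown.

```python
import math


def inclusion_probability(beta, sigma2, v0, v1, theta):
    """P(gamma = 1 | beta) when beta ~ N(0, sigma2 * v1) if gamma = 1 (slab)
    and beta ~ N(0, sigma2 * v0) if gamma = 0 (spike), with P(gamma = 1) = theta.
    """
    # Log of [theta * slab density] / [(1 - theta) * spike density].
    log_odds = (math.log(theta / (1.0 - theta))
                + 0.5 * math.log(v0 / v1)
                + beta * beta / (2.0 * sigma2) * (1.0 / v0 - 1.0 / v1))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical settings: a narrow spike and a wide slab.
for beta in (0.0, 0.2, 0.5, 2.0):
    p = inclusion_probability(beta, sigma2=1.0, v0=0.01, v1=10.0, theta=0.5)
    print(beta, round(p, 3))
```

Coefficients near zero are attributed to the spike, and larger ones to the slab. Thresholding this probability, for example at 0.5, gives one possible variable selection rule.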
Methods for Regression:
Estimate regression coefficients: ordinary least squares (OLS) or a Bayesian approach
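For reference, the OLS estimate solves the normal equations (X'X) b = X'y. The sketch below does this with Gauss-Jordan elimination in plain Python, for illustration only (the project itself is in R). The function name and data are made up, and the Bayesian alternative is not shown.

```python
def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y.

    Each row of X should start with 1.0 if an intercept is wanted.
    Raises ZeroDivisionError when X'X is singular.
    """
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    for col in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Made-up data generated exactly from y = 1 + 2 * x1 - 3 * x2.
X = [[1.0, 0.0, 0.0],
     [1.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [1.0, 1.0, 1.0],
     [1.0, 2.0, 1.0]]
y = [1.0, 3.0, -2.0, 0.0, 2.0]
coefficients = ols(X, y)
print([round(b, 6) for b in coefficients])
```

In a forecasting setting, the columns of `X` would hold the variables retained by the selection step.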