Applying machine learning models to predict loan outcomes
LendingClub is a US company and the world's largest peer-to-peer lending platform. In this project, I build machine learning models to predict the probability that a loan on LendingClub will charge off (default). These models could help LendingClub investors make better-informed investment decisions.
I chose which features to include in the models based on their significance. Most of the work is cleaning the data! The modeling process has several steps: removing loan features that have significant missing data or that aren't known to investors; exploring, transforming, and visualizing the data; creating dummy variables for categorical features; and fitting three models: logistic regression, random forest, and k-nearest neighbors.
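The cleaning steps above can be sketched roughly as follows. This is a minimal illustration using pandas, not the project's actual code: the toy data, every column name, the `prepare` helper, and the 30% missing-data threshold are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the LendingClub data. All column names and values are hypothetical.
loans = pd.DataFrame({
    "loan_amnt": [5000, 12000, 8000, 20000, 15000, 7000],
    "term": ["36 months", "60 months", "36 months",
             "60 months", "36 months", "36 months"],
    "home_ownership": ["RENT", "OWN", "MORTGAGE", "RENT", "MORTGAGE", "OWN"],
    "mostly_missing": [np.nan, np.nan, 1.0, np.nan, np.nan, np.nan],
    # Payment totals are only known after the loan is issued.
    "total_pymnt": [5400.0, 3000.0, 8600.0, 4000.0, 16000.0, 7500.0],
    "loan_status": ["Fully Paid", "Charged Off", "Fully Paid",
                    "Charged Off", "Fully Paid", "Fully Paid"],
})


def prepare(df, max_missing=0.3, post_issue_cols=("total_pymnt",)):
    """Return (X, y) after the cleaning steps; thresholds are illustrative."""
    # 1. Drop features where more than max_missing of the values are missing.
    missing_frac = df.isna().mean()
    df = df.drop(columns=missing_frac[missing_frac > max_missing].index)
    # 2. Drop features an investor would not know when choosing a loan.
    df = df.drop(columns=[c for c in post_issue_cols if c in df.columns])
    # 3. Binary target: 1 if the loan charged off, 0 otherwise.
    y = (df["loan_status"] == "Charged Off").astype(int)
    X = df.drop(columns="loan_status")
    # 4. Create dummy variables for the categorical features.
    X = pd.get_dummies(X, drop_first=True)
    return X, y


X, y = prepare(loans)
print(list(X.columns))
```

Passing drop_first=True drops one level per categorical feature, which avoids perfectly collinear dummy columns in the logistic regression. Whether the original project did this is not stated.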
Simple is sometimes best! In this case, logistic regression seemed to perform the best of the three models.
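A comparison like this could be run along the following lines. This is only a sketch using scikit-learn on synthetic data, so its scores say nothing about the real loans. The cross-validated ROC AUC metric, the hyperparameters, and the scaling step are assumptions, since the evaluation setup is not described here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the loan data (roughly 20% positives).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.8, 0.2], random_state=0
)

# The three model families named above; hyperparameters are illustrative.
# Scaling matters for logistic regression and k-nearest neighbors.
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "k-nearest neighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=15)
    ),
}

# Mean 5-fold cross-validated ROC AUC for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    for name, model in models.items()
}

for name, auc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean ROC AUC = {auc:.3f}")
```

ROC AUC is used here because charge-offs are the minority class, so plain accuracy can look good for a model that never predicts a default.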