Credit_Risk_Analysis

Tools Used
  • VSCode 1.78
  • Python
    • pandas
    • scikit-learn
    • numpy
    • imblearn

Overview

This project predicts the risk of loan defaults using machine learning techniques. The results cover six models: four resampling techniques and two ensemble classifiers. Model performance is measured using balanced accuracy score, precision, and recall.

The purpose of this analysis is to assess different machine learning models for credit risk prediction. By evaluating the performance of various models, we can determine how effectively each one identifies high-risk and low-risk loans. This analysis provides insight into the strengths and weaknesses of four sampling techniques (Random Oversampling, Cluster Centroid Undersampling, SMOTE Oversampling, and SMOTEENN Combination Sampling) and two classifier models (Balanced Random Forest Classifier and Easy Ensemble Classifier).
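To make the idea behind the first technique concrete, here is a minimal pure-Python sketch of random oversampling: minority-class rows are duplicated at random until every class is as large as the biggest one. It is an illustration only, not the project's code. The function name `random_oversample` and the toy data are hypothetical, and a library implementation such as the one in imblearn offers more options.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=1):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        # Original rows belonging to this class; duplicates are drawn from here.
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 2 high-risk loans among 8 low-risk ones.
rows = [[i] for i in range(10)]
labels = ["high_risk"] * 2 + ["low_risk"] * 8

res_rows, res_labels = random_oversample(rows, labels)
print(Counter(res_labels))  # both classes now have 8 rows
```

Undersampling works in the opposite direction by shrinking the majority class, and SMOTE creates synthetic minority rows instead of exact duplicates.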

We will look at the balanced accuracy score, precision, and recall. These metrics will allow us to make an informed decision on which model will best perform a credit risk analysis.
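As a rough sketch of how these metrics relate to a confusion matrix, the snippet below derives them from the Easy Ensemble Classifier matrix reported in the Results. It assumes the scikit-learn layout (rows are actual classes, columns are predicted classes, high risk listed first); the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(matrix):
    """Compute metrics for the first (high-risk) class of a 2x2 confusion
    matrix laid out as [[TP, FN], [FP, TN]] (assumed layout)."""
    (tp, fn), (fp, tn) = matrix
    precision = tp / (tp + fp)      # share of predicted high-risk that is truly high-risk
    recall = tp / (tp + fn)         # share of actual high-risk loans that were caught
    specificity = tn / (tn + fp)    # recall of the low-risk class
    f1 = 2 * precision * recall / (precision + recall)
    balanced_accuracy = (recall + specificity) / 2  # mean of per-class recall
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
    }


# Confusion matrix reported for the Easy Ensemble Classifier.
easy_ensemble = [[93, 8], [983, 16121]]
metrics = binary_metrics(easy_ensemble)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
# balanced_accuracy rounds to 0.9317, matching the reported score
```

Under that layout assumption, the high-risk precision (0.09), recall (0.92), and F1 score (0.16) in the Results follow from the same four counts.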

Results

| Metric | Random Over Sampler | SMOTE Oversampling | Cluster Centroids | SMOTEENN Sampling | Balanced Random Forest | Easy Ensemble Classifier |
|---|---|---|---|---|---|---|
| Balanced Accuracy Score | 0.6640 | 0.6556 | 0.5455 | 0.6424 | 0.7885 | 0.9317 |
| Confusion Matrix (True/False) | [[72, 29], [6582, 10522]] | [[64, 37], [5514, 11590]] | [[67, 34], [9791, 7313]] | [[71, 30], [7154, 9950]] | [[71, 30], [2153, 14951]] | [[93, 8], [983, 16121]] |
| Precision (average / high risk / low risk) | 0.99 / 0.01 / 1.00 | 0.99 / 0.01 / 1.00 | 0.99 / 0.01 / 1.00 | 0.99 / 0.01 / 1.00 | 0.99 / 0.03 / 1.00 | 0.99 / 0.09 / 1.00 |
| Recall (average / high risk / low risk) | 0.62 / 0.71 / 0.62 | 0.68 / 0.63 / 0.68 | 0.42 / 0.66 / 0.43 | 0.58 / 0.70 / 0.58 | 0.87 / 0.70 / 0.87 | 0.94 / 0.92 / 0.94 |
| F1 Score (average / high risk / low risk) | 0.76 / 0.02 / 0.76 | 0.80 / 0.02 / 0.81 | 0.59 / 0.01 / 0.60 | 0.73 / 0.02 / 0.73 | 0.93 / 0.06 / 0.93 | 0.97 / 0.16 / 0.97 |

Summary

Performance varies widely between the models. The Easy Ensemble Classifier and the Balanced Random Forest Classifier are the best-performing models, with balanced accuracy scores of .9317 and .7885 respectively. Both have average F1 scores above .9 (.97 and .93). These averages are dominated by the much larger low-risk class, though: the high-risk F1 scores are only .16 and .06, because high-risk precision is low (.09 and .03).
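The "average" figures in the Results are consistent with averages weighted by class size, which is an assumption here since the source does not state how they were computed. The sketch below reproduces the Easy Ensemble Classifier's average F1 score from its per-class scores and the class counts implied by its confusion matrix, and contrasts it with an unweighted mean. The function name `weighted_average` is hypothetical.

```python
def weighted_average(scores, supports):
    """Average per-class scores, weighting each by its number of samples."""
    total = sum(supports)
    return sum(score * n for score, n in zip(scores, supports)) / total


# Class sizes from the Easy Ensemble confusion matrix [[93, 8], [983, 16121]].
supports = [93 + 8, 983 + 16121]   # 101 high-risk loans, 17104 low-risk loans
f1_scores = [0.16, 0.97]           # reported high-risk and low-risk F1 scores

weighted = weighted_average(f1_scores, supports)
unweighted = sum(f1_scores) / len(f1_scores)

print(f"weighted average F1:   {weighted:.2f}")    # 0.97, matching the reported average
print(f"unweighted average F1: {unweighted:.3f}")  # 0.565
```

Because low-risk loans outnumber high-risk loans by roughly 170 to 1, the weighted average mostly reflects low-risk performance, so the per-class high-risk scores are worth reading alongside it.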

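The four resampling approaches (Random Oversampling, Cluster Centroid Undersampling, SMOTE Oversampling, and SMOTEENN Combination Sampling) have lower balanced accuracy scores, between .5455 and .6640, and lower average recall, between .42 and .68. Their high-risk precision is .01 in every case.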

Overall, the Easy Ensemble Classifier is best able to predict credit risk. By combining multiple weak learners, it outperformed the other models on balanced accuracy, precision, recall, and F1 score. Its high recall for both high-risk (.92) and low-risk (.94) loans makes it our recommendation for credit risk assessment.

Contributors

  • ljd0
