
predictive-modeling-ta--fraud-case-study's Introduction

Predictive Modeling TA: Fraud IEEE case study

Predictive modeling on a fraud dataset. These notebooks were created as part of my role as a predictive modeling teaching assistant for master's students. They complement the course theory on the data science pipeline.

Putting data science theory into practice on unbalanced fraud data in Colab Python notebooks. (The full notebooks that I shared with my students can be found here.)

  • Performing exploratory data analysis (EDA) using NumPy, pandas, Matplotlib, seaborn, SciPy, and Plotly in Python.

  • Exploring the pros and cons of different methods to handle missing data, outliers, and transformations.

    • Handling missing data: dropping missing data, filling with ‘NaN’ or ‘0’, forward- and back-filling, filling with the mode or mean, filling nulls according to the distribution, and interpolating.
    • Transforming the data according to whether the distribution’s skew is positive or negative.
    • Removing outliers according to the quantiles and kurtosis.
  • Feature selection using correlation and mutual information.

  • Handling categorical features using pandas get_dummies (one-hot encoding).

  • Handling unbalanced data by applying SMOTE nested within K-fold cross-validation, and balancing the selection of positive and negative target samples across the cross-validation folds by divided sampling.

  • Applying a Logistic Regression machine learning model (intentionally, for the purpose of exploring the consequences of data handling; a Decision Tree is a better-fitting model for this type of data).

  • Evaluating accuracy, the confusion matrix (precision and recall), AUC (area under the ROC curve), and F1-score.
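The missing-data strategies listed above can be compared side by side on a toy column. This is a minimal sketch rather than the notebooks' own code: the series values and the name `amount` are made up, and "fill by distribution" is assumed here to mean drawing replacements at random from the observed values.

```python
import numpy as np
import pandas as pd

# Toy column with gaps; the values are made up for illustration.
s = pd.Series([1.0, np.nan, 3.0, 3.0, np.nan, 6.0], name="amount")

dropped = s.dropna()                      # drop rows with missing values
zero_filled = s.fillna(0)                 # constant fill
forward = s.ffill()                       # carry the last seen value forward
backward = s.bfill()                      # pull the next seen value backward
mean_filled = s.fillna(s.mean())          # mean of the observed values
mode_filled = s.fillna(s.mode().iloc[0])  # most frequent observed value
interpolated = s.interpolate()            # linear interpolation between neighbours

# Assumed reading of "fill by distribution": sample replacements from the
# observed values so the filled column keeps a similar empirical distribution.
rng = np.random.default_rng(0)
observed = s.dropna().to_numpy()
dist_filled = s.copy()
mask = dist_filled.isna()
dist_filled[mask] = rng.choice(observed, size=mask.sum())
```

Each variant changes the column's mean and variance differently, which is the trade-off the comparison is meant to expose: constant and mean fills shrink the variance, while forward-fill and interpolation assume the row order is meaningful.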
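Nesting SMOTE inside cross-validation means oversampling only the training part of each fold, so that no synthetic point leaks into the data used for scoring. The sketch below illustrates that idea with a small hand-rolled SMOTE and a simple stratified fold splitter written in NumPy. Both helpers (`smote`, `stratified_folds`) and the toy data are hypothetical stand-ins, not the notebooks' code; in practice a library implementation such as the `SMOTE` class from imbalanced-learn would normally be used.

```python
import numpy as np

def smote(X_min, n_new, k=3, rng=None):
    """Create n_new synthetic minority rows by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng() if rng is None else rng
    # Pairwise Euclidean distances within the minority class.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)              # a point is not its own neighbour
    k = min(k, len(X_min) - 1)
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))             # position along the segment
    return X_min[base] + gap * (X_min[picked] - X_min[base])

def stratified_folds(y, n_splits, rng):
    """Deal the shuffled indices of each class round-robin into n_splits folds."""
    folds = [[] for _ in range(n_splits)]
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))
        for i, j in enumerate(idx):
            folds[i % n_splits].append(j)
    return [np.array(f) for f in folds]

rng = np.random.default_rng(0)
# Toy imbalanced data (made up): 90 negatives, 10 positives.
X = np.vstack([rng.normal(0, 1, (90, 2)), rng.normal(3, 1, (10, 2))])
y = np.array([0] * 90 + [1] * 10)

fold_summaries = []
folds = stratified_folds(y, n_splits=5, rng=rng)
for i, val_idx in enumerate(folds):
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    X_tr, y_tr = X[train_idx], y[train_idx]
    # Oversample ONLY the training fold; the validation fold stays untouched.
    n_new = (y_tr == 0).sum() - (y_tr == 1).sum()
    X_syn = smote(X_tr[y_tr == 1], n_new, rng=rng)
    X_bal = np.vstack([X_tr, X_syn])
    y_bal = np.concatenate([y_tr, np.ones(n_new, dtype=int)])
    # A model would be fit on (X_bal, y_bal) and scored on (X[val_idx], y[val_idx]).
    fold_summaries.append(
        (np.bincount(y_bal).tolist(), np.bincount(y[val_idx]).tolist())
    )
```

Each entry of `fold_summaries` pairs the class counts of the balanced training data with those of the untouched validation fold, which keeps the original class ratio. Applying SMOTE before splitting would instead place near-copies of validation points in the training data and inflate the scores.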
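The evaluation metrics listed above all derive from the confusion matrix, except AUC, which depends on how the scores rank positives against negatives. As a rough illustration in plain Python (the labels, scores, and the 0.5 threshold are made up, and the helper names are hypothetical), the sketch below also shows why accuracy alone is misleading on unbalanced data.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the fraud class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    return tp, fp, fn, tn

def metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive is scored above a
    random negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example: 8 legitimate transactions, 2 fraudulent ones.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.1, 0.3, 0.2, 0.4, 0.6, 0.1, 0.7, 0.3]
y_pred = [int(s >= 0.5) for s in scores]

model_metrics = metrics(y_true, y_pred)
always_negative = metrics(y_true, [0] * len(y_true))
auc = roc_auc(y_true, scores)
```

In this toy case the thresholded model and a classifier that never flags fraud both reach an accuracy of 0.8, but the second has a recall of 0. Precision, recall, F1, and AUC expose that difference, which is why they are reported alongside accuracy.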

predictive-modeling-ta--fraud-case-study's People

Contributors

dinbav
