This project develops a machine learning model for online payment fraud detection. It includes exploratory data analysis (EDA) and classification using a decision tree classifier. The dataset contains information about individual transactions, including transaction type, amount, old and new balances, and fraud labels.
- Libraries such as pandas, numpy, plotly, sklearn, matplotlib, and seaborn are imported for data manipulation, visualization, and model training.
- The dataset containing transaction details is loaded.
- Heatmap visualization is used to identify missing values in the dataset.
- Visualization of transaction type distribution.
- Analysis of transaction types and their corresponding amounts.
- Examination of the distribution of fraudulent transactions.
- Distribution analysis of the 'step' column.
- Joint plot visualization of 'step' and 'amount'.
- Countplot of fraudulent transactions based on transaction type.
- Correlation between features and the 'isFraud' column is checked.
- Categorical features are transformed into numerical values.
- 'isFraud' column values are converted to 'No Fraud' and 'Fraud' labels.
- Data is split into training and testing sets.
- A decision tree classifier is trained using the training data.
- Predictions are made using the trained model for sample features.
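
The preprocessing, splitting, training, and prediction steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. The data is a tiny made-up stand-in for the real dataset. The column names `type`, `oldbalanceOrg`, and `newbalanceOrig`, the numeric codes for transaction types, the split ratio, and the random seed are all assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Tiny synthetic stand-in for the real dataset (all values are made up).
# Column names other than 'amount' and 'isFraud' are assumptions.
data = pd.DataFrame({
    "type": ["PAYMENT", "TRANSFER", "CASH_OUT", "PAYMENT",
             "TRANSFER", "CASH_OUT", "PAYMENT", "TRANSFER"],
    "amount": [120.0, 9000.0, 4500.0, 60.0, 7000.0, 300.0, 45.0, 8000.0],
    "oldbalanceOrg": [500.0, 9000.0, 4500.0, 900.0, 7000.0, 1500.0, 200.0, 8000.0],
    "newbalanceOrig": [380.0, 0.0, 0.0, 840.0, 0.0, 1200.0, 155.0, 0.0],
    "isFraud": [0, 1, 1, 0, 1, 0, 0, 1],
})

# Transform the categorical feature into numerical values.
# The specific codes are illustrative, not taken from the project.
type_codes = {"PAYMENT": 1, "TRANSFER": 2, "CASH_OUT": 3}
data["type"] = data["type"].map(type_codes)

# Convert the 0/1 target into readable labels.
data["isFraud"] = data["isFraud"].map({0: "No Fraud", 1: "Fraud"})

features = ["type", "amount", "oldbalanceOrg", "newbalanceOrig"]
X = data[features]
y = data["isFraud"]

# Split into training and testing sets (ratio and seed are assumptions);
# stratify keeps both classes present in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Train the decision tree classifier on the training data.
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Predict for one hypothetical sample: a transfer that empties the account.
sample = pd.DataFrame([[2, 9000.0, 9000.0, 0.0]], columns=features)
prediction = model.predict(sample)
print(prediction)
```

Passing the sample as a DataFrame with the same column names as the training data keeps the feature order explicit and avoids scikit-learn's feature-name warning.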
The decision tree classifier achieves an accuracy above 90% in detecting fraudulent transactions, a promising result for identifying potential fraud in online payments.
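
When reading an accuracy figure for fraud detection, it can help to look at recall for the fraud class as well, since fraudulent transactions are typically a small share of the data. The sketch below uses made-up labels and a hypothetical model that never predicts fraud. It is only meant to show how the two metrics can diverge and does not reproduce this project's results.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score

# Made-up ground truth: 18 legitimate transactions and 2 fraudulent ones.
y_true = ["No Fraud"] * 18 + ["Fraud"] * 2

# Hypothetical predictions from a model that always answers "No Fraud".
y_pred = ["No Fraud"] * 20

# Share of all predictions that are correct.
accuracy = accuracy_score(y_true, y_pred)

# Share of actual fraud cases that the model caught.
fraud_recall = recall_score(y_true, y_pred, pos_label="Fraud", zero_division=0)

# Rows are true classes, columns are predicted classes,
# both in the order given by `labels`.
cm = confusion_matrix(y_true, y_pred, labels=["No Fraud", "Fraud"])

print(f"accuracy: {accuracy:.2f}")
print(f"fraud recall: {fraud_recall:.2f}")
print(cm)
```

In this toy case, accuracy is 0.90 while fraud recall is 0.00, so reporting both gives a fuller picture of how well fraud is actually detected.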