Credit Card Default Prediction Using Machine Learning

Tech Stack: Python, pandas, scikit-learn, SMOTE, Logistic Regression, Decision Trees, Random Forest, XGBoost, Gradient Boosting

This project uses machine learning classification to identify customers who are likely to default on their credit card payments. It aims to help financial institutions manage credit risk more effectively.

Problem Statement

Credit card defaults can cause significant financial losses for banks and lenders. Accurately predicting default risk enables proactive management, reducing losses and improving customer engagement strategies.

Overview

Goal: Predict credit card payment defaults to reduce financial risk
Data: Customer credit history, demographic, and payment behavior features
Models: Logistic Regression, Decision Trees, Random Forest, XGBoost, Gradient Boosting
Best Performance: Random Forest, which achieved the highest ROC AUC score of the five models

Approach

Data Cleaning & Preprocessing: Handled missing values, encoded categorical variables, and scaled features.
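
The preprocessing step above could be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the project's actual code: the column names (`credit_limit`, `age`, `education`), the toy values, and the choice of median and most-frequent imputation are all assumptions, since this README does not list the dataset's schema or the imputation strategy used.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; the real column names are not given in this README.
df = pd.DataFrame({
    "credit_limit": [20000.0, 120000.0, np.nan, 50000.0],
    "age": [24, 26, 34, np.nan],
    "education": ["university", "graduate", "graduate", "university"],
})

numeric_cols = ["credit_limit", "age"]
categorical_cols = ["education"]

# Numeric features: fill gaps with the median, then standardize.
numeric_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical features: fill gaps with the most frequent value, then one-hot encode.
categorical_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

# sparse_threshold=0 forces a dense NumPy array as output.
preprocess = ColumnTransformer(
    [
        ("num", numeric_pipe, numeric_cols),
        ("cat", categorical_pipe, categorical_cols),
    ],
    sparse_threshold=0,
)

X = preprocess.fit_transform(df)
print(X.shape)
```

Wrapping the steps in a single transformer means the imputation and scaling statistics are learned from the training data only when the transformer is fitted inside a train/test workflow.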

Balancing Dataset: Applied SMOTE (Synthetic Minority Over-sampling Technique) to oversample the minority class and reduce the effect of class imbalance on the models.
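
To show what SMOTE does, here is a small hand-written sketch of its core idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbors. The function `smote_sketch` and the toy data are hypothetical and written for illustration only. In practice the `SMOTE` class from the imbalanced-learn package (`fit_resample`) is the usual choice, and this README does not say which implementation the project used.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by neighbor interpolation."""
    rng = np.random.default_rng(seed)
    k = min(k, len(X_min) - 1)
    # Ask for k + 1 neighbors because each point's nearest neighbor is itself.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)
    # Pick a random base sample and one of its k true neighbors (columns 1..k).
    base = rng.integers(0, len(X_min), size=n_new)
    neighbor = idx[base, rng.integers(1, k + 1, size=n_new)]
    # Interpolate a random fraction of the way from base to neighbor.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[neighbor] - X_min[base])


# Toy imbalanced data: 90 majority samples, 10 minority samples.
rng = np.random.default_rng(42)
X_major = rng.normal(0.0, 1.0, size=(90, 2))
X_minor = rng.normal(2.0, 1.0, size=(10, 2))

X_new = smote_sketch(X_minor, n_new=len(X_major) - len(X_minor))
X_bal = np.vstack([X_major, X_minor, X_new])
y_bal = np.concatenate(
    [np.zeros(len(X_major)), np.ones(len(X_minor) + len(X_new))]
).astype(int)

print(np.bincount(y_bal))
```

Oversampling should be applied to the training split only; resampling before the train/test split leaks synthetic copies of test-set neighbors into training and inflates the evaluation scores.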

Modeling: Built and tuned five classification models, evaluated with ROC AUC, precision, recall, and F1-score.
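
A comparison like the one described above might look like the following sketch, which scores each model with stratified cross-validation on the four listed metrics. The synthetic data from `make_classification`, the hyperparameter values, and the three-fold setup are assumptions for illustration; the project's actual tuning procedure and settings are not described in this README. XGBoost is added only if the `xgboost` package is installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the credit data, with roughly 20% positives.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.8, 0.2], random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

# XGBoost is an optional dependency in this sketch.
try:
    from xgboost import XGBClassifier

    models["XGBoost"] = XGBClassifier(n_estimators=50, random_state=0)
except ImportError:
    pass

cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
scoring = ["roc_auc", "precision", "recall", "f1"]

results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    # cross_validate reports each metric under the key "test_<metric>".
    results[name] = {m: scores[f"test_{m}"].mean() for m in scoring}

# Print models from highest to lowest mean ROC AUC.
for name, r in sorted(results.items(), key=lambda kv: kv[1]["roc_auc"], reverse=True):
    print(f"{name:20s} " + "  ".join(f"{m}={v:.3f}" for m, v in r.items()))
```

Stratified folds keep the default rate similar in every fold, which matters when the positive class is rare.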

Evaluation: Selected Random Forest based on ROC AUC; analyzed confusion matrices and feature importance.
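
The evaluation step could be sketched as below: fit a Random Forest, compute ROC AUC from predicted probabilities on a held-out split, then inspect the confusion matrix and feature importances. The synthetic data, the `feature_i` names, and the hyperparameters are placeholders assumed for illustration, so the printed numbers say nothing about the project's actual results.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the credit data, with roughly 20% positives.
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=4, weights=[0.8, 0.2], random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)

# ROC AUC is computed from the predicted probability of the positive class.
proba = rf.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, rf.predict(X_test))

# Impurity-based importances, normalized to sum to 1.
importances = pd.Series(rf.feature_importances_, index=feature_names)
importances = importances.sort_values(ascending=False)

print(f"ROC AUC: {auc:.3f}")
print(cm)
print(importances.head())
```

ROC AUC summarizes ranking quality across all thresholds, while the confusion matrix shows the trade-off between missed defaults and false alarms at one specific threshold, so the two are worth reading together.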

Highlights

Implemented SMOTE to overcome class imbalance in credit default data
Compared multiple classification algorithms to find the best predictive performance
Produced default-risk predictions intended to support credit risk management and customer collection strategies

🔗 View Code on GitHub