Loan Data Analysis Project
Welcome to the Loan Data Analysis project! This repository contains an exploratory data analysis (EDA) notebook focused on identifying patterns and trends in loan applications.
Dataset Overview
The dataset contains the following key features:
- Loan_ID
- Gender
- Married
- Dependents
- Education
- Self_Employed
- ApplicantIncome
- CoapplicantIncome
- LoanAmount
- Loan_Amount_Term
- Credit_History
- Property_Area
- Loan_Status
Key Objectives
- Perform exploratory data analysis (EDA)
- Understand relationships between features
- Identify key factors influencing loan approval
Tools & Libraries Used
- Python
- NumPy
- Pandas
- Matplotlib
- Seaborn
Data Cleaning
- Handled missing values
- Encoded categorical variables
- Detected outliers using boxplots and replaced them with mean values
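The first two cleaning steps could look roughly like the sketch below. The sample rows are made up, and the specific choices (mode for categorical gaps, median for numeric gaps, 0/1 mapping for binary categories) are assumptions for illustration, since the notebook's exact strategy is not described here.

```python
import numpy as np
import pandas as pd

# Tiny made-up sample that mimics a few of the dataset's columns.
df = pd.DataFrame({
    "Gender": ["Male", "Female", None, "Male"],
    "Married": ["Yes", "No", "Yes", None],
    "LoanAmount": [120.0, np.nan, 150.0, 90.0],
    "Loan_Status": ["Y", "N", "Y", "N"],
})

# Assumption: categorical gaps get the most frequent value (mode),
# numeric gaps get the column median.
for col in ["Gender", "Married"]:
    df[col] = df[col].fillna(df[col].mode()[0])
df["LoanAmount"] = df["LoanAmount"].fillna(df["LoanAmount"].median())

# Assumption: binary categories are encoded as 0/1 with explicit mappings.
df["Gender"] = df["Gender"].map({"Male": 1, "Female": 0})
df["Married"] = df["Married"].map({"Yes": 1, "No": 0})
df["Loan_Status"] = df["Loan_Status"].map({"Y": 1, "N": 0})

print(df)
```

Explicit mappings keep the encoding readable when the columns are inspected later. Columns with more than two categories, such as Property_Area, would need a different scheme, for example `pd.get_dummies`.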
Analysis Highlights
Univariate Analysis
- Used histograms, countplots, and pie charts to analyze distributions
- Found that most applicants are male, married, and graduates
Bivariate Analysis
- Explored relationships between:
  - Credit_History and Loan_Status
  - Education and LoanAmount
  - Property_Area and Loan_Status
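One way to quantify a pairing such as Credit_History and Loan_Status is a row-normalized cross-tabulation. This is a minimal sketch on made-up rows, not the notebook's actual code or results.

```python
import pandas as pd

# Made-up rows; only the column names come from the dataset.
df = pd.DataFrame({
    "Credit_History": [1, 1, 1, 0, 0, 1],
    "Loan_Status": ["Y", "Y", "N", "N", "N", "Y"],
})

# normalize="index" turns counts into per-row shares, so each row shows
# the approved/rejected split for one Credit_History value.
approval_by_credit = pd.crosstab(
    df["Credit_History"], df["Loan_Status"], normalize="index"
)
print(approval_by_credit)
```

The same call works for Property_Area and Loan_Status by swapping the first column.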
Income Insights
- Higher ApplicantIncome usually goes with larger loan amounts
- CoapplicantIncome has less influence on loan size
Outlier Detection
- Used box plots to identify outliers in ApplicantIncome and LoanAmount
- Replaced extreme outliers with mean values to limit their effect on the distributions
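The boxplot-based replacement can be sketched as follows. It assumes the usual 1.5 x IQR whisker rule that boxplots draw by default and uses the mean of the non-outlier values as the replacement. The helper name `replace_outliers_with_mean` and the sample incomes are hypothetical, and the notebook may use different thresholds or the overall mean.

```python
import pandas as pd

def replace_outliers_with_mean(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Replace values outside the k*IQR whiskers with the mean of the rest."""
    series = series.astype(float)
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    is_outlier = (series < lower) | (series > upper)
    # Assumption: the replacement mean is computed without the outliers.
    mean_of_inliers = series[~is_outlier].mean()
    return series.mask(is_outlier, mean_of_inliers)

# Made-up incomes with one extreme value.
income = pd.Series([2500, 3000, 3200, 4000, 4500, 81000], name="ApplicantIncome")
cleaned = replace_outliers_with_mean(income)
print(cleaned.tolist())
```

Computing the mean from inliers only avoids the extreme value inflating its own replacement.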
Visualizations
- Boxplots to show income distribution and outliers
- Correlation heatmap to show numeric relationships
- Scatter plots and pair plots to reveal hidden patterns
- Stacked bar charts to visualize class breakdowns by groups
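The numbers behind a correlation heatmap come from a pairwise correlation matrix. The sketch below computes one with pandas on made-up values, so the coefficients do not reflect the real dataset.

```python
import pandas as pd

# Made-up numeric columns named after the dataset's features.
df = pd.DataFrame({
    "ApplicantIncome": [2000, 3000, 4000, 5000, 6000],
    "CoapplicantIncome": [1500, 0, 2000, 0, 1000],
    "LoanAmount": [80, 110, 150, 170, 210],
})

# Pearson correlation between every pair of numeric columns.
corr = df.corr(method="pearson")
print(corr.round(2))
```

The resulting `corr` frame could then be rendered with `seaborn.heatmap(corr, annot=True)`.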
Key Insights
- Credit history appears to be the strongest factor in loan approval
- Graduates are slightly more likely to get loans approved
- Male applicants dominate the dataset
- Semiurban property areas have higher approval rates
- Loan amount and income are positively correlated
Future Work
- Train machine learning models for loan status prediction
- Feature engineering for improved performance
- Hyperparameter tuning and model evaluation
Suggestion
Based on this analysis, I suggest focusing on semiurban areas and graduate applicants, since both groups show higher loan approval rates.
How to Run
- Clone the repo:
  git clone https://github.com/AADARSH028/loan-data-analysis.git