AADARSH028/loan-dataset-eda

📊💸 Loan Data Analysis Project

Welcome to the Loan Data Analysis project! This repository contains an exploratory data analysis (EDA) notebook focused on identifying patterns and trends in loan applications.


📁 Dataset Overview

The dataset contains the following key features:

  • 🆔 Loan_ID
  • 👨‍💼 Gender
  • 💍 Married
  • 👶 Dependents
  • 🎓 Education
  • 💼 Self_Employed
  • 💰 ApplicantIncome
  • 🤝 CoapplicantIncome
  • 🏦 LoanAmount
  • ⏳ Loan_Amount_Term
  • 🧾 Credit_History
  • 🌍 Property_Area
  • ✅ Loan_Status
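
As a rough first step, the columns above can be loaded and checked for types and missing values. The sketch below uses a tiny made-up sample in the same column layout, not rows from the actual dataset; in the project you would point `pd.read_csv` at the real CSV file, whose name is not stated here.

```python
import io

import pandas as pd

# Made-up sample rows in the dataset's column layout (NOT real data).
SAMPLE_CSV = """Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
S001,Male,Yes,0,Graduate,No,5000,0,130,360,1,Urban,Y
S002,Female,No,1,Graduate,No,3000,1500,,360,1,Semiurban,Y
S003,Male,Yes,2,Not Graduate,Yes,2500,0,100,360,0,Rural,N
S004,,Yes,2,Graduate,No,6000,2000,200,,1,Semiurban,Y
"""

# With the real file this would be pd.read_csv("<path to the loan CSV>").
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df.shape)         # rows x columns
print(df.dtypes)        # which columns are numeric vs. categorical
print(df.isna().sum())  # missing values per column
```

The per-column missing counts are a natural starting point for the cleaning steps described later.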

🎯 Key Objectives

  • 🔍 Perform exploratory data analysis (EDA)
  • 📈 Understand relationships between features
  • 🧠 Identify key factors influencing loan approval

🧰 Tools & Libraries Used

  • 🐍 Python
  • 🧮 NumPy
  • 🐼 Pandas
  • 📊 Matplotlib
  • 🌈 Seaborn

🧼 Data Cleaning

  • ✅ Handled missing values
  • 🔄 Encoded categorical variables
  • 🧹 Identified outliers using boxplots and replaced them with mean values
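
The first two cleaning steps might look roughly like the sketch below. The README does not say which imputation or encoding strategy the notebook uses, so the choices here (mode for categorical columns, median for numeric ones, one-hot encoding) are assumptions, and the rows are made up for illustration.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    "Gender": ["Male", None, "Female", "Male"],
    "Education": ["Graduate", "Graduate", "Not Graduate", "Graduate"],
    "LoanAmount": [130.0, None, 100.0, 200.0],
    "Loan_Status": ["Y", "Y", "N", "Y"],
})

# Assumed imputation policy: mode for categorical, median for numeric columns.
df["Gender"] = df["Gender"].fillna(df["Gender"].mode()[0])
df["LoanAmount"] = df["LoanAmount"].fillna(df["LoanAmount"].median())

# Map the binary target to 0/1 and one-hot encode the other categoricals.
df["Loan_Status"] = df["Loan_Status"].map({"Y": 1, "N": 0})
encoded = pd.get_dummies(df, columns=["Gender", "Education"], drop_first=True)

print(encoded)
```

Median imputation is less sensitive to extreme incomes or loan amounts than the mean, which is why it is used in this sketch; the notebook may do it differently.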

📊 Analysis Highlights

🔹 Univariate Analysis

  • Used histograms, countplots, and pie charts to analyze distributions
  • Found most applicants are male, married, and graduates

🔹 Bivariate Analysis

  • Explored relationships between:
    • 💳 Credit_History & Loan_Status
    • 🧑‍🎓 Education & LoanAmount
    • 🏘️ Property_Area & Loan_Status
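
One way to quantify pairings like these is a row-normalized cross-tabulation for two categorical columns and a grouped mean for a categorical and numeric pair. This is a minimal sketch on made-up rows, not the notebook's actual code or results.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    "Credit_History": [1, 1, 1, 0, 0, 1],
    "Education": ["Graduate", "Graduate", "Not Graduate",
                  "Not Graduate", "Graduate", "Graduate"],
    "LoanAmount": [150.0, 130.0, 90.0, 110.0, 170.0, 150.0],
    "Loan_Status": ["Y", "Y", "N", "N", "N", "Y"],
})

# Share of approved/rejected loans within each Credit_History group.
approval_by_credit = pd.crosstab(
    df["Credit_History"], df["Loan_Status"], normalize="index"
)

# Average loan amount per education level.
mean_loan_by_education = df.groupby("Education")["LoanAmount"].mean()

print(approval_by_credit)
print(mean_loan_by_education)
```

Normalizing by row (`normalize="index"`) makes groups of different sizes comparable, which matters when one category dominates the dataset.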

🔹 Income Insights

  • 💼 Higher ApplicantIncome tends to go with larger loan amounts
  • 👥 CoapplicantIncome is less strongly related to loan size
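
A comparison like this can be checked with Pearson correlations against LoanAmount. The numbers below are made up purely to show the computation; they are not the project's data and do not reproduce its results.

```python
import pandas as pd

# Made-up values for illustration only.
df = pd.DataFrame({
    "ApplicantIncome": [2000, 3000, 4000, 5000, 6000],
    "CoapplicantIncome": [1500, 0, 2000, 0, 1000],
    "LoanAmount": [80, 110, 120, 160, 180],
})

# Pearson correlation of each income column with the loan amount.
corr_with_loan = df.corr()["LoanAmount"].drop("LoanAmount")

print(corr_with_loan.round(2))
```

Pearson correlation only captures linear association and is sensitive to outliers, so it is worth reading alongside a scatter plot.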

🔹 Outlier Detection

  • Used box plots to identify outliers in ApplicantIncome and LoanAmount
  • Replaced extreme outliers with mean values to reduce their influence on the analysis
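
The box plot rule can also be applied numerically. The sketch below flags values beyond 1.5 × IQR from the quartiles (the usual box plot whisker convention) and replaces them with the mean. Whether the notebook uses this exact threshold, or computes the mean with or without the outliers, is not stated; the helper name `replace_outliers_with_mean` and the choice to average only the non-outlier values are assumptions.

```python
import pandas as pd


def replace_outliers_with_mean(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with the mean.

    Hypothetical helper: the mean is taken over the non-outlier values only.
    """
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    is_outlier = (series < lower) | (series > upper)
    fill_value = series[~is_outlier].mean()
    return series.mask(is_outlier, fill_value)


# Made-up incomes with one extreme value.
income = pd.Series(
    [2500.0, 3000.0, 3200.0, 3500.0, 4000.0, 81000.0], name="ApplicantIncome"
)
cleaned = replace_outliers_with_mean(income)

print(cleaned)
```

Replacing rather than dropping keeps the row count unchanged, at the cost of slightly shrinking the column's variance.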

📈 Visualizations

  • 📦 Boxplots to show income distribution and outliers
  • 🧊 Correlation Heatmap to show numeric relationships
  • 📉 Scatter plots & Pair plots to reveal hidden patterns
  • 🟩 Stacked bar charts to visualize class breakdowns by groups
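
As one example of these chart types, a stacked bar chart of Loan_Status by Property_Area can be built from a cross-tabulation. This is a sketch on made-up rows using pandas' Matplotlib-backed plotting; the notebook's own plotting code and styling may differ.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    "Property_Area": ["Urban", "Semiurban", "Rural",
                      "Semiurban", "Urban", "Rural"],
    "Loan_Status": ["Y", "Y", "N", "Y", "N", "Y"],
})

# Count applications per area and status, then stack the statuses per bar.
counts = pd.crosstab(df["Property_Area"], df["Loan_Status"])
ax = counts.plot(kind="bar", stacked=True)
ax.set_xlabel("Property_Area")
ax.set_ylabel("Number of applications")
ax.set_title("Loan_Status by Property_Area (synthetic data)")
ax.get_figure().tight_layout()
```

In a notebook the chart renders inline; in a script, call `ax.get_figure().savefig(...)` with a path of your choice.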

🧠 Key Insights

  1. ✅ Credit History is the most important factor for loan approval
  2. 🎓 Graduates are slightly more likely to get loans approved
  3. 🧍‍♂️ Male applicants dominate the dataset
  4. 🏠 Semiurban property areas have higher approval rates
  5. 📊 Loan amount and income are positively correlated

🔮 Future Work

  • 🤖 Train machine learning models for loan status prediction
  • 🧱 Feature engineering for improved performance
  • 🧪 Hyperparameter tuning and model evaluation

Suggestion: Based on this analysis, focusing on applicants from semiurban areas and on graduates should lead to better loan approval rates.

🏁 How to Run

  1. Clone the repo:
    git clone https://github.com/AADARSH028/loan-dataset-eda.git
