This project analyzes customer shopping behavior using transactional data (~3,900 records) across multiple product categories. The goal is to uncover patterns in customer spending, preferences, and engagement to support better business decisions.
- Total Records: 3,900
- Features: 18
-
Customer Info: Age, Gender, Location, Subscription Status
-
Purchase Details: Item, Category, Purchase Amount, Season, Size, Color
-
Behavior Metrics: Discount Usage, Promo Codes, Purchase Frequency, Previous Purchases, Ratings, Shipping Type
-
Missing values found in the Review Rating column
- Imported the dataset using pandas
- Performed initial exploration using
df.info()anddf.describe() - Handled missing values using median imputation
- Standardized column names into snake_case
- Created new features:
age_grouppurchase_frequency_days
- Removed redundant columns
- Loaded cleaned data into PostgreSQL
Key business questions answered:
- Revenue by Gender
- High-Spending Discount Users
- Top 5 Products by Rating
- Shipping Type Comparison
- Subscribers vs Non-Subscribers
- Discount-Dependent Products
- Customer Segmentation (New, Returning, Loyal)
- Top Products per Category
- Repeat Buyers vs Subscription Behavior
- Revenue by Age Group
An interactive dashboard was built to visualize major insights from the dataset.
- Total customers and average purchase amount
- Revenue by category
- Subscription status distribution
- Sales by age group
- Revenue by age group
Add your Power BI dashboard screenshot here
- Loyal customers contributed the largest share of total purchases
- Customers using discounts still showed strong spending behavior
- A few product categories generated most of the revenue
- Subscription status had a visible impact on customer behavior
- Promote subscription plans with exclusive offers
- Build loyalty programs for repeat buyers
- Review discount strategies to balance profit and sales
- Highlight top-selling and top-rated products in campaigns
- Use targeted marketing for high-value customer groups
- Python (Pandas, NumPy)
- PostgreSQL
- Power BI
├── data/
├── notebooks/
├── sql/
├── dashboard/
└── README.md