Utilizing K-means clustering and PCA to group customers based on demographic and financial attributes, optimizing marketing strategies and business decisions.

teche74/Customer_Segmentation_USA

Customer Segmentation of USA: Understanding K-Means and PCA

Description

Customer Segmentation of USA is a data analysis project that focuses on utilizing K-means clustering and PCA (Principal Component Analysis) to group customers based on demographic and financial attributes. The main objective is to gain insights into different customer segments, allowing businesses to tailor marketing strategies, improve customer satisfaction, and make data-driven business decisions.

Table of Contents

  • Data Source
  • Methods Used
  • Project Structure
  • Installation
  • Usage
  • Results
  • Hyperparameter Tuning
  • Contact

Data Source

The dataset used in this project contains customer information, such as age, education level, income, and other relevant attributes. The data is collected from various sources and is preprocessed before applying clustering and PCA algorithms.

Methods Used

  • Data Preprocessing: Handling missing values and converting categorical variables to numerical representations.

  • K-means Clustering: Grouping customers into distinct clusters based on their attributes.

  • PCA (Principal Component Analysis): Reducing the dimensionality of the data to visualize clusters in a lower-dimensional space.

  • Evaluation Metrics: Using silhouette score and other metrics to assess cluster quality and determine the optimal number of clusters.
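The steps listed above can be sketched end to end with scikit-learn. This is a minimal illustration, not the notebook's actual code: the column names (`age`, `income`, `education`), the synthetic data, and the choice of three clusters are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy stand-in for the customer table; column names and values are hypothetical.
rng = np.random.default_rng(0)
n = 200
customers = pd.DataFrame({
    "age": rng.integers(18, 80, size=n).astype(float),
    "income": rng.normal(60_000, 20_000, size=n),
    "education": rng.choice(["high_school", "bachelor", "master"], size=n),
})
customers.loc[::25, "income"] = np.nan  # inject a few missing values

numeric_cols = ["age", "income"]
categorical_cols = ["education"]

# Preprocessing: impute and scale numeric columns, one-hot encode categoricals.
# sparse_threshold=0.0 forces a dense array so K-means and PCA can consume it directly.
preprocess = ColumnTransformer(
    [
        ("num", Pipeline([
            ("impute", SimpleImputer(strategy="median")),
            ("scale", StandardScaler()),
        ]), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0.0,
)
X = preprocess.fit_transform(customers)

# K-means: assign each customer to one of k clusters (k=3 is an arbitrary choice here).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# PCA: project the feature matrix to two components for plotting.
pca = PCA(n_components=2)
coords = pca.fit_transform(X)

customers["cluster"] = labels
customers["pc1"] = coords[:, 0]
customers["pc2"] = coords[:, 1]
print(customers[["cluster", "pc1", "pc2"]].head())
```

Scaling the numeric columns before clustering matters because K-means relies on Euclidean distance, so an unscaled income column would otherwise dominate age. The `pc1` and `pc2` columns can be passed to any scatter plot, colored by `cluster`.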

Project Structure

The repository contains the following:

  • us_data: Folder containing the dataset used for the analysis.

  • Segmentation.ipynb: Jupyter notebook for data preprocessing, clustering, and PCA.

Installation

  1. Clone the repository: git clone https://github.com/teche74/Customer_Segmentation_USA.git

  2. Execute the notebooks or scripts to perform data preprocessing, clustering, and visualization.

Usage

  1. Access the dataset in the us_data folder.

  2. Follow the installation instructions to set up the environment and dependencies.

  3. Run the notebooks or scripts to perform data preprocessing, clustering, and PCA.

Hyperparameter Tuning

We use the silhouette score and other evaluation metrics to find the optimal number of clusters for the K-means algorithm. The process involves iteratively evaluating clustering performance for different cluster numbers.
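As a rough sketch of that search, the loop below fits K-means for a range of cluster counts and keeps the count with the highest silhouette score. The data comes from scikit-learn's `make_blobs` as a placeholder for the preprocessed customer features, and the range of 2 to 8 clusters is an assumption, not a setting taken from the notebook.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Placeholder data; in the project this would be the preprocessed customer matrix.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=42)

# Evaluate each candidate cluster count (silhouette needs at least 2 clusters).
scores = {}
for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# Pick the cluster count with the highest average silhouette.
best_k = max(scores, key=scores.get)

for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print("best k:", best_k)
```

The silhouette score ranges from -1 to 1, with higher values indicating tighter, better-separated clusters. It is worth reading the whole score table rather than only `best_k`, since several values of k can score similarly on real customer data.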

Contact

For any questions or inquiries, please contact [email protected].
