In this project, I explore the sales performance of Apple products across various retail stores worldwide. The dataset includes detailed information on products, sales transactions, store locations, and warranty claims. Through a series of complex SQL queries, I extract meaningful insights that help understand sales trends, customer behavior, and product performance.
The primary focus of this project is to refine and showcase my SQL skills, solving real-world business problems through data-driven decision-making.
The project uses five main tables:
-
stores: Contains information about Apple retail stores.
store_id
: Unique identifier for each store.store_name
: Name of the store.city
: City where the store is located.country
: Country of the store.
-
category: Holds product category information.
category_id
: Unique identifier for each product category.category_name
: Name of the category.
-
products: Details about Apple products.
product_id
: Unique identifier for each product.product_name
: Name of the product.category_id
: References the category table.launch_date
: Date when the product was launched.price
: Price of the product.
-
sales: Stores sales transactions.
sale_id
: Unique identifier for each sale.sale_date
: Date of the sale.store_id
: References the store table.product_id
: References the product table.quantity
: Number of units sold.
-
warranty: Contains information about warranty claims.
claim_id
: Unique identifier for each warranty claim.claim_date
: Date the claim was made.sale_id
: References the sales table.repair_status
: Status of the warranty claim (e.g., Paid Repaired, Warranty Void).
The project is split into three tiers of questions to test SQL skills of increasing complexity:
- Find the number of stores in each country.
- Calculate the total number of units sold by each store.
- Identify how many sales occurred in December 2023.
- Determine how many stores have never had a warranty claim filed.
- Calculate the percentage of warranty claims marked as "Warranty Void".
- Identify which store had the highest total units sold in the last year.
- Count the number of unique products sold in the last year.
- Find the average price of products in each category.
- How many warranty claims were filed in 2020?
- For each store, identify the best-selling day based on highest quantity sold.
- Identify the least selling product in each country for each year based on total units sold.
- Calculate how many warranty claims were filed within 180 days of a product sale.
- Determine how many warranty claims were filed for products launched in the last two years.
- List the months in the last three years where sales exceeded 5,000 units in the USA.
- Identify the product category with the most warranty claims filed in the last two years.
- Determine the percentage chance of receiving warranty claims after each purchase for each country.
- Analyze the year-by-year growth ratio for each store.
- Calculate the correlation between product price and warranty claims for products sold in the last five years, segmented by price range.
- Identify the store with the highest percentage of "Paid Repaired" claims relative to total claims filed.
- Write a query to calculate the monthly running total of sales for each store over the past four years and compare trends during this period. 21- Analyze product sales trends over time, segmented into key periods: from launch to 6 months, 6-12 months, 12-18 months, and beyond 18 months.
This project primarily focuses on developing and showcasing my following SQL skills:
- Complex Joins and Aggregations: Demonstrating the ability to perform complex SQL joins and aggregate data meaningfully.
- Window Functions: Using advanced window functions for running totals, growth analysis, and time-based queries.
- Data Segmentation: Analyzing data across different time frames to gain insights into product performance.
- Correlation Analysis: Applying SQL functions to determine relationships between variables, such as product price and warranty claims.
- Real-World Problem Solving: Answering business-related questions that reflect real-world scenarios faced by data analysts.
- Size: 1 million+ rows of sales data.
- Period Covered: The data spans multiple years, allowing for long-term trend analysis.
- Geographical Coverage: Sales data from Apple stores across various countries.
Completing this project has significantly enhanced my SQL querying capabilities, especially when dealing with large datasets. It has also given me hands-on experience in solving complex business analysis problems, making this project a strong addition to my portfolio. I am excited to share my work and insights with the data analysis community.