This project loads, cleans, analyzes, and visualizes a Netflix dataset using Python. The goal is to explore the dataset, gain insights, and prepare the data for potential machine learning tasks.
- Python: for data cleaning and analysis
- Pandas: for handling and cleaning data
- Matplotlib & Seaborn: for data visualization
- Excel: for manual data inspection and initial exploration
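As a rough illustration of the Pandas cleaning step, the sketch below reads a CSV, drops exact duplicate rows, and fills missing text fields. The column names (`title`, `director`, `country`, `release_year`), the helper name `load_and_clean`, and the inline sample rows are assumptions made for this example; they are not taken from the project's scripts or the real dataset.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file; column names and rows
# are assumptions used only to make this sketch self-contained.
SAMPLE_CSV = """title,director,country,release_year
Show A,Jane Doe,United States,2019
Show A,Jane Doe,United States,2019
Show B,,India,2020
Show C,John Roe,,2021
"""


def load_and_clean(source):
    """Read a CSV, drop exact duplicate rows, and fill missing text fields."""
    df = pd.read_csv(source)
    # Remove rows that are identical in every column
    df = df.drop_duplicates().reset_index(drop=True)
    # Replace missing values in the (assumed) text columns with a placeholder
    df = df.fillna({"director": "Unknown", "country": "Unknown"})
    return df


cleaned = load_and_clean(io.StringIO(SAMPLE_CSV))
print(cleaned)
```

To use this on a real file, pass its path instead of the `io.StringIO` object, for example a CSV stored under `data/`.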
- data/: Contains the Netflix dataset
- scripts/: Python scripts for data cleaning, analysis, and visualization
- notebooks/: Google Colab notebooks for interactive analysis
- results/: Graphs and visual outputs from the analysis
- Genres and Trends: Analysis of the most popular genres and content release trends
- Ratings and Audience: Insights into the ratings distribution across different genres and release years
- Regional Contribution: Breakdown of content by region
- Top Contributors: Frequent directors, actors, and content creators on Netflix
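The analyses listed above could be sketched with a few Pandas aggregations. This is a minimal example under stated assumptions: the column names (`listed_in` for comma-separated genres, `country`, `release_year`) and the toy rows are hypothetical and may not match the actual dataset or the project's scripts.

```python
import pandas as pd

# Toy rows for illustration only; column names are assumptions.
df = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "listed_in": ["Dramas, Comedies", "Dramas", "Documentaries", "Comedies, Dramas"],
    "country": ["United States", "India", "United States", "Unknown"],
    "release_year": [2019, 2020, 2020, 2021],
})

# Genres: split the comma-separated labels into one row per genre, then count
genre_counts = df["listed_in"].str.split(", ").explode().value_counts()

# Release trend: number of titles per release year
titles_per_year = df.groupby("release_year").size()

# Regional contribution: number of titles per country
titles_per_country = df["country"].value_counts()

print(genre_counts)
print(titles_per_year)
print(titles_per_country)
```

Each result is a Pandas Series, so it can be passed to Matplotlib or Seaborn for plotting, for instance with `genre_counts.plot(kind="bar")` when Matplotlib is installed.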
To run this project, clone the repository and install the necessary dependencies:
git clone https://github.com/yourusername/netflix-data-analysis.git
cd netflix-data-analysis
pip install -r requirements.txt