This repository contains a collection of tools, scripts and projects that focus on analysis and visualisation of football data.
Table of Contents
- About the project
- Prerequisites
- Folder Structure
- Projects
  - Scraping salary data from Salary.com
  - Scraping car data and crawling to specific URLs
  - Scraping transfer data
  - Scraping different types of football data from Understat.com
  - Scraping movie data from Cineb.com
  - Scraping real-estate data and crawling to apartment pages
  - Scraping Amazon data by keyword search
This repository has a collection of web scraping projects. I scraped a range of websites in order to deal with different page structures and collect different kinds of data (cars, salaries, sports...). Some of these projects also feature crawling techniques and exploratory data visualization. Note that websites change over time, so the approach I use to scrape a specific website today may not work in the future.
I recommend starting with the notebook that scrapes movie data from Cineb.com since it provides an understanding of how the scraping is done.
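Scraping of this kind usually comes down to downloading a page, parsing its HTML, and pulling out the fields of interest. The sketch below illustrates only the parsing step on an inline HTML string. It is not the notebook's code: the markup and the `film-name` class are hypothetical, and it uses the standard library's `html.parser` in place of bs4 so that it runs without any extra installs.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration; real pages differ and change over time.
SAMPLE_HTML = """
<div class="film">
  <h2 class="film-name"><a href="/movie/example-a">Example Movie A</a></h2>
</div>
<div class="film">
  <h2 class="film-name"><a href="/movie/example-b">Example Movie B</a></h2>
</div>
"""


class FilmParser(HTMLParser):
    """Collect the text and link of every <h2 class="film-name"> element."""

    def __init__(self):
        super().__init__()
        self.films = []
        self._inside_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h2" and attrs.get("class") == "film-name":
            # A new title starts: open a fresh record for it.
            self._inside_title = True
            self.films.append({"title": "", "url": None})
        elif tag == "a" and self._inside_title:
            self.films[-1]["url"] = attrs.get("href")

    def handle_data(self, data):
        # Only text found inside a title heading is kept.
        if self._inside_title:
            self.films[-1]["title"] += data.strip()

    def handle_endtag(self, tag):
        if tag == "h2":
            self._inside_title = False


parser = FilmParser()
parser.feed(SAMPLE_HTML)
films = parser.films
```

Because the selector (`h2` with class `film-name`) is tied to the page layout, it is the part most likely to need updating when a site changes.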
The following open source packages are used in this project:
- Pandas
- Matplotlib
- bs4
- requests
- csv
- json
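Of the packages above, csv and json ship with Python's standard library and cover the final step of a scraping run: writing the collected rows out. A minimal sketch of that step, assuming hypothetical movie records and field names (they are not taken from the notebooks):

```python
import csv
import io
import json

# Hypothetical records, standing in for rows a scraper would collect.
movies = [
    {"title": "Example Movie A", "year": 2021, "rating": 7.4},
    {"title": "Example Movie B", "year": 2019, "rating": 6.8},
]


def rows_to_csv(rows, fieldnames):
    """Serialise a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def rows_to_json(rows):
    """Serialise the same rows to a JSON array, keeping the value types."""
    return json.dumps(rows, indent=2, ensure_ascii=False)


csv_text = rows_to_csv(movies, ["title", "year", "rating"])
json_text = rows_to_json(movies)

# Reading the CSV back yields strings only; JSON keeps numbers as numbers.
restored = list(csv.DictReader(io.StringIO(csv_text)))
```

To write to a file instead of a string, pass a file opened with `open(path, "w", newline="")` to `csv.DictWriter`, as the csv module documentation recommends.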
|-- web-scraping-projects
|-- README.md
|-- data-directory
| |-- books_data.csv
| |-- cars.csv
| |-- movies.csv
| |-- real_estate.csv
| |-- salary_data.csv
| |-- transfers_data.csv
|-- notebooks
| |-- Amazon.ipynb
| |-- Carvago.ipynb
| |-- Cineb_movies.ipynb
| |-- Real estate.ipynb
| |-- Salaries.ipynb
| |-- Transfermarkt.ipynb
| |-- Understat.ipynb
| |-- .ipynb_checkpoints