Exploration and Analysis of the data of confirmed deaths in the subcontinet countries using data provided by Our World in Data using T-SQL and MS SQL Server.
Explore the docs »
View Demo
·
Report Bug
·
View Queries
The project is mainly about showcasing the data exploration, manipulation and analytical capabilities using SQL. Following SQL skills are used for writing queries for the exploration and analysis of provided COVID19 dataset specifically for Subcontinent countries.
SQL Skills
Joins
| CTE's
| Temp Tables
| Windows Functions
| Aggregate Functions
| Creating Views
| STORED PROCEDURES
| TYPE CASTING
The data used in this project is publically available at this link and can be downloaded in csv format.
The list of countries included in the analysis is given below:
- India
- Bhutan
- Maldives
- Sri Lanka
- Nepal
- Bangladesh
- Pakistan
To get the database on your system and test and run the queries, you will need to have SQL Server and SSMS installed on your system. Then you can follow these steps:
- Import the Excel files in the Excel Data folder into the SSMS using the process mentioned here
- Open the
Subcontinent Covid19 Analysis Queries.sql
file in SSMS and run the queries in the given sequence.
- Microsoft SQL Server 2019 (RTM) - 15.0.2000.5 (X64)
- SQL Server Management Studio v18.12.1
Max Death Percentage for each Subcontinent Country
Result
Query
SELECT location as 'Location',MAX(total_cases) as 'TotalCases', MAX(total_deaths) as 'TotalDeaths', MAX(ROUND(((total_deaths/total_cases) * 100),2)) as 'MaxDeathPercentage'
FROM CovidDeaths
WHERE location IN ('India', 'Pakistan', 'Bhutan','Maldives','Sri Lanka','Nepal','Bangladesh')
GROUP BY location
ORDER BY 4 DESC;
Max Cases/Infection Ratio for each Subcontinent Country
Result
Query
SELECT location as 'Location', MAX(total_cases) as 'TotalCases', Max(population) as 'TotalPopulation' ,MAX(ROUND(((total_cases/population) * 100),2)) as 'MaxPercentPopulationInfected'
FROM CovidDeaths
WHERE location IN ('India', 'Pakistan', 'Bhutan','Maldives','Sri Lanka','Nepal','Bangladesh')
GROUP BY location
ORDER BY 4 DESC;
Comparison Death Ratio Subcontinent vs Rest of Asia
Result
Query
SELECT SUM(TotalCases) as 'TotalCases', SUM(TotalDeaths) as 'TotalDeaths', ROUND(((SUM(TotalDeaths))/SUM(TotalCases))*100,2) as 'DeathPercentageSubContCountries'
FROM (
SELECT MAX(total_cases) AS 'TotalCases', MAX(total_deaths) AS 'TotalDeaths'
FROM CovidDeaths
WHERE location IN ('India', 'Pakistan', 'Bhutan','Maldives','Sri Lanka','Nepal','Bangladesh')
GROUP BY location) x;
SELECT SUM(TotalCases) as 'TotalCases', SUM(TotalDeaths) as 'TotalDeaths', ROUND(((SUM(TotalDeaths))/SUM(TotalCases))*100,2) as 'DeathPercentageAsia'
FROM (
SELECT MAX(total_cases) AS 'TotalCases', MAX(total_deaths) AS 'TotalDeaths'
FROM CovidDeaths
WHERE continent = 'Asia' AND location NOT IN ('India', 'Pakistan', 'Bhutan','Maldives','Sri Lanka','Nepal','Bangladesh')
GROUP BY location) x;
Subcontinent-Covid-Death-Exploration-Sql
├── Data Files
│ ├── CovidDeaths.xlsx
│ └── CovidVaccinations.xlsx
├── README.md
└── Subcontinent Covid19 Analysis Queries.sql
Thanks for your interest in my project. Have a great tip or optimizations that you want to add? Follow the steps:
- Fork the repository and create your branch from main.
- Issue that pull request!
- Always add a README and/or requirements.txt to your added code.
Issues with template? Found a bug? Have a great idea for an addition? Feel free to file an issue.
- Email - [email protected]
- Github - @yousaf530
- LinkedIn - Muhammad Yousaf Saddique
Enjoy!