Serverless web scraper for real-time covid-19 data in Nigeria

lotaibe/covid-tracker

NCDC Covid tracker

Serverless web scraper for real-time covid-19 data in Nigeria.

  • Data is sourced from the NCDC's official website
  • Web scraper built with Python using the BeautifulSoup and requests modules
  • Scraping is scheduled with AWS Lambda and AWS CloudWatch
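
The scraping approach described above can be sketched roughly as follows. This is a minimal illustration, not the contents of naijacovidscraper.py: the URL, the function names `fetch_html` and `parse_table`, and the assumption that the figures sit in the first HTML table with `<th>` headers are all hypothetical. It uses Python's built-in `html.parser` backend; `lxml` could be passed instead if it is installed.

```python
import requests
from bs4 import BeautifulSoup

# Assumption: placeholder URL for illustration; point this at the page you scrape.
NCDC_URL = "https://covid19.ncdc.gov.ng/"


def fetch_html(url=NCDC_URL, timeout=30):
    """Download the page and return its HTML as text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.text


def parse_table(html):
    """Turn the first HTML table into a list of dicts keyed by column header."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        return []
    headers = [th.get_text(strip=True) for th in table.find_all("th")]
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        # Skip the header row (no <td> cells) and any malformed rows.
        if cells and len(cells) == len(headers):
            rows.append(dict(zip(headers, cells)))
    return rows
```

Keeping the download and the parsing in separate functions means the parser can be tried on a saved HTML file without hitting the network.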

Requirements

Install the requirements with the pip package manager by running

pip3 install beautifulsoup4 requests lxml

The datetime module is part of Python's standard library and does not need to be installed separately.

Command line usage

To manually scrape the data from NCDC, run

python3 naijacovidscraper.py

AWS Setup

  1. Create an S3 bucket.
  2. Create an IAM role with the AWSLambdaExecute policy so the function can access the S3 bucket.
  3. Create a new AWS Lambda layer and upload the zipped Python script (with dependencies).
  4. Create a Lambda function (see lambda_function.py) and add the layer from Step 3.
  5. Create a new event rule using AWS CloudWatch.
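
As a rough picture of how the Lambda function and the S3 bucket from the steps above fit together, here is a minimal handler sketch. It is not the repository's lambda_function.py: the bucket name, the key layout, and the `scrape`, `build_key`, and `save_snapshot` helpers are hypothetical names chosen for illustration. `put_object` is the standard boto3 S3 client call.

```python
import json
from datetime import datetime, timezone

# Assumption: hypothetical bucket name; use the bucket created in Step 1.
BUCKET_NAME = "my-covid-tracker-bucket"


def build_key(now):
    """Build a timestamped object key so each run writes a new file."""
    return now.strftime("ncdc/%Y-%m-%dT%H-%M-%SZ.json")


def save_snapshot(records, s3_client, bucket=BUCKET_NAME, now=None):
    """Serialize the scraped records to JSON and upload them to S3."""
    now = now or datetime.now(timezone.utc)
    key = build_key(now)
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(records).encode("utf-8"),
        ContentType="application/json",
    )
    return key


def scrape():
    """Placeholder for the real scraper; returns an empty list here."""
    return []


def lambda_handler(event, context):
    # boto3 is provided by the AWS Lambda Python runtime, so it is
    # imported here rather than bundled in the layer.
    import boto3

    key = save_snapshot(scrape(), boto3.client("s3"))
    return {"statusCode": 200, "body": json.dumps({"key": key})}
```

Passing the S3 client in as an argument keeps `save_snapshot` easy to exercise locally with a stand-in object before deploying.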

Cron expression for a 12-hourly schedule: 0 */12 * * ? *
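
AWS cron expressions have six fields (minutes, hours, day of month, month, day of week, year) and are evaluated in UTC, so this expression should fire at 00:00 and 12:00 UTC every day. In a CloudWatch rule it is entered wrapped as cron(0 */12 * * ? *). The sketch below only illustrates the resulting timetable; `next_runs` is a hypothetical helper and not part of any AWS library.

```python
from datetime import datetime, timedelta, timezone


def next_runs(after, count=3):
    """Return the next `count` run times (00:00 and 12:00 UTC) strictly after `after`."""
    # Round down to the start of the current 12-hour slot, then step forward.
    slot = after.replace(
        hour=(after.hour // 12) * 12, minute=0, second=0, microsecond=0
    )
    runs = []
    while len(runs) < count:
        slot += timedelta(hours=12)
        runs.append(slot)
    return runs


if __name__ == "__main__":
    for run in next_runs(datetime.now(timezone.utc)):
        print(run.isoformat())
```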
See jsontodf.py to convert the JSON files to pandas DataFrames.
