This repository contains Python scripts for scraping research data from three leading Indian research institutions:
- Indian Institute of Science Education and Research (IISER) Mohali
- Indian Institute of Science Education and Research (IISER) Pune
- Indian Institute of Science Education and Research (IISER) Kolkata
We analysed the research areas and academic backgrounds of the faculty members and drew conclusions from the results (the data and plots for the analysis are presented on the webpage and in the corresponding folders).
The main script, main.py, serves the following purposes:
- Data Scraping: It scrapes data from the faculty web pages of the three institutions, extracting information related to research projects, publications, or any other relevant data.
- Data Storage: The scraped data is then stored in an SQLite3 database, which keeps it structured and easy to query.
- Data Analysis: The script also includes functionality to analyze the scraped data, so you can gain insights from or run further operations on the collected information.
The script is commented throughout so that readers can follow along easily.
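The storage and analysis steps described above can be illustrated with a minimal sketch. This is not the code from main.py: the table name `faculty`, its columns, the helper functions `store_records` and `count_by_area`, and the sample records are all hypothetical, chosen only to show how scraped rows might be written to SQLite3 and then summarised.

```python
import sqlite3

# Hypothetical records standing in for scraped faculty data
# (name, institute, research area). Not real data.
records = [
    ("Faculty A", "IISER Mohali", "Physics"),
    ("Faculty B", "IISER Mohali", "Chemistry"),
    ("Faculty C", "IISER Mohali", "Physics"),
    ("Faculty D", "IISER Pune", "Biology"),
]


def store_records(conn, rows):
    """Create the (assumed) faculty table if needed and insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS faculty ("
        "name TEXT, institute TEXT, research_area TEXT)"
    )
    # Parameterised inserts avoid building SQL strings by hand.
    conn.executemany("INSERT INTO faculty VALUES (?, ?, ?)", rows)
    conn.commit()


def count_by_area(conn, institute):
    """Return (research_area, count) pairs for one institute, sorted by area."""
    cur = conn.execute(
        "SELECT research_area, COUNT(*) FROM faculty "
        "WHERE institute = ? GROUP BY research_area ORDER BY research_area",
        (institute,),
    )
    return cur.fetchall()


# An in-memory database keeps the sketch self-contained;
# a file path could be passed to sqlite3.connect instead.
conn = sqlite3.connect(":memory:")
store_records(conn, records)
print(count_by_area(conn, "IISER Mohali"))
```

Grouping in SQL rather than in Python keeps the analysis step close to the stored data, which is one reason a database such as SQLite3 can be convenient for this kind of summary.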
We also built a webpage that explains the methods we used for scraping and analysis.
This folder includes the separate analysis content for these institutes, including the code, database file, and plots. The analysis for IISER Mohali is done by main.py itself.
To use the main.py script, follow these steps:
- Clone this repository to your local machine:
git clone https://github.com/kaushik-iiserm/institutional-scrapping/