WorldBank Datasets Refresh #1151

Open
wants to merge 10 commits into
base: master
from
58 changes: 58 additions & 0 deletions scripts/world_bank/datasets/README.md
@@ -0,0 +1,58 @@
# World Bank Datasets
- The World Bank datasets import covers multiple databases, such as World Development Indicators, Jobs, and Education Statistics.
- source: https://data.worldbank.org

- how to download data: The data is downloaded automatically by the Python script (`datasets.py`).

- type of place: Country.

- statvars: All types

- years: 1960 to 2050

- copyright year: 2024

## Processing WB datasets

Update September 2024:
To run all processing steps, do not pass the `--mode` flag.
Run: `python3 datasets.py`

To investigate an issue in an individual step, run the steps below one at a time.
The script supports the following tasks:


## Fetching datasets
- fetch_datasets: Fetches WB dataset lists and resources and writes them to 'output/wb-datasets.csv'

Run: `python3 datasets.py --mode=fetch_datasets`

## Downloading the datasets
- download_datasets: Downloads datasets listed in 'output/wb-datasets.csv' to the 'output/downloads' folder.

Run: `python3 datasets.py --mode=download_datasets`

## Writing WB codes
- write_wb_codes: Extracts World Bank indicator codes (and related information) from files downloaded in the 'output/downloads' folder to 'output/wb-codes.csv'.

It only operates on files that are named '*_CSV.zip'.

Run: `python3 datasets.py --mode=write_wb_codes`
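
To check in advance which downloaded files this step would pick up, the `*_CSV.zip` naming rule can be reproduced with a few lines of Python. This is a minimal sketch, not the logic used in `datasets.py`: the helper name `select_csv_zips` is hypothetical, and case-sensitive matching is an assumption.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def select_csv_zips(downloads_dir):
    """Return the sorted paths in downloads_dir whose names match '*_CSV.zip'.

    Hypothetical helper: it mirrors the naming rule stated in this README,
    and assumes the match is case-sensitive.
    """
    return sorted(
        path
        for path in Path(downloads_dir).iterdir()
        if path.is_file() and fnmatchcase(path.name, "*_CSV.zip")
    )
```

For example, `select_csv_zips("output/downloads")` lists the archives that match the pattern; any other file in the folder is left out.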

## Loading the stat vars
- load_stat_vars: Loads stat vars from a mapping file specified via the `stat_vars_file` flag.
- Use this for debugging to ensure that the mappings load correctly and fix any errors logged by this operation.

Run: `python3 datasets.py --mode=load_stat_vars --stat_vars_file=/path/to/statvars.csv`

- See `sample-svs.csv` for a sample mappings file.

## Writing output files
- write_observations: Extracts observations from files downloaded in the 'output/downloads' folder and saves them to CSVs in the 'output/observations' folder.

- The stat vars file to be used for mappings should be specified using the `stat_vars_file` flag.

- It only operates on files that are named '*_CSV.zip'.

Run: `python3 datasets.py --mode=write_observations --stat_vars_file=/path/to/statvars.csv`
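
When debugging, it can help to run the individual modes one after another and stop at the first failure. The sketch below only wraps the commands documented above; the helper names `build_commands` and `run_all` are hypothetical, and running the modes in the order this README lists them is an assumption.

```python
import subprocess

# Mode names as documented in this README, in the order they are listed.
MODES = [
    "fetch_datasets",
    "download_datasets",
    "write_wb_codes",
    "load_stat_vars",
    "write_observations",
]

# Modes that this README documents with the stat_vars_file flag.
NEEDS_STAT_VARS = {"load_stat_vars", "write_observations"}


def build_commands(stat_vars_file, script="datasets.py"):
    """Build one argv list per mode (hypothetical helper)."""
    commands = []
    for mode in MODES:
        cmd = ["python3", script, f"--mode={mode}"]
        if mode in NEEDS_STAT_VARS:
            cmd.append(f"--stat_vars_file={stat_vars_file}")
        commands.append(cmd)
    return commands


def run_all(stat_vars_file):
    """Run each mode in turn; check=True raises on the first failing step."""
    for cmd in build_commands(stat_vars_file):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_all("/path/to/statvars.csv")` from the script's directory would execute the five commands in sequence, so the failing step shows up as the last command printed.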
