BNIA/VitalSigns

Vital Signs

Scripts for creating our annual, publicly available, community-focused datasets for Baltimore City.

Hi! We are BNIA-JFI.

This package was made to create Vital Signs data.

Check our GitHub page for more information and tools.

About

  • Functions built and used by BNIA for the annual Vital Signs data release.
  • Made to be shared via IPYNB/Google Colab notebooks.
  • Some of the data is public and some is private.
  • PyPI libraries are created from the notebooks.

Included (but not limited to)

  • CloseCrawl - Pull MD Courts data.
  • TidyAddr - Expertly clean addresses in Baltimore (and beyond). Works seamlessly with CloseCrawl.
  • Download ACS - ACS tutorial. Provides a function and teaches you how to pull any data for any geography using this API (can aggregate tracts along a crosswalk).
  • Create ACS Statistics - Create pre-made statistics from ACS data. Builds off the ACS downloader.
  • VS Indicators - Create other (non ACS) Vital Signs statistics using these pre-made functions.
  • convertvssheetforwpupload - For internal developer use when publishing at BNIA.

VitalSigns uses functions found in our Dataplay Module and vice-versa.

Usage Instructions

Install the Package

The code is on PyPI, so you can install the scripts as a Python library using the command:

!pip install BNIAJFI-VitalSigns dataplay geopandas

Import Modules

  1. Import the installed module into your code:
from VitalSigns.acsDownload import retrieve_acs_data
  2. Use it:
retrieve_acs_data(state, county, tract, tableId, year)

Getting Help

You can get information on the package by using the help command.

Here we look at the package's modules:

import VitalSigns
help(VitalSigns)

Let's take a look at what functions the acsDownload module provides:

import VitalSigns.acsDownload
help(VitalSigns.acsDownload)

And here we can look at an individual function and what it expects:

import VitalSigns.acsDownload
help(VitalSigns.acsDownload.retrieve_acs_data)

Example #1

Follow this process for all VitalSigns scripts. The 'racdiv' script requires one more step, which is shown in Example #2.

ACS Download

Install the package.

!pip install BNIAJFI-VitalSigns dataplay geopandas

Import your modules.

from VitalSigns.acsDownload import retrieve_acs_data

Read in some data.

#Define our download parameters (tract, county, state, tableId, and year)
#Our download function will use Baltimore City's tract, county and state as internal parameters
#Changing these values using different geographic reference codes will change those parameters

tract = '*'
county = '510'
state = '24'

tableId = 'B01001'
year = '21'

And download the Baltimore City ACS data using the imported VitalSigns library.

df = retrieve_acs_data(state, county, tract, tableId, year)
df.head()
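
To make the parameters above more concrete, here is a minimal sketch of how they could map onto a Census Bureau API request. The helper `build_acs_url` is hypothetical and is not part of VitalSigns. The sketch assumes the ACS 5-year endpoint and that a two-digit year such as '21' means 2021; the internals of `retrieve_acs_data` may differ.

```python
from urllib.parse import quote, urlencode


def build_acs_url(state, county, tract, table_id, year):
    """Hypothetical helper: build a Census API URL for one ACS 5-year table."""
    full_year = f"20{year}"  # assumption: '21' stands for 2021
    base = f"https://api.census.gov/data/{full_year}/acs/acs5"
    query = urlencode(
        {
            "get": f"NAME,group({table_id})",       # every column of the table
            "for": f"tract:{tract}",                # '*' requests all tracts
            "in": f"state:{state} county:{county}",
        },
        safe="*:(),",      # keep the API's own punctuation readable
        quote_via=quote,   # encode the space as %20 rather than '+'
    )
    return f"{base}?{query}"


url = build_acs_url("24", "510", "*", "B01001", "21")
print(url)
```

Here state '24' is Maryland and county '510' is Baltimore City, so changing those codes points the same request at a different geography.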

Save the ACS data. (Use this method ONLY if you are working in Google Colab. Otherwise, you can save the data however you prefer.)

from google.colab import files
df.to_csv('YourFileName.csv') 
files.download('YourFileName.csv')
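
If you are not in Google Colab, plain pandas is enough. The sketch below uses a small made-up DataFrame in place of the real download, and it assumes the tract column holds strings, so it is read back as text to keep any leading zeros.

```python
from pathlib import Path

import pandas as pd

# Stand-in for the DataFrame returned by retrieve_acs_data (values are made up).
df = pd.DataFrame({"tract": ["010100", "010200"], "B01001_001E": [100, 200]})

out_path = Path("YourFileName.csv")
df.to_csv(out_path, index=False)  # index=False leaves out the row-number column

# Read the file back, keeping tract codes as text so leading zeros survive.
check = pd.read_csv(out_path, dtype={"tract": str})
print(check.shape)
```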

ACS Calculations and Indicators

Now that we have the ACS data, we can use any of the scripts in the VitalSigns library to create the Baltimore City indicators.

These scripts will download and clean ACS data for Baltimore and then construct indicators from the data.

A list of all the tables used and their respective indicator scripts can be found here.

First, import the script(s) you would like to use for the ACS data chosen.

#Script to create the Percent of Population Under 5 Years old indicator.
from VitalSigns.create import createAcsIndicator, age5 

Once the script has been imported, we can now create the Baltimore City indicators.

Note: There are two different crosswalk tables (mergeUrl below), depending on which CSA boundaries you want to use (2010 or 2020). If you use the wrong crosswalk, the output will be incorrect for the CSAs whose tract numbers changed in 2020.

For CSA2010 use - https://raw.githubusercontent.com/BNIA/VitalSigns/main/CSA2010.csv

For CSA2020 use - https://raw.githubusercontent.com/BNIA/VitalSigns/main/CSA2020.csv
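
Because the URL, the tract column, and the CSA column all have to match the same vintage, it can help to derive them from a single value. The function `crosswalk_settings` below is a hypothetical convenience and is not part of VitalSigns; the URLs and column names are the ones listed in this README.

```python
def crosswalk_settings(csa_year):
    """Hypothetical helper: return matching merge settings for 2010 or 2020 CSAs."""
    if csa_year not in (2010, 2020):
        raise ValueError("csa_year must be 2010 or 2020")
    suffix = str(csa_year)[-2:]  # '10' or '20'
    return {
        "mergeUrl": f"https://raw.githubusercontent.com/BNIA/VitalSigns/main/CSA{csa_year}.csv",
        "merge_right_col": f"TRACT{suffix}",
        "groupBy": f"CSA{csa_year}",
    }


settings = crosswalk_settings(2020)
print(settings["mergeUrl"])
```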

mergeUrl = 'https://raw.githubusercontent.com/BNIA/VitalSigns/main/CSA2020.csv'
merge_left_col = 'tract'
merge_right_col= 'TRACT20' #For the 2020 CSAs use 'TRACT20', for 2010 CSAs use 'TRACT10'
merge_how = 'outer'

groupBy = 'CSA2020'     #For the 2020 CSAs use 'CSA2020', for 2010 CSAs use 'CSA2010'

method = age5
aggMethod = 'sum'
columnsToInclude = []


MyIndicator = createAcsIndicator(state, county, tract, year, tableId,
                    mergeUrl, merge_left_col, merge_right_col, merge_how, groupBy,
                    aggMethod, method, columnsToInclude, finalFileName=False)

MyIndicator.head()
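
The merge and aggregation settings above read like a pandas merge followed by a group-by. The sketch below illustrates that idea with made-up tract counts and a made-up crosswalk; it is not the library's actual implementation, and the real age5 script may calculate its indicator differently. B01001_001E is total population, and B01001_003E and B01001_027E are males and females under 5.

```python
import pandas as pd

# Made-up tract-level counts standing in for the downloaded ACS table.
acs = pd.DataFrame({
    "tract": [10100, 10200, 10300],
    "B01001_001E": [1000, 2000, 1500],  # total population
    "B01001_003E": [50, 120, 60],       # males under 5
    "B01001_027E": [45, 100, 75],       # females under 5
})

# Made-up crosswalk rows that use the 2020 column names from this README.
crosswalk = pd.DataFrame({
    "TRACT20": [10100, 10200, 10300],
    "CSA2020": ["CSA A", "CSA A", "CSA B"],
})

# Attach a CSA name to each tract, then sum the counts within each CSA.
merged = acs.merge(crosswalk, left_on="tract", right_on="TRACT20", how="outer")
count_cols = ["B01001_001E", "B01001_003E", "B01001_027E"]
by_csa = merged.groupby("CSA2020")[count_cols].sum()

# Compute the percentage from the summed counts, not from tract percentages.
under5 = by_csa["B01001_003E"] + by_csa["B01001_027E"]
by_csa["pct_under5"] = 100 * under5 / by_csa["B01001_001E"]
print(by_csa)
```

Summing counts first and dividing afterwards weights each tract by its population, which averaging tract-level percentages would not do.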

Now we can save the Baltimore City indicators. (Use this method ONLY if you are working in Google Colab. Otherwise, you can save the data however you prefer.)

from google.colab import files
MyIndicator.to_csv('YourIndicatorFileName.csv') 
files.download('YourIndicatorFileName.csv')

Example #2 (racdiv indicator)

The Racial Diversity Index (racdiv) indicator is the only script in our library that relies on two ACS tables. Because of this, it is also the only script that asks the user for input while it is running: the user needs to re-enter the year.

Let's follow the same process we used in Example #1.

ACS Download

Install the package.

!pip install BNIAJFI-VitalSigns dataplay geopandas

Import your modules.

from VitalSigns.acsDownload import retrieve_acs_data

Read in some data.

tract = '*'
county = '510'
state = '24'

tableId = 'B02001'
year = '21' #This is the number that the user NEEDS to re-enter once the script asks for an input

And download the Baltimore City ACS data using the imported VitalSigns library.

df = retrieve_acs_data(state, county, tract, tableId, year)
df.head()

Save the ACS data. (Use this method ONLY if you are working in Google Colab. Otherwise, you can save the data however you prefer.)

from google.colab import files
df.to_csv('YourFileName.csv') 
files.download('YourFileName.csv')

ACS Calculations and Indicators

To see the table IDs and their respective indicators again, click here.

Import the racdiv script.

#Script to create the Racial Diversity Index indicator.
from VitalSigns.create import createAcsIndicator, racdiv 

Once the script has been imported, we can now create the Baltimore City indicators.

mergeUrl = 'https://raw.githubusercontent.com/BNIA/VitalSigns/main/CSA2020.csv'
merge_left_col = 'tract'
merge_right_col= 'TRACT20' #For the 2020 CSAs use 'TRACT20', for 2010 CSAs use 'TRACT10'
merge_how = 'outer'

groupBy = 'CSA2020'     #For the 2020 CSAs use 'CSA2020', for 2010 CSAs use 'CSA2010'

method = racdiv
aggMethod = 'sum'
columnsToInclude = []


MyIndicator = createAcsIndicator(state, county, tract, year, tableId,
                    mergeUrl, merge_left_col, merge_right_col, merge_how, groupBy,
                    aggMethod, method, columnsToInclude, finalFileName=False)

MyIndicator.head()

The cell below shows the output while the racdiv script is being run. As you can see on the last line, the script asks the user to re-enter their chosen year. After re-entering the year, the script will finish running, and the racdiv indicator table will be completed.

Table: B02001, Year: 21 imported.
Index(['TRACT20', 'GEOID20', 'CSA2020'], dtype='object')
Merge file imported
Both are now merged.
Aggregating...
Aggregated
Creating Indicator
Please enter your chosen year again (i.e., '17', '20'): 
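
If you run racdiv in a batch job where nobody can type an answer, one option is to supply the answer programmatically. This sketch assumes the prompt is read with Python's built-in input(), which this README does not confirm. The function `indicator_that_prompts` is a hypothetical stand-in for the real script.

```python
from unittest import mock


def indicator_that_prompts():
    """Hypothetical stand-in for a script that asks for the year mid-run."""
    return input("Please enter your chosen year again (i.e., '17', '20'): ")


year = "21"

# Temporarily replace input() so the prompt is answered with the same year.
with mock.patch("builtins.input", return_value=year):
    answered = indicator_that_prompts()

print(answered)
```

In real use, the createAcsIndicator call would go inside the `with` block in place of the stand-in.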

Now we can save the Baltimore City indicators. (Use this method ONLY if you are working in Google Colab. Otherwise, you can save the data however you prefer.)

from google.colab import files
MyIndicator.to_csv('YourIndicatorFileName.csv') 
files.download('YourIndicatorFileName.csv')