Piccard

Introduction

piccard is a Python package that provides an alternative to traditional harmonization techniques for combining spatial data whose geographic units are inconsistent across years. It represents the data as a network of nodes and edges, which retains all the information available in the data. Each node represents one geographic area (e.g., a census tract or dissemination area) in one year. An edge connects two nodes when the geographic area of the tail node has at least a 5% area overlap with the geographic area of the head node, which belongs to the previous available year.
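The overlap rule can be illustrated with a small standard-library sketch. It is not piccard's implementation: the rectangles standing in for census polygons, the `build_edges` helper, and the choice to measure overlap as a fraction of the later-year unit's area are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy data: each unit is an axis-aligned rectangle
# (xmin, ymin, xmax, ymax). Real census tracts are arbitrary polygons.
units_by_year = {
    2011: {"A": (0, 0, 100, 100), "B": (100, 0, 200, 100)},
    2016: {"C": (0, 0, 103, 100), "D": (103, 0, 200, 100)},
}


def area(rect):
    """Area of a rectangle; zero if the rectangle is empty or inverted."""
    xmin, ymin, xmax, ymax = rect
    return max(0, xmax - xmin) * max(0, ymax - ymin)


def overlap_area(r1, r2):
    """Area of the intersection of two rectangles."""
    intersection = (
        max(r1[0], r2[0]),
        max(r1[1], r2[1]),
        min(r1[2], r2[2]),
        min(r1[3], r2[3]),
    )
    return area(intersection)


def build_edges(units_by_year, threshold=0.05):
    """Link units in consecutive years whose overlap meets the threshold.

    Assumption: overlap is measured relative to the later-year unit's area.
    Edges are stored as ((earlier_year, id), (later_year, id)) pairs.
    """
    edges = []
    years = sorted(units_by_year)
    for prev_year, cur_year in zip(years, years[1:]):
        prev_units = units_by_year[prev_year].items()
        cur_units = units_by_year[cur_year].items()
        for (prev_id, prev_rect), (cur_id, cur_rect) in product(prev_units, cur_units):
            fraction = overlap_area(prev_rect, cur_rect) / area(cur_rect)
            if fraction >= threshold:
                edges.append(((prev_year, prev_id), (cur_year, cur_id)))
    return edges


for edge in build_edges(units_by_year):
    print(edge)
```

In this toy data, B covers only about 3% of C, so that pair is dropped at the default threshold of 0.05 and kept if the threshold is lowered to 0.01. Small overlaps of this kind often come from boundary adjustments rather than real splits or merges.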

Research

The method behind this package can be found in the following research paper:
      Dias, F., & Silver, D. (2018). Visualizing demographic evolution using geographically inconsistent census data. California Digital Library (CDL). https://doi.org/10.31235/osf.io/a3gtd

Installation

The latest released version is available from the Python Package Index (PyPI):

pip install piccard

Importing the package

from piccard import piccard as pc 

Useful Functions

piccard.preprocessing(ct_data, year, id)
      Returns a cleaned GeoDataFrame of the input data with a new column containing the area of each census tract.

piccard.create_network(census_dfs, years, id, threshold=0.05)
      Creates a network representation of the temporal connections present in census_dfs over years. A geographic area in one year is connected to an area in the next year when their overlap is at least threshold, expressed as a fraction of area (the default 0.05 corresponds to 5%).

piccard.create_network_table(census_dfs, years, id, threshold=0.05)
      Returns the final network table with all the temporal connections present in census_dfs over years. A geographic area in one year is connected to an area in the next year when their overlap is at least threshold, expressed as a fraction of area (the default 0.05 corresponds to 5%).

piccard.draw_subnetwork(network_table, G, num_cts=4)
      Draws a subgraph of the network representation, where num_cts is the number of census tracts from the first census year that are followed through all census years.

piccard.plot_num_cts(network_table, years, id)
      Plots the number of census tracts across all given census years.

Note: Further explanation of the parameters and example code for all the above functions can be found in the documentation.
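To make the ideas behind draw_subnetwork and plot_num_cts concrete, the following standard-library sketch follows chosen first-year tracts forward through a network and counts tracts per year. It is a conceptual illustration only: the edge list, the tract IDs, and the helpers `follow_forward` and `count_by_year` are hypothetical and are not part of piccard's API.

```python
from collections import defaultdict, deque

# Hypothetical edge list: ((earlier_year, tract_id), (later_year, tract_id)).
edges = [
    ((2006, "T1"), (2011, "T1a")),   # T1 splits into T1a and T1b
    ((2006, "T1"), (2011, "T1b")),
    ((2006, "T2"), (2011, "T2")),
    ((2011, "T1a"), (2016, "T1a")),  # T1a and T1b merge back into T1a
    ((2011, "T1b"), (2016, "T1a")),
    ((2011, "T2"), (2016, "T2")),
]


def follow_forward(edges, start_nodes):
    """Return every node reachable from start_nodes by moving forward in time."""
    successors = defaultdict(list)
    for earlier, later in edges:
        successors[earlier].append(later)

    seen = set(start_nodes)
    queue = deque(start_nodes)
    while queue:  # breadth-first traversal
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def count_by_year(nodes):
    """Count the nodes (tracts) present in each year, ordered by year."""
    counts = defaultdict(int)
    for year, _tract_id in nodes:
        counts[year] += 1
    return dict(sorted(counts.items()))


# Subnetwork that starts from a single first-year tract.
subnetwork = follow_forward(edges, [(2006, "T1")])
print(count_by_year(subnetwork))

# Tract counts per year over the whole network.
all_nodes = {node for edge in edges for node in edge}
print(count_by_year(all_nodes))
```

Following one tract forward exposes splits and merges that a harmonized table would hide: here T1 becomes two tracts in 2011 and one again in 2016.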

Dependencies

GeoPandas - Allows spatial operations in Python, making it easier to work with geospatial data
Matplotlib - A comprehensive library for creating visualizations
NetworkX - Adds support for analyzing networks represented by nodes and edges
NumPy - Adds support for large, multi-dimensional arrays and matrices, with functions to operate on these arrays
pandas - Offers data structures and operations for manipulating numerical tables

Authors

Maliha Lodi, Fernando Calderon Figueroa, Daniel Silver
