These materials provide an introduction to core concepts for working with data in Python using the pandas data analysis library and were originally used for instructor-led open lab sessions. Each open lab consisted of a guided activity covering specific concepts followed by extended open time for self-guided practice using datasets and activities provided by the instructors. During open time participants had access to instructors and peers for questions and support.
In this lab we will introduce the pandas library, including methods for reading, exploring, and writing data in formats such as tab-delimited files, Excel files, and JSON files.
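As a rough illustration of the read, explore, and write cycle described above, the sketch below uses a small in-memory table rather than the workshop datasets. The column names and values are hypothetical placeholders, and `io.StringIO` stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical tab-delimited text standing in for a file on disk.
tsv_text = (
    "title\tartist\tyear\n"
    "Work A\tArtist One\t1950\n"
    "Work B\tArtist Two\t1975\n"
)

# sep="\t" tells read_csv that the columns are separated by tabs.
works = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# Basic exploration: the table's dimensions and its column names.
n_rows, n_cols = works.shape
column_names = list(works.columns)

# Write the table out as JSON (one object per row), then read it back.
json_text = works.to_json(orient="records")
works_from_json = pd.read_json(io.StringIO(json_text), orient="records")
```

With real files, the same calls accept a file path in place of the `StringIO` object. Excel files are read with `pd.read_excel`, which needs an additional engine package such as `openpyxl` to be installed.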
To create a copy of the workshop materials and run the code, sign in to a Google account and click the "Open in Colab" button above. With the Colab notebook open, click the "Copy to Drive" button to save a copy to your own Google account.
In this lab we will cover basic pandas methods for loading, combining, and preparing different types of datasets for analyses with pandas.
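One possible way to combine datasets is sketched below, assuming two hypothetical tables that share an `artist_id` column. The table contents are invented for illustration and are not taken from the workshop data.

```python
import pandas as pd

# Hypothetical tables: works reference artists through a shared artist_id.
works = pd.DataFrame(
    {"title": ["Work A", "Work B", "Work C"], "artist_id": [1, 2, 1]}
)
more_works = pd.DataFrame({"title": ["Work D"], "artist_id": [2]})
artists = pd.DataFrame(
    {"artist_id": [1, 2], "name": ["Artist One", "Artist Two"]}
)

# Stack tables that have the same columns; ignore_index renumbers the rows.
all_works = pd.concat([works, more_works], ignore_index=True)

# Attach artist names to each work; how="left" keeps every row of all_works.
combined = all_works.merge(artists, on="artist_id", how="left")
```

`pd.concat` adds rows from tables with matching columns, while `merge` adds columns by matching values in a key column.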
In this lab we will cover basic pandas methods for normalizing values, modifying data, dealing with missing data, and working with strings and dates.
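The following sketch shows one way these cleaning steps might look on a small hypothetical table. The column names, the `"Untitled"` placeholder, and the values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical table with untidy strings, a missing title, and a bad date.
works = pd.DataFrame(
    {
        "title": ["  work a ", "Work B", None],
        "acquired": ["1950-01-15", "not recorded", "1975-06-30"],
    }
)

# Normalize strings: trim surrounding whitespace and use title case.
works["title"] = works["title"].str.strip().str.title()

# Fill in missing titles with a placeholder value.
works["title"] = works["title"].fillna("Untitled")

# Parse dates; errors="coerce" turns unparseable values into missing dates.
works["acquired"] = pd.to_datetime(works["acquired"], errors="coerce")

# Pull the year out of each date and count the dates that are missing.
works["acquired_year"] = works["acquired"].dt.year
missing_dates = int(works["acquired"].isna().sum())
```

Whether to fill, drop, or keep missing values depends on the analysis; `fillna` is only one of the available options.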
In this lab we will cover common exploratory analysis methods such as calculating summary statistics and aggregating and grouping data using pandas.
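A minimal sketch of summarizing and grouping, using a hypothetical table of works with a `department` and a `height_cm` column (both names and all values are placeholders):

```python
import pandas as pd

# Hypothetical measurements for works in two departments.
works = pd.DataFrame(
    {
        "department": [
            "Prints", "Prints",
            "Photography", "Photography", "Photography",
        ],
        "height_cm": [30.0, 50.0, 20.0, 25.0, 45.0],
    }
)

# Summary statistics (count, mean, std, min, quartiles, max) for one column.
summary = works["height_cm"].describe()

# Group rows by department, then compute a count and a mean for each group.
by_department = works.groupby("department")["height_cm"].agg(["count", "mean"])
```

`describe` gives a quick overview of a single column, while `groupby` followed by `agg` produces one row of statistics per group.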
In this lab we will cover how to use the pandas library with the visualization library matplotlib and other visualization libraries for exploring data.
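As an illustration of pandas and matplotlib working together, the sketch below counts works per department in a hypothetical table and draws the counts as a bar chart. The data and axis labels are placeholders.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical table: one row per work, labeled with its department.
works = pd.DataFrame(
    {
        "department": [
            "Prints", "Photography", "Photography", "Prints", "Photography",
        ]
    }
)

# Count how many works fall into each department.
counts = works["department"].value_counts()

# Draw the counts as a bar chart on a matplotlib Axes object.
fig, ax = plt.subplots()
counts.plot(kind="bar", ax=ax)
ax.set_xlabel("Department")
ax.set_ylabel("Number of works")
fig.tight_layout()
```

In a notebook environment such as Colab the figure is displayed below the cell; in a script, `plt.show()` or `fig.savefig(...)` would be needed to view or save it.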
These materials were developed by Claire Cahoon and Walt Gurley at the NC State University Libraries and were adapted from previous workshop materials by Scott Bailey and Simon Wiles of Stanford Libraries.
The data used in this workshop consist of modified subsets of the Museum of Modern Art's (MoMA) research datasets representing all of the works and artists that have been accessioned into MoMA’s collection and cataloged in their database. The original datasets were accessed via the MoMA collection GitHub repository.
The datasets used in these materials are accessible in the NC State University Libraries Data & Visualization Teaching Datasets repository.