In this assignment you will select a data set and do some munging, analysis, and visualization of it using pandas, Jupyter Notebooks, and associated Python-centric data science tools.
First, select a data file to work from. For this assignment, choose any reputable data source that interests you. Download the data in a plain-text data format, not a spreadsheet-specific file format.
There are many data sources available at NYU Libraries' Data Services division. Use all available resources to identify a data set that interests you.
Save the original raw data file of your choice into the `data` directory.
Use JupyterLab to open the Jupyter Notebook named analysis.ipynb. You will import the data file and do all the data munging, analysis, and visualization within this notebook.
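As a rough illustration of what the first notebook cells might look like, here is a minimal sketch that loads a delimited plain-text file with pandas and does one small munging step and one summary. The helper `load_raw_data`, the file name `example.csv`, and the column names are all hypothetical, and the values are made up; the sketch writes its own tiny stand-in file to a temporary directory so that it runs anywhere.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_raw_data(path, sep=","):
    """Read a delimited plain-text file into a DataFrame (hypothetical helper)."""
    return pd.read_csv(path, sep=sep)


# Stand-in for the real data file: a tiny synthetic CSV written to a temporary
# "data" directory. File name, columns, and values are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"
    data_dir.mkdir()
    sample_path = data_dir / "example.csv"
    sample_path.write_text("group,year,value\na,2020,1.5\nb,2020,\na,2021,2.5\n")
    df = load_raw_data(sample_path)

# Munging: drop rows where the measurement is missing.
clean = df.dropna(subset=["value"])

# Analysis: mean value per group.
summary = clean.groupby("group")["value"].mean()

print(clean)
print(summary)
```

In the actual notebook you would skip the temporary-directory part and pass the path of your own file inside the `data` directory to `pd.read_csv`, adjusting `sep` if the file is tab-delimited or uses another separator.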
Use Visual Studio Code to perform git `stage`, `commit`, and `push` actions to submit. These actions are all available as menu items in Visual Studio Code's Source Control panel.
- Type a short note about what you have done to the files in the `Message` area, and then type `Command-Enter` (Mac) or `Control-Enter` (Windows) to perform git `stage` and `commit` actions.
- Click the `...` icon next to the words "Source Control" and select "Push" to perform the git `push` action. This will upload your work to your repository on GitHub.com.