Data Engineer Exercise

The purpose of this exercise is to implement a Python script that manipulates and aggregates a large dataset.

Prerequisites

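This project relies on Python 3 and the following modules:

  • pandas (third-party; must be installed separately)
  • os (Python standard library)
  • csv (Python standard library)
  • datetime (Python standard library)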

Instructions for installing Python 3 on your computer can be found here: https://www.python.org/downloads/

Installation

  1. Clone the repo into your working directory:
git clone https://github.com/jonadata13/data_engineer_exercise.git
  2. Install pandas, the only listed package that does not ship with Python, by running this command in your terminal:
pip install pandas
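
To confirm that the prerequisites are importable before running the script, a small check along these lines may help. This is a minimal sketch, not part of the repo: the helper name `missing_modules` is hypothetical, and the module list is taken from the Prerequisites section.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Modules listed under Prerequisites; only pandas needs a separate install.
    required = ["pandas", "os", "csv", "datetime"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

If pandas is reported as missing, install it with pip and run the check again.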

Usage

  1. Using your terminal, navigate to the project folder:
cd [path to data_engineer_exercise folder]
  2. Run script.py:
python script.py
  3. Verify that two CSV files have been saved to your current working directory:
  • people.csv
  • aquisition_facts.csv
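
The verification step can also be done programmatically. The following is a minimal sketch, assuming the two file names listed above are written exactly as shown (including the spelling `aquisition_facts.csv`); the helper name `missing_outputs` is hypothetical and not part of the repo.

```python
from pathlib import Path

# Output file names as listed in the Usage section.
EXPECTED_FILES = ["people.csv", "aquisition_facts.csv"]


def missing_outputs(directory=".", expected=EXPECTED_FILES):
    """Return the expected output files that are not present in `directory`."""
    base = Path(directory)
    return [name for name in expected if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_outputs()
    if missing:
        print("Missing output files:", ", ".join(missing))
    else:
        print("Both CSV files were found.")
```

Run it from the same directory in which you ran script.py, since the check defaults to the current working directory.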