Data Cleaning Project

Usage:

To replicate the data cleaning workflow, open the Jupyter notebook (wrangling.ipynb) and run all cells. The notebook uses only standard Anaconda packages, so it should run on any Python Anaconda distribution. It creates farmers.db, which you can load into sqlite3 and which should contain the clean Farmers dataset.
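As a quick sanity check after running the notebook, the generated database can be inspected from Python. The sketch below is illustrative only: the helper name `summarize_database` is hypothetical, and because this README does not name the tables inside farmers.db, the code discovers them from SQLite's `sqlite_master` catalog instead of assuming any table name.

```python
import os
import sqlite3


def summarize_database(db_path="farmers.db"):
    """Return {table_name: row_count} for every table in the SQLite file.

    Hypothetical helper: table names are read from sqlite_master because
    the README does not state them.
    """
    # sqlite3.connect() silently creates a missing file, so check first.
    if not os.path.exists(db_path):
        raise FileNotFoundError(f"{db_path} not found; run the notebook first")

    conn = sqlite3.connect(db_path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Quote each table name so unusual identifiers still work.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in tables
        }
    finally:
        conn.close()
```

Calling `summarize_database()` from the repository folder should return one entry per table, which gives a rough confirmation that the notebook wrote the cleaned data.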

For more information on the data cleaning workflow, go through the notebook preview, which explains every step. For background on this assessment, read InitialAssesment.pdf, provided in the repository.

The Python file workflow.py is also provided to create CleanFarmers.csv directly. It can be run from the command line, but it will not generate the YesWorkflow graph; to create the graph, follow the YesWorkflow instructions: https://github.com/yesworkflow-org/yw-prototypes
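The script can also be driven from Python rather than typed at a shell prompt. The following is a minimal sketch under two assumptions that this README does not confirm: that workflow.py takes no command-line arguments, and that it writes CleanFarmers.csv into the directory it is run from. The helper name `run_workflow` is hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def run_workflow(script="workflow.py", output="CleanFarmers.csv", workdir="."):
    """Run the script with the current interpreter and return the output path.

    Assumptions (not confirmed by the README): the script needs no
    arguments and writes `output` into `workdir`.
    """
    workdir = Path(workdir)
    # check=True raises CalledProcessError if the script exits non-zero.
    subprocess.run([sys.executable, script], cwd=workdir, check=True)

    result = workdir / output
    if not result.exists():
        raise FileNotFoundError(f"{script} finished but {result} was not created")
    return result
```

Using `sys.executable` keeps the run inside the same Anaconda environment that launched the calling code, so the script sees the same packages.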

Requirements:

Python 3.7 (Anaconda Distribution)
YesWorkflow binaries

