Requirements to run this tutorial:

To follow this tutorial you need to have the following packages installed:

⦁ Python version 2.6-2.7 or 3.3-3.5

⦁ pandas version 0.18.0 or later: http://pandas.pydata.org/ (previous versions will work for most examples as well)

⦁ numpy version 1.7 or later: http://www.numpy.org/

⦁ matplotlib version 1.3 or later: http://matplotlib.org/

⦁ ipython version 3.x with notebook support, or ipython 4.x combined with jupyter: http://ipython.org

⦁ I recommend using the conda environment manager to install all the requirements (you can install Miniconda or the much larger Anaconda distribution, found at http://continuum.io/downloads).

Once conda is installed, the following command will install all required packages in your Python environment:

conda install pandas jupyter seaborn

Of course, another distribution (e.g. Enthought Canopy) or pip works just as well, as long as you have the above packages installed.
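To confirm that your environment meets the requirements listed above, a small script along these lines may help. It is only a sketch: the helper names (`parse_version`, `meets_minimum`, `check_requirements`) are made up for this example, the minimum versions are copied from the list above, and the version comparison is deliberately simple (it looks only at the leading numeric parts of each version string).

```python
import importlib
import re

# Minimum versions copied from the requirements list above.
MINIMUM_VERSIONS = {
    "pandas": "0.18.0",
    "numpy": "1.7",
    "matplotlib": "1.3",
}


def parse_version(text):
    """Turn '1.7.2rc1' into (1, 7, 2), keeping only the leading numeric parts."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '0.18' compares equal to '0.18.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Return {package: (installed_version_or_None, ok)} for each package."""
    report = {}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is not installed in this environment.
            report[name] = (None, False)
            continue
        installed = getattr(module, "__version__", "0")
        report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in sorted(check_requirements().items()):
        status = "OK" if ok else "missing or too old"
        print("{}: {} ({})".format(name, installed, status))
```

Run the script with the Python interpreter you plan to use for the notebooks. Any package reported as missing or too old can then be installed or upgraded with conda or pip.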

To download the tutorial material:

If you have git installed, you can get the material for this tutorial by cloning this repo:

git clone https://github.com/BooraAnnu/Pandas-Tutorial

Contents:

 1) Introduction

 2) Data structures

 3) File related operations on DataFrames

 a) CSV Files

 b) JSON Files

 4) Data operations using Pandas

 a) Handling missing data

 b) Cleaning wrongly formatted data

 c) Cleaning wrong data

 d) Removing duplicates

 5) Data Correlation

 6) Data Visualization
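
As a small preview of topics 4 and 5, the sketch below removes duplicate rows, drops rows with missing values, and computes a correlation matrix. The data and the column names (`duration`, `calories`) are invented for illustration and are not taken from the tutorial notebooks.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: rows 1 and 2 are identical, row 3 has a missing value.
df = pd.DataFrame({
    "duration": [60, 45, 45, 30, 20],
    "calories": [400.0, 300.0, 300.0, np.nan, 150.0],
})

# 4d) Remove duplicate rows, keeping the first occurrence.
deduped = df.drop_duplicates()

# 4a) Handle missing data by dropping rows that contain NaN.
cleaned = deduped.dropna()

# 5) Pairwise Pearson correlation between the numeric columns.
correlation = cleaned.corr()

print(cleaned)
print(correlation)
```

Dropping rows is only one way to handle missing data. Filling the gaps with `fillna`, for example with the column mean, keeps the row count intact at the cost of introducing estimated values.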