Introduction to Python
##Python Shell
Python at its core is a fancy calculator. Typing `python` into the command line gives you an interactive prompt. You can start either the standard shell

    python

or IPython

    ipython

IPython provides a more interactive shell with a few utility functions that make life easier. For example, prefixing a name with `?`, as in

    ?sys

displays the documentation for that object (here, the `sys` module). IPython also provides tab autocompletion and a few "magic" functions such as `%time` and `%paste`: the former times a statement, and the latter makes it easy to paste code from the clipboard.
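
The calculator behaviour and the IPython helpers described above have rough plain-Python counterparts. The following is a minimal sketch, assuming Python 3; the variable names are illustrative, and `timeit` only approximates what `%time` reports.

```python
import sys
import timeit

# Arithmetic works just as it does on a calculator.
total = 2 + 3 * 4      # multiplication binds tighter than addition
ratio = 7 / 2          # true division in Python 3
floor = 7 // 2         # floor division

# Roughly what IPython's ?sys displays: the object's docstring.
sys_doc = sys.__doc__

# Roughly what %time does: measure how long a statement takes.
# Here the statement is run 1000 times and the total is returned in seconds.
elapsed = timeit.timeit("sum(range(100))", number=1000)
```

At the prompt, typing any of these names on its own line echoes its value.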
##Defining functions
In Python we define functions with the `def` keyword. The value of the function is given by the `return` statement.

    def addition(x, y):
        return x + y

`x` and `y` are the inputs to the function, and the `return` keyword specifies what the function gives back. The most important thing to note here is that Python is whitespace-sensitive: the second line must be indented for this to run.
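
To make the definition above concrete, here is a short sketch that calls `addition`. The names `result` and `joined` are illustrative only.

```python
def addition(x, y):
    # The indented line is the function body.
    return x + y

# Call the function with two numbers.
result = addition(2, 3)

# The same function also accepts strings, because + concatenates them.
joined = addition("foo", "bar")
```

Removing the indentation before `return` makes Python raise an `IndentationError` when the code is loaded.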
##Data Structures
Python has a variety of data structures available. The most basic are integers, floats, and strings. Python is dynamically typed, so types do not need to be declared and a variable can be rebound to a value of a different type.

    x = 5.0
    x = [5.0]
    x = "Five Point 0"
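
One way to watch the rebinding above happen is to inspect the type after each assignment. This is a small sketch; the names `first`, `second`, and `third` are illustrative.

```python
x = 5.0
first = type(x).__name__    # name of the type currently bound to x

x = [5.0]
second = type(x).__name__

x = "Five Point 0"
third = type(x).__name__

print(first, second, third)
```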
###Lists
Lists are one of the most common data structures in Python.

    l = [1, 2, 3, 4]  # Setting up a list
    l.append(5)       # Adding to it using the append method
    l += [6]          # Adding to it using the addition operator
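
Once a list is built, its elements are usually read back by index or slice. The sketch below continues the example above; the variable names are illustrative.

```python
l = [1, 2, 3, 4]
l.append(5)
l += [6]

first = l[0]       # indexing starts at 0
last = l[-1]       # negative indices count from the end
middle = l[1:3]    # slice: positions 1 and 2 (the end is excluded)
length = len(l)    # number of elements
```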
###Dictionaries
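Dictionaries are Python's built-in hash tables (often called hash maps or associative arrays in other languages). They map keys to values.

    x = {}                  # Empty dictionary
    y = {'key': 'value'}    # Dictionary with one entry
    y['key'] = "new-value"  # Overwrite the value stored under 'key'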
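
Reading from a dictionary is as common as writing to it. A brief sketch of the usual lookups follows; the key `'other'` and the variable names are made up for illustration.

```python
y = {'key': 'value'}
y['key'] = "new-value"
y['other'] = 42                        # assigning to a new key adds an entry

current = y['key']                     # lookup; raises KeyError if the key is absent
missing = y.get('absent', 'default')   # lookup with a fallback value
has_key = 'key' in y                   # membership test on the keys
keys = sorted(y)                       # iterating a dict yields its keys
```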
##Loop Constructs
Like most programming languages, Python has loop constructs for iterating over the elements of a collection.

    l = [1, 2, 3, 4]
    for i in l:
        print(i)

A loop that builds a list can also be written as a list comprehension:

    x = [1, 2, 3, 4]
    # The following are equivalent
    y = []
    for i in x:
        y.append(i**2)
    # OR
    y = [i**2 for i in x]
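
Comprehensions can also filter elements and build dictionaries. The following sketch extends the example above; `even_squares` and `square_of` are illustrative names.

```python
x = [1, 2, 3, 4]

# Keep only the even elements, then square them.
even_squares = [i**2 for i in x if i % 2 == 0]

# A dictionary comprehension mapping each element to its square.
square_of = {i: i**2 for i in x}
```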
Python can also iterate over generators. These are functions that act as iterators, meaning they give access to the elements of a sequence one at a time. Several built-ins are consumed by a `for` loop in the same lazy way. For example, `range` (called `xrange` in Python 2)

    for i in range(10):
        print(i)

or `enumerate`

    for i, el in enumerate(x):
        print(i, el)

This prints the index of each element (starting from 0) together with the element.
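
A generator of your own is written as a function that uses `yield`. The sketch below is illustrative; the function `squares` is hypothetical and not part of the original material.

```python
def squares(n):
    """Yield the squares of 0, 1, ..., n - 1 one at a time."""
    for i in range(n):
        yield i**2

gen = squares(4)
first = next(gen)    # runs the function up to the first yield
rest = list(gen)     # consumes the remaining values

# enumerate pairs each element with its index, starting from 0.
indexed = list(enumerate(["a", "b"]))
```

Because values are produced on demand, a generator can be consumed only once; after `list(gen)` it is exhausted.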
##Pandas
Pandas is a Python library for data analysis that resembles the tools available in R. It is also optimized for fast I/O and data manipulation.

    import pandas as pd
    data = pd.read_csv('https://github.com/pburkard88/DS_BOS_07/Data/538model/data/2012_poll_data_states.csv', sep='\t')
    # Get a preview of the first five rows (the default for head)
    data.head()

Get descriptive statistics for any or all columns (numeric columns by default):

    data.describe()

Select certain columns:

    data[['Poll', 'Obama (D)', 'Romney (R)']]

Select a range of rows:

    data[:3]     # first three rows
    data[1:3]    # rows at positions 1 and 2

Count how often each value occurs in a column:

    data['Poll'].value_counts()

Aggregate over a particular column:

    data.groupby('Poll')['Obama (D)'].mean()
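
To see what the counting and group-by steps above compute, it can help to spell them out in plain Python. This is only a sketch of the idea, not how pandas implements it, and the rows below are made-up values rather than data from the poll file.

```python
from collections import Counter, defaultdict

# Hypothetical rows for illustration; NOT taken from the poll data file.
rows = [
    {"Poll": "A", "Obama (D)": 50.0},
    {"Poll": "B", "Obama (D)": 46.0},
    {"Poll": "A", "Obama (D)": 48.0},
]

# Like data['Poll'].value_counts(): occurrences of each distinct value.
counts = Counter(row["Poll"] for row in rows)

# Like data.groupby('Poll')['Obama (D)'].mean(): collect the values
# belonging to each key, then average every group.
groups = defaultdict(list)
for row in rows:
    groups[row["Poll"]].append(row["Obama (D)"])
means = {poll: sum(vals) / len(vals) for poll, vals in groups.items()}
```

Pandas performs the same split, apply, and combine steps, but on whole columns at once, which is why it is much faster on large tables.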