###Lesson 18: Big Data
####Slides
####Handouts
####Spark Installation
If you'd like to follow along with the Spark notebook, you can try installing Spark locally so that you can run the PySpark examples yourself. These instructions are for a Mac only; installing on Windows could be tricky.
- Install Spark on your Mac by running `brew install apache-spark`.
- Install findspark: clone it from here, `cd findspark`, and run `pip install -e .`.
- Install the Java JDK from here.

Now when you start an IPython notebook, you should have everything you need to locate your local Spark installation and interact with it via PySpark. To create a SparkContext, run something like the following in the notebook:

    import findspark

    # Locate the local Spark installation and add PySpark to sys.path.
    findspark.init()

    # Import pyspark only after findspark.init(), which makes it importable.
    import pyspark

    # Create the SparkContext used to talk to the local Spark instance.
    sc = pyspark.SparkContext()
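
If the SparkContext fails to start, a quick check like the one below can help narrow down which piece of the setup is missing. This is only a sketch: the helper name `check_prerequisites` is hypothetical, and the default names it looks for (`java`, `spark-submit`, `findspark`) are assumptions about what the steps above put on your PATH and in your Python environment. On macOS, `java` may be reported as found even when no JDK is installed, so treat that result as a weak signal.

```python
import importlib.util
import shutil


def check_prerequisites(executables=("java", "spark-submit"),
                        modules=("findspark",)):
    """Return a dict mapping each name to True if it was found.

    `executables` are looked up on the PATH; `modules` are top-level
    Python modules looked up in the current environment (not imported).
    """
    found = {}
    for exe in executables:
        found[exe] = shutil.which(exe) is not None
    for mod in modules:
        found[mod] = importlib.util.find_spec(mod) is not None
    return found


# Report the status of the assumed default prerequisites.
for name, ok in check_prerequisites().items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
```

Run it in the same notebook kernel you plan to use for PySpark, since the PATH and installed packages can differ between environments.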