###Lesson 18: Big Data

####Slides

####Handouts

####Spark Installation

If you'd like to follow along with the Spark notebook, you can install Spark locally and run the PySpark examples yourself. These instructions are for macOS only; installing on Windows may be harder.

  1. Install Spark on your Mac by running `brew install apache-spark`.
  2. Install findspark: clone it from here, `cd findspark`, and run `pip install -e .`.
  3. Install the Java JDK from here.
  4. When you start an IPython notebook, you should now have everything you need to locate your local Spark installation and interact with it via PySpark. To create a SparkContext, run something like the following within the notebook:
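     ```python
     import findspark

     # Locate the local Spark installation and add PySpark to sys.path.
     findspark.init()

     import pyspark

     # Create the SparkContext used to talk to the local Spark installation.
     sc = pyspark.SparkContext()
     ```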
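
If creating the SparkContext fails, it can help to confirm that the three installation steps above actually took effect. The following is a minimal sketch, not part of the original lesson: the helper name `check_spark_prerequisites` is hypothetical, and it assumes that Homebrew places `spark-submit` on your `PATH`. Note that macOS may provide a placeholder `java` command even without a JDK, so treat the `java` result as a hint rather than proof.

```python
import importlib.util
import shutil


def check_spark_prerequisites():
    """Return a dict mapping each prerequisite name to True (found) or False."""
    return {
        # A `java` executable on PATH suggests the JDK from step 3 is installed.
        "java": shutil.which("java") is not None,
        # Assumption: the Homebrew install from step 1 puts spark-submit on PATH.
        "spark-submit": shutil.which("spark-submit") is not None,
        # find_spec returns None when the findspark package cannot be imported.
        "findspark": importlib.util.find_spec("findspark") is not None,
    }


if __name__ == "__main__":
    for name, found in check_spark_prerequisites().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Anything reported as `MISSING` points to the installation step worth repeating before you retry `findspark.init()`.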