Templates for getting started, or for quick prototyping, when building Scala applications with Spark and Cassandra.
Currently the repo contains only the basic samples. Full application samples addressing particular use cases are coming over the next few days.
Some of these will also appear in the Databricks GitHub samples.
If you already use Scala, skip the SBT step. Similarly, if you can already spin up Cassandra locally, skip that step.
We will be building and running with SBT:
git clone https://github.com/helena/spark-cassandra-blueprints.git
cd spark-cassandra-blueprints
sbt compile
Download Apache Cassandra 2.1.0. All you should have to do is download and open the tar.
Also see DataStax Academy - free video courses on Cassandra!
There are many ways to add Cassandra to your path. A simple method is:
- Open (or create, if you don't have one) ~/.bash_profile
- Set CASSANDRA_HOME and add $CASSANDRA_HOME/bin to your PATH:

export CASSANDRA_HOME=/Users/helena/cassandra
PATH=$CASSANDRA_HOME/bin:$JAVA_HOME/bin:$SBT_HOME/bin:$SCALA_HOME/bin:$PATH
Start Cassandra: sudo ./apache-cassandra-2.1.0/bin/cassandra
Open the CQL shell: ./apache-cassandra-2.1.0/bin/cqlsh
CREATE KEYSPACE IF NOT EXISTS test WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1 };
CREATE TABLE IF NOT EXISTS test.mytable (key TEXT PRIMARY KEY, value INT);
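With the keyspace and table in place, a minimal round trip through the connector looks something like the sketch below. This assumes a Cassandra node on 127.0.0.1 and a local Spark master; the object name is illustrative, not from the repo:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

// Illustrative example, not part of the repo's samples.
object BlueprintRoundTrip extends App {
  val conf = new SparkConf(true)
    .setMaster("local[*]")
    .setAppName("blueprint-round-trip")
    .set("spark.cassandra.connection.host", "127.0.0.1")

  val sc = new SparkContext(conf)

  // Write a few rows to test.mytable.
  sc.parallelize(Seq(("a", 1), ("b", 2)))
    .saveToCassandra("test", "mytable", SomeColumns("key", "value"))

  // Read them back as an RDD of CassandraRow and print them.
  sc.cassandraTable("test", "mytable").collect.foreach(println)

  sc.stop()
}
```

Run it with `sbt run` once Cassandra is up; the connector handles the session lifecycle for you.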
If all went well, you're good to go!
In production you would use the NetworkTopologyStrategy
and a minimum replication factor of 3.
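A production keyspace definition might look like this; the keyspace name and datacenter name here are hypothetical, and the datacenter names must match your cluster's snitch configuration:

```sql
CREATE KEYSPACE IF NOT EXISTS blueprints WITH REPLICATION =
  {'class': 'NetworkTopologyStrategy', 'DC1': 3};
```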
The only dependency required is:
"com.datastax.spark" %% "spark-cassandra-connector" % "1.1.0-alpha2"
and possibly slf4j. The other dependencies in SparkCassandraBlueprintBuild.scala are there for code outside of Spark (core and streaming) and Cassandra.
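For reference, a minimal dependency list in an sbt build might look like the fragment below. The Spark version shown is an assumption for illustration; check SparkCassandraBlueprintBuild.scala for the versions the project actually uses:

```scala
libraryDependencies ++= Seq(
  "com.datastax.spark" %% "spark-cassandra-connector" % "1.1.0-alpha2",
  "org.apache.spark"   %% "spark-core"                % "1.1.0" % "provided",
  "org.apache.spark"   %% "spark-streaming"           % "1.1.0" % "provided"
)
```

Marking the Spark artifacts as `provided` keeps them out of your assembly jar when you deploy to a cluster that already ships them.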
More being added...