benchspark

An extensible toolset for Spark performance benchmarking.

Currently available Spark jobs (including dataset generators):

Compilation

To compile the jobs to a jar file:

cd spark
sbt package

To run a job locally, adjust run_scripts/submit_local_job to match your local setup, then execute it.
You can later extend the script to submit jobs to any cluster available to you, whether in a public cloud or on premises.
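
As a rough illustration of the kind of settings such a submit script usually exposes, here is a minimal sketch. It is not the contents of run_scripts/submit_local_job. The Spark install path, the job class name, and the jar path are hypothetical placeholders. The only facts assumed are that sbt package writes the jar under target/scala-<version>/ and that spark-submit accepts --master and --class.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a local submit script; every default below is a
# placeholder to be replaced with values from your own setup.
SPARK_HOME="${SPARK_HOME:-/opt/spark}"              # assumed Spark install location
MASTER="${MASTER:-local[*]}"                        # local mode, all available cores
JOB_CLASS="${JOB_CLASS:-com.example.ExampleJob}"    # hypothetical main class
JOB_JAR="${JOB_JAR:-spark/target/scala-2.12/benchspark_2.12-0.1.jar}"  # hypothetical jar name

# Print the spark-submit command line that would be used for the given job arguments.
submit_cmd() {
  echo "${SPARK_HOME}/bin/spark-submit --master ${MASTER} --class ${JOB_CLASS} ${JOB_JAR} $*"
}

# Dry run by default: only show the command. Set RUN=1 to actually submit.
if [ "${RUN:-0}" = "1" ]; then
  "${SPARK_HOME}/bin/spark-submit" --master "${MASTER}" --class "${JOB_CLASS}" "${JOB_JAR}" "$@"
else
  submit_cmd "$@"
fi
```

Keeping the master URL in a variable means the same script can later target a cluster by overriding MASTER (for example with a spark:// URL or yarn) without changing the rest of the command.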