Big-Data-Apache-Spark-Projects

Big data refers to high-volume, high-velocity, and/or high-variety information assets that demand cost-effective, innovative forms of information processing to enable enhanced insight, decision making, and process automation. Apache Spark is a fast, unified analytics engine for large-scale data processing, with support for big data and machine learning workloads.

This repository contains a collection of my projects from the Big Data & Data Mining course I took in college. For my final exam, I built a project that classifies air quality in London using the Naive Bayes algorithm and a dataset derived from https://datahub.io/core/london-air-quality.
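
To give a rough idea of how Naive Bayes classification works, here is a minimal sketch in plain Python. It is not the code from the final project and it does not use Spark. The feature columns (NO2 and PM10), the labels ("good" and "poor"), the function names, and all the numbers are made up for illustration; they are not taken from the London air quality dataset. The sketch assumes the Gaussian variant of Naive Bayes, where each feature is modeled as a normal distribution per class.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate the prior, feature means, and feature variances per class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)

    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one row."""
    def log_posterior(prior, means, variances):
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            # Log of the normal density, one independent term per feature.
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score

    return max(model, key=lambda label: log_posterior(*model[label]))


# Hypothetical readings in the form [NO2, PM10]; values are invented.
train_rows = [
    [20.0, 15.0], [25.0, 18.0], [22.0, 16.0],
    [60.0, 45.0], [65.0, 50.0], [58.0, 42.0],
]
train_labels = ["good", "good", "good", "poor", "poor", "poor"]

model = fit_gaussian_nb(train_rows, train_labels)
print(predict(model, [23.0, 17.0]))
print(predict(model, [62.0, 47.0]))
```

The "naive" part is the assumption that features are independent given the class, which is why the score is a simple sum of per-feature log terms. In a Spark project the same idea would typically be handled by a library classifier rather than written by hand.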
