Big-Data-Apache-Spark-Projects

Big data refers to high-volume, high-velocity, and/or high-variety information assets that demand cost-effective, innovative forms of information processing to enable enhanced insight, decision making, and process automation. Apache Spark is a fast, unified analytics engine for large-scale data processing, covering both big data and machine learning workloads.

This repository contains a collection of projects I built while taking the Big Data & Data Mining course in college. For my final exam, I created a project that classifies air quality in London using the Naive Bayes algorithm and a dataset derived from https://datahub.io/core/london-air-quality.
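
To give a rough idea of how Naive Bayes classification works, here is a minimal pure-Python sketch of the Gaussian variant. It is not the code from this repository and does not use Spark. The feature columns (NO2 and PM10 readings), the sample values, the class labels `good` and `poor`, and the function names `fit` and `predict` are all assumptions made for illustration; the actual project may use different features, labels, or a different Naive Bayes variant.

```python
import math
from collections import defaultdict


def fit(rows, labels):
    """Estimate a prior per class and a (mean, variance) pair per feature."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)

    model = {}
    for label, members in grouped.items():
        prior = len(members) / len(rows)
        stats = []
        # zip(*members) iterates over feature columns within this class.
        for column in zip(*members):
            mean = sum(column) / len(column)
            # A small constant keeps the variance strictly positive.
            var = sum((x - mean) ** 2 for x in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (prior, stats)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one row."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        # "Naive" assumption: features are independent given the class,
        # so their Gaussian log-likelihoods simply add up.
        for x, (mean, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (x - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical readings (NO2, PM10); not taken from the real dataset.
rows = [
    (20.0, 12.0), (25.0, 15.0), (22.0, 14.0),
    (60.0, 40.0), (70.0, 45.0), (65.0, 42.0),
]
labels = ["good", "good", "good", "poor", "poor", "poor"]

model = fit(rows, labels)
```

Calling `predict(model, (23.0, 13.0))` scores a new reading against each class and returns the label with the highest score. In a Spark setting the same idea would typically be delegated to the Naive Bayes estimator in Spark's machine learning library rather than written by hand.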