vighnesh32 / Big-Data-Project Public

Data processing and machine learning in cloud.

Big-Data-Project

File Names and Descriptions:

project.ipynb : This notebook is divided into five sections. The first three contain, respectively, the code for preprocessing, for parallelising the data preprocessing in Spark using Google Cloud Dataproc, and for parallelising the measurement of different configurations using Spark. The remaining sections use the preprocessed data in TensorFlow/Keras and test different parallelisation approaches for multiple GPUs. Cherry-picking and hybrid parallel training of convolutional networks, based on two papers, are also discussed in the report.
report.pdf : Project report.
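
As a rough illustration of what parallelising the measurement of different configurations with Spark can look like, here is a minimal sketch. It is not taken from project.ipynb: the function names (`build_configs`, `measure`, `run_on_spark`), the configuration parameters (batch size, batch count, repetitions) and the placeholder workload are all hypothetical. Only the PySpark calls (`SparkSession.builder`, `parallelize`, `map`, `collect`) are real API.

```python
import time
from itertools import product


def build_configs(batch_sizes, batch_counts, repetitions):
    """Return every (batch_size, batch_count, repetition) combination."""
    return list(product(batch_sizes, batch_counts, range(repetitions)))


def measure(config):
    """Time a placeholder workload for one configuration.

    The workload (counting items) is a hypothetical stand-in; a real
    measurement would read batch_count batches of batch_size items
    from the dataset instead.
    """
    batch_size, batch_count, repetition = config
    start = time.perf_counter()
    total_items = sum(1 for _ in range(batch_size * batch_count))
    elapsed = time.perf_counter() - start
    return (batch_size, batch_count, repetition, total_items, elapsed)


def run_on_spark(configs):
    """Distribute the measurements over a Spark cluster, one task per config."""
    # Imported lazily so the helpers above work without pyspark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("config-sweep").getOrCreate()
    rdd = spark.sparkContext.parallelize(configs, numSlices=max(1, len(configs)))
    return rdd.map(measure).collect()


if __name__ == "__main__":
    configs = build_configs([2, 4], [3, 6], repetitions=2)
    # Local run without Spark; call run_on_spark(configs) on a cluster instead.
    for row in map(measure, configs):
        print(row)
```

Because each configuration is measured independently, the sweep maps naturally onto one Spark task per configuration. On a Dataproc cluster, `run_on_spark(configs)` would replace the local `map` call.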

Sequence for running the code:

Open and run the project.ipynb file. The Google Cloud console is an appropriate environment for performing the tasks. Also make sure your Google account has a GPU allocation.
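
Before running the multi-GPU sections, it can help to confirm that TensorFlow actually sees the allocated GPUs. The sketch below is an assumption about how such a check might look, not code from the notebook. The helper names `make_strategy` and `global_batch_size` are hypothetical, and `tf.distribute.MirroredStrategy` is used only as one common data-parallel approach; the notebook may test others.

```python
def global_batch_size(per_replica_batch_size, num_replicas):
    """Batch size seen by the whole job when each replica gets its own batch."""
    return per_replica_batch_size * max(1, num_replicas)


def make_strategy():
    """Pick a distribution strategy based on the GPUs TensorFlow can see."""
    # Imported lazily so global_batch_size works without TensorFlow installed.
    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    if len(gpus) > 1:
        # Synchronous data parallelism across all visible GPUs on one machine.
        return tf.distribute.MirroredStrategy()
    # Zero or one GPU: fall back to TensorFlow's default (single-replica) strategy.
    return tf.distribute.get_strategy()
```

A typical use would be to call `strategy = make_strategy()`, build and compile the Keras model inside `with strategy.scope():`, and size the input batches with `global_batch_size(64, strategy.num_replicas_in_sync)`, where 64 is an arbitrary example value.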
