Big-Data-Analytics

The purpose of this project was to familiarize us with the basic steps of applying data mining techniques: collection, preprocessing/cleaning, conversion, application of data mining techniques, and evaluation. It was implemented in Python using the scikit-learn and Keras libraries. The assignment consists of two (2) tasks: classification and duplicate detection.
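As a rough illustration of those steps on the duplicate-detection task, here is a minimal sketch. It is not the code used in this repository: the toy data, the function names, the Jaccard-similarity approach, and the 0.6 threshold are all assumptions made for the example, and it uses only the Python standard library rather than scikit-learn or Keras so that it stays self-contained.

```python
import re
from itertools import combinations

# Hypothetical toy data: (id, text). The real assignment data is not shown here.
QUESTIONS = [
    (0, "How do I learn Python quickly?"),
    (1, "How do I quickly learn Python"),
    (2, "What is the capital of France?"),
    (3, "Which city is the capital of France?"),
    (4, "How can I bake bread at home?"),
]
# Hypothetical ground truth: pairs of ids that are considered duplicates.
TRUE_DUPLICATES = {(0, 1), (2, 3)}


def preprocess(text):
    """Cleaning: lowercase and replace anything but letters, digits, spaces."""
    return re.sub(r"[^a-z0-9 ]+", " ", text.lower())


def to_token_set(text):
    """Conversion: represent a document as the set of its word tokens."""
    return frozenset(preprocess(text).split())


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_duplicates(docs, threshold):
    """Application: flag every pair whose similarity reaches the threshold."""
    token_sets = {doc_id: to_token_set(text) for doc_id, text in docs}
    return {
        (i, j)
        for i, j in combinations(sorted(token_sets), 2)
        if jaccard(token_sets[i], token_sets[j]) >= threshold
    }


def precision_recall(predicted, actual):
    """Evaluation: precision and recall of the predicted duplicate pairs."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


predicted = find_duplicates(QUESTIONS, threshold=0.6)
precision, recall = precision_recall(predicted, TRUE_DUPLICATES)
print(sorted(predicted), precision, recall)
```

Comparing every pair is quadratic in the number of documents, so this only works for small inputs. On a larger dataset an approximate method such as MinHash with locality-sensitive hashing would be a more realistic choice.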

The assignment instructions are available in BigData-2020-2021-english.pdf.

Two (2) separate competitions were created on the Kaggle platform for the assignment, one per task.

https://www.kaggle.com/c/bigdata2021duplicatedetection/leaderboard

https://www.kaggle.com/c/bigdata2021classification/leaderboard