This project develops autoencoder-based deep learning models for anomaly detection. An autoencoder is a neural network trained to compress its input into a lower-dimensional representation and then reconstruct it. A model trained on normal data tends to reconstruct unusual records poorly, so a high reconstruction error can serve as an anomaly signal. The project also involves deploying the trained model as an API using Flask.
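To make the idea of scoring anomalies by reconstruction error concrete, here is a minimal sketch using NumPy only. The function names (`reconstruction_error`, `fit_threshold`, `flag_anomalies`) and the percentile-based cutoff are illustrative assumptions, not taken from the project's code; the reconstructions would normally come from the trained autoencoder.

```python
import numpy as np


def reconstruction_error(x, x_hat):
    """Per-row mean squared error between inputs and their reconstructions."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    return np.mean((x - x_hat) ** 2, axis=1)


def fit_threshold(normal_errors, percentile=99.0):
    """Choose a cutoff from errors measured on normal data (assumed heuristic)."""
    return float(np.percentile(np.asarray(normal_errors, dtype=float), percentile))


def flag_anomalies(errors, threshold):
    """Mark rows whose reconstruction error exceeds the cutoff."""
    return np.asarray(errors, dtype=float) > threshold


# Toy data: the third row is reconstructed much worse than the first two.
x = np.array([[0.1, 0.2], [0.4, 0.5], [0.9, 0.1]])
x_hat = np.array([[0.1, 0.2], [0.4, 0.6], [0.1, 0.9]])
errors = reconstruction_error(x, x_hat)
flags = flag_anomalies(errors, threshold=0.1)  # 0.1 is an arbitrary demo cutoff
```

The percentile is a tuning choice: a lower value flags more transactions for review, a higher one flags fewer.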
The primary objectives of this project include:
- Developing an autoencoder-based deep learning model that learns the pattern of normal transactions for anomaly detection.
- Deploying the model as an API using Flask for real-time anomaly detection.
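As a rough illustration of the deployment objective, the sketch below exposes a scoring endpoint with Flask. The route name `/predict`, the JSON field `features`, the constants `N_FEATURES` and `THRESHOLD`, and the `reconstruct` placeholder are all hypothetical; in the real service `reconstruct` would call the trained autoencoder.

```python
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical values: the real ones would come from the trained model and
# from reconstruction errors measured on normal transactions.
N_FEATURES = 3
THRESHOLD = 0.05


def reconstruct(batch):
    """Placeholder for model.predict(batch); returns a constant 'typical' row."""
    return np.full_like(batch, 0.5)


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True)
    features = payload.get("features") if isinstance(payload, dict) else None
    if not isinstance(features, list) or len(features) != N_FEATURES:
        return jsonify(error=f"expected 'features' with {N_FEATURES} numbers"), 400
    try:
        batch = np.asarray([features], dtype=float)
    except (TypeError, ValueError):
        return jsonify(error="features must be numeric"), 400
    # Score the single transaction by its mean squared reconstruction error.
    error = float(np.mean((batch - reconstruct(batch)) ** 2))
    return jsonify(reconstruction_error=error, is_anomaly=error > THRESHOLD)


if __name__ == "__main__":
    app.run(port=5000)  # Flask development server, for local testing only
```

Assuming the file is saved as `app.py`, the same application object could be served with Gunicorn via `gunicorn app:app`.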
The dataset used in this project is a transaction dataset containing information on more than 100,000 transactions, each characterized by several features. This data serves as the foundation for training and testing the deep autoencoder model.
- Language: Python
- Packages: Pandas, NumPy, Matplotlib, Keras, TensorFlow
- API Service: Flask, Gunicorn
The project follows a structured approach:
- Understand the business objective and the importance of anomaly detection.
- Perform exploratory data analysis (EDA) to gain insights into the dataset.
- Normalize and clean the data, addressing any missing values through imputation.
- Delve into the theory behind autoencoders and their architecture.
- Build a base autoencoder model using the Keras library.
- Fine-tune the model to extract the best performance for anomaly detection.
- Make predictions using the trained model to identify anomalies.
- Serve the model as an API endpoint using Flask, enabling real-time anomaly detection.
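The preprocessing, model-building, and prediction steps above can be sketched as follows. This is a minimal example under stated assumptions, not the project's actual pipeline: the data is synthetic, the helper names `impute_and_scale` and `build_autoencoder` are hypothetical, and the layer sizes, median imputation, min-max scaling, and 99th-percentile cutoff are illustrative choices.

```python
import numpy as np
from tensorflow import keras


def impute_and_scale(x):
    """Fill NaNs with column medians, then min-max scale each column to [0, 1]."""
    x = np.array(x, dtype=float)
    medians = np.nanmedian(x, axis=0)
    rows, cols = np.where(np.isnan(x))
    x[rows, cols] = medians[cols]
    lo, hi = x.min(axis=0), x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero on constant columns
    return (x - lo) / span


def build_autoencoder(n_features, bottleneck=2):
    """Small symmetric autoencoder: encoder narrows to a bottleneck, decoder mirrors it."""
    inputs = keras.Input(shape=(n_features,))
    encoded = keras.layers.Dense(8, activation="relu")(inputs)
    encoded = keras.layers.Dense(bottleneck, activation="relu")(encoded)
    decoded = keras.layers.Dense(8, activation="relu")(encoded)
    outputs = keras.layers.Dense(n_features, activation="sigmoid")(decoded)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")
    return model


# Synthetic stand-in for the transaction features, with a few missing values.
rng = np.random.default_rng(0)
raw = rng.normal(size=(256, 6))
raw[rng.random(raw.shape) < 0.02] = np.nan
x_train = impute_and_scale(raw)

# The autoencoder is trained to reproduce its own input.
model = build_autoencoder(n_features=x_train.shape[1])
model.fit(x_train, x_train, epochs=5, batch_size=32, verbose=0)

# Score each row by reconstruction error and flag the highest-error rows.
reconstructed = model.predict(x_train, verbose=0)
errors = np.mean((x_train - reconstructed) ** 2, axis=1)
threshold = np.percentile(errors, 99)
anomalies = errors > threshold
```

In a real deployment the imputation medians and scaling ranges would be computed once on the training data and reused unchanged at prediction time, so that new transactions are transformed the same way the model saw during training.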
- input: Contains the dataset files used for analysis (e.g., final_cred_data.csv, Test-data.csv).
- src: The heart of the project, this folder contains modularized code for various steps, including data preprocessing, model building, and deployment. It consists of the ML_pipeline and engine.py files, each containing functions for different functionalities.
- output: Contains pre-trained models saved as .pkl files. These models can be conveniently loaded and used without the need for retraining.
- lib: A reference folder with the original IPython notebook.
- requirements.txt: Lists all required libraries and their versions for easy installation using pip.