This repository contains the code for the paper "SLA Violation Prediction In Cloud Computing: A Machine Learning Perspective": https://arxiv.org/pdf/1611.10338.pdf
Cloud computing reduces the maintenance costs of services and allows users to access on-demand services without being involved in technical implementation details. The relationship between a cloud provider and a customer is governed by a Service Level Agreement (SLA), which defines the level of the service and its associated costs. An SLA usually contains specific parameters and a minimum level of quality for each element of the service, negotiated between the cloud provider and the customer. A failure to provide the service at the agreed level is called an SLA violation.
From the provider's point of view, violation prediction is an essential task because penalties have to be paid whenever an SLA is violated. By predicting violations, the provider can reallocate requests and prevent the violation. From the customer's point of view, predicted future violations can serve as an indicator of the provider's trustworthiness. The customer also expects to receive the service on demand and without interruptions. Despite high availability rates, violations do happen in the real world and have imposed heavy costs on both providers and customers. Thus, being able to predict SLA violations benefits both sides.
To tackle this problem, one can use machine learning models to predict violations. The violation prediction task can be framed as a classification problem: a classifier predicts whether an incoming request will be violated or not. In this work, we explore two machine learning models, Naive Bayes and Random Forest classifiers, to predict SLA violations. Unlike previous work on SLA violation prediction or avoidance, our models are trained on a real-world dataset, which introduces challenges that earlier work has neglected. We test our models on the Google Cloud Cluster trace. This dataset contains a 29-day trace of Google's Cloud Compute and was published in 2011.
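The classification framing above can be sketched as follows. This is a minimal illustration, not the code from this repository: it assumes scikit-learn is installed, uses synthetic data from `make_classification` in place of the Google trace, and the function name `evaluate_classifiers`, the 2% violation rate, the feature counts, and the hyperparameters are all hypothetical choices. A trivial baseline that always predicts "no violation" is included for comparison.

```python
# Illustrative sketch only: synthetic data stands in for the Google cluster
# trace; feature counts, class ratio, and hyperparameters are assumptions.
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB


def evaluate_classifiers(n_samples=5000, violation_rate=0.02, seed=0):
    """Train three classifiers on synthetic imbalanced data; return their scores."""
    # Label 1 = violated request (rare), label 0 = fulfilled request.
    X, y = make_classification(
        n_samples=n_samples,
        n_features=10,
        n_informative=5,
        weights=[1.0 - violation_rate, violation_rate],
        random_state=seed,
    )
    # Stratify so the rare class keeps a similar ratio in train and test splits.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    models = {
        "always_no_violation": DummyClassifier(strategy="most_frequent"),
        "naive_bayes": GaussianNB(),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        scores[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            # zero_division=0 avoids a warning when no violation is predicted.
            "f1": f1_score(y_test, y_pred, zero_division=0),
        }
    return scores


if __name__ == "__main__":
    for name, s in evaluate_classifiers().items():
        print(f"{name}: accuracy={s['accuracy']:.4f} f1={s['f1']:.4f}")
```

In this setup the always-negative baseline reaches high accuracy while its F1 score on the violation class is zero, which is why accuracy alone says little when violations are a small minority of requests.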
Since SLA violations are rare events in the real world (
We demonstrate that Random Forest with the SMOTE-ENN re-sampling technique achieves the best performance among the methods tested, with an accuracy of 0.9988 and
If you use this code, please cite the paper:
@article{hemmat2016sla,
title={SLA violation prediction in cloud computing: A machine learning perspective},
author={Hemmat, Reyhane Askari and Hafid, Abdelhakim},
journal={arXiv preprint arXiv:1611.10338},
year={2016}
}