CobamasSensorOD is a framework for creating, training and visualizing an autoencoder on sequential multivariate data. It builds on PyTorch and provides a predefined model architecture and training loop. The framework was created to analyse sensor data from wind turbines and detect anomalous behaviour.
The predefined training loop requires a `torch.utils.data.Dataset`. For your own data you need to implement your own dataset; the only requirement is that each output sample is a 2d tensor of shape `(time_step, feature)`.
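As an illustration, a minimal custom dataset might slide a fixed-length window over one long recording. The class name and windowing scheme below are assumptions for the sake of the example; only the `(time_step, feature)` output shape is actually required:

```python
import torch
from torch.utils.data import Dataset

class SlidingWindowDataset(Dataset):
    """Hypothetical example: cuts one long (n_samples, n_features)
    recording into fixed-length windows of shape (time_step, feature)."""

    def __init__(self, data: torch.Tensor, seq_len: int):
        self.data = data.float()  # (n_samples, n_features)
        self.seq_len = seq_len

    def __len__(self):
        return self.data.shape[0] - self.seq_len + 1

    def __getitem__(self, idx):
        # 2d tensor of shape (time_step, feature), as the training loop expects
        return self.data[idx : idx + self.seq_len]
```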
The encoder consists of two parts: a 1d-convolution layer, where each feature is treated as one channel, followed by an LSTM layer that creates the embedding. The convolution layer shortens the input sequence and generates new features as additional channels; its output is the input of the LSTM layer.
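The standalone sketch below (plain PyTorch, not the framework's actual code) illustrates that idea: a strided `Conv1d` shortens the sequence while widening the channels, and an `LSTM` turns the shortened sequence into a fixed-size embedding. All concrete sizes are just example values:

```python
import torch
import torch.nn as nn

seq_len, n_features = 300, 4
x = torch.randn(1, seq_len, n_features)  # (batch, time_step, feature)

# Conv1d expects (batch, channel, time): each feature is one channel.
conv = nn.Conv1d(in_channels=4, out_channels=8, kernel_size=7, stride=3)
h = conv(x.permute(0, 2, 1))
print(h.shape)  # torch.Size([1, 8, 98]): shorter sequence, more channels

# The LSTM consumes the conv output as a (shorter) sequence of 8-dim steps.
lstm = nn.LSTM(input_size=8, hidden_size=32, batch_first=True)
_, (h_n, _) = lstm(h.permute(0, 2, 1))
embedding = h_n[-1]
print(embedding.shape)  # torch.Size([1, 32]): one embedding per sample
```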
The model can be used as a `Model` object with predefined methods for loading, saving, training and evaluation.
```python
from Model import Model
from ModelFactory import ModelFactory

mf = ModelFactory(seq_len=300, n_features=4, path="run")
model = mf.get_model(
    h_conv_channel=[8, 12],
    kernel=7,
    kernel_stride=[3, 1],
    embedding_dim=32,
    n_lstm=2
)
```
Alternatively, the model can be used as a PyTorch module directly:
```python
from models.Conv1dLSTMAutoencoder import Conv1dLSTMAutoencoder

model = Conv1dLSTMAutoencoder(
    seq_len=300,
    in_channel=4,
    h_conv_channel=[8, 12],
    kernel_size=7,
    stride=[3, 1],
    embedding_dim=32,
    n_lstm_layer=2
)
```
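For a quick sanity check you can push a dummy batch through the module. The `(batch, time_step, feature)` input layout and the reconstruction output are assumptions based on the dataset convention above, not documented behaviour:

```python
import torch

x = torch.randn(16, 300, 4)  # dummy batch: (batch, time_step, feature)
with torch.no_grad():
    x_hat = model(x)  # assumed: the autoencoder returns a reconstruction of x
print(x_hat.shape)    # expected to match the input shape
```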
**ModelFactory**

- `seq_len` (int): Number of time_steps per sample in the dataset. Equal to `seq_len` of the PyTorch module.
- `n_features` (int): Number of features per time_step per sample in the dataset. Equal to `in_channel` of the PyTorch module; defines the number of input channels for the first conv1d layer.
- `path` (str): Root directory for every file-based function such as loading/saving.

**Model**

- `h_conv_channel` (list or int): Number of output channels for each conv1d layer. If a list of length n is given, n-1 hidden conv1d layers are added to the model.
- `kernel` (list or int): Kernel size for each conv1d layer.
- `kernel_stride` (list or int): Kernel stride for each conv1d layer.
- `n_lstm` (int): Number of LSTM layers. Must be >= 1.
- `embedding_dim` (int): Dimensionality of the embedding and thus of the hidden state of the LSTM layer.
Using the predefined `Model` class, training can be started by calling the `train` method with a `torch.utils.data.Dataset`:
```python
model.train(dataset=dataset, n_epochs=10, learning_rate=0.001, batch_size=100, verbose=True)
```
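Because the framework targets anomaly detection, a common next step is to score samples by reconstruction error. The sketch below assumes the trained model can be called on a batch as shown earlier; this forward call is an assumption, not a documented API:

```python
import torch

with torch.no_grad():
    sample = dataset[0].unsqueeze(0)  # (1, time_step, feature)
    reconstruction = model(sample)    # assumed forward call
    # Mean squared reconstruction error as an anomaly score:
    # high values suggest the sample deviates from learned behaviour.
    score = torch.mean((reconstruction - sample) ** 2).item()
print(score)
```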
The model can be saved via the `save` method of the `Model` class and loaded via the `load_model` method of the `ModelFactory` class. The model is saved in a folder derived from the model parameters, inside the directory given by the `path` parameter of the `ModelFactory` class.
```python
mf = ModelFactory(seq_len=300, n_features=4, path="run")
model = mf.get_model(h_conv_channel=8, kernel=7, kernel_stride=3, embedding_dim=32, n_lstm=1)
model.save()

model = mf.load_model(h_conv_channel=8, kernel=7, kernel_stride=3, embedding_dim=32, n_lstm=1)
```