This is the code for the research work published as "Learning under Feature Drifts in Textual Streams". The method enhances Multinomial Naive Bayes (MNB) classifiers for feature-evolving streams. It consists of two components: a sketch that adaptively selects important features, and an ensemble that predicts feature values by aggregating the predictions of experts, each modeling a distinct temporal trend. This work was presented at CIKM 2018 in Torino, Italy.
@inproceedings{melidis2018learning,
title={Learning under feature drifts in textual streams},
author={Melidis, Damianos P and Spiliopoulou, Myra and Ntoutsi, Eirini},
booktitle={Proceedings of the 27th ACM International Conference on Information and Knowledge Management},
pages={527--536},
year={2018}
}
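To give a feel for the ensemble idea described above, here is a minimal, self-contained Java sketch. It is not the implementation in this repository: the class name `FeatureTrendEnsembleSketch`, the three experts (last value, moving average, linear trend), the window size, and the unweighted mean used for aggregation are all illustrative assumptions. The experts and aggregation scheme used in the paper may differ.

```java
// Illustrative sketch only: all names and expert choices here are hypothetical,
// not taken from the repository or the paper.
public class FeatureTrendEnsembleSketch {

    // Expert 1 (assumed): the feature count stays at its last observed value.
    public static double lastValue(double[] history) {
        return history[history.length - 1];
    }

    // Expert 2 (assumed): mean of the last `window` observations.
    public static double movingAverage(double[] history, int window) {
        int start = Math.max(0, history.length - window);
        double sum = 0.0;
        for (int i = start; i < history.length; i++) {
            sum += history[i];
        }
        return sum / (history.length - start);
    }

    // Expert 3 (assumed): least-squares line over time steps 0..n-1,
    // extrapolated one step ahead to time step n.
    public static double linearTrend(double[] history) {
        int n = history.length;
        if (n < 2) {
            return history[n - 1];
        }
        double meanT = (n - 1) / 2.0;
        double meanY = 0.0;
        for (double y : history) {
            meanY += y;
        }
        meanY /= n;
        double cov = 0.0;
        double var = 0.0;
        for (int i = 0; i < n; i++) {
            cov += (i - meanT) * (history[i] - meanY);
            var += (i - meanT) * (i - meanT);
        }
        double slope = cov / var;
        return meanY + slope * (n - meanT);
    }

    // Aggregate the experts with an unweighted mean (an assumption);
    // the result is clamped at zero because counts cannot be negative.
    public static double predictNext(double[] history) {
        if (history.length == 0) {
            throw new IllegalArgumentException("history must not be empty");
        }
        double[] predictions = {
            lastValue(history),
            movingAverage(history, 3),
            linearTrend(history)
        };
        double sum = 0.0;
        for (double p : predictions) {
            sum += p;
        }
        return Math.max(0.0, sum / predictions.length);
    }

    public static void main(String[] args) {
        // Hypothetical per-window counts of one word in the stream.
        double[] counts = {2, 4, 6, 8};
        System.out.println(predictNext(counts));
    }
}
```

Each expert captures a different assumption about how a word's frequency evolves over time, so the aggregate is less sensitive to any single trend model being wrong.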
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
- Download a data set and preprocess it
- Load the data set into a MySQL database
- You will need the MOA API (https://www.cs.waikato.ac.nz/~abifet/MOA/API/index.html)
- Use your favourite IDE and build the project with the existing Maven pom
- Understand the options of the method by checking code/ensemble/commandLineOptions.txt
- Build your ensembleWA.jar
- Then run:
java -classpath /foo/bar/ensembleWA.jar de.l3s.oscar.Main --verbose true --run_mode EvaluateOffline --collection_location /path2/commandLineOptions.txt --saved_db_title tweets140 --short_text true --learning_algorithm mnb --evaluation_scheme prequential --root_output_directory /path2/output
- IntelliJ IDEA - Java IDE
- Maven - Dependency Management
- [MOA](https://moa.cms.waikato.ac.nz/) - Massive Online Analysis
- [java-timeseries](https://github.com/signaflo/java-timeseries) - Time Series Analysis in Java
- Damianos P. Melidis - Idea and Implementation - damianosmel
This project is licensed under the GNU General Public License v3.0; see the LICENSE.txt file for details.
- Jan-Hendrik Zab, Emmanouil Gkatzourias and Maximilian Idahl for active discussion on bugs and features
- Inspiration and help by the work of Dr. Luis Moreira Matias
- Great help by Jacob Rachiele with his java-timeseries library
- Funding by DFG OSCAR and ERC ALEXANDRIA projects