A novel framework for concept drift detection using autoencoders for classification problems in data streams
Faculty / School
School of Mathematics and Computer Science (SMCS)
Department
Department of Computer Science
Was this content written or created while at IBA?
Yes
Document Type
Article
Source Publication
International Journal of Machine Learning and Cybernetics
ISSN
1868-8071
Keywords
Concept drift, Machine learning, Autoencoder, Data stream, Deep learning
Disciplines
Artificial Intelligence and Robotics | Data Science | Other Computer Sciences
Abstract
In streaming data environments, data characteristics and probability distributions are likely to change over time, causing a phenomenon called concept drift, which makes it difficult for machine learning models to predict accurately. In such non-stationary environments, concept drift must be detected and the model updated to maintain acceptable predictive performance. Existing approaches to drift detection have inherent problems, such as the requirement for truth labels in supervised detection methods and a high false positive rate in unsupervised drift detection. In this paper, we propose a semi-supervised Autoencoder-based Drift Detection Method (AEDDM) aimed at detecting drift without the need for truth labels, yet with high confidence that the detected drift is real. In a binary classification setting, AEDDM uses two autoencoders in a layered architecture, trained on labelled data, and applies a thresholding mechanism based on reconstruction error to signal the presence of drift. The proposed method has been evaluated on four synthetic and four real-world datasets with different drifting scenarios. For the real-world datasets, the induced and detected drifts have been evaluated from the viewpoint of classifier performance using seven commonly used batch classifiers, as well as from an adaptation perspective in an online learning environment using a Hoeffding Tree classifier. The results show that AEDDM effectively detects the distributional changes in data that are most likely to impact the classifier’s performance (real drift) while ignoring virtual drift, thus considerably reducing false alarms while retaining the ability to adapt in terms of classification performance.
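The thresholding idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' AEDDM implementation: it uses a single hypothetical stand-in for a trained autoencoder (`toy_reconstruct`, which simply projects 2-D points onto the line x0 = x1) rather than two autoencoders in a layered architecture. The threshold rule (mean plus `k` standard deviations of the reference errors) and the `max_fraction` limit are assumptions chosen only for illustration.

```python
import random
import statistics
from typing import Callable, List, Sequence

Vector = Sequence[float]
Reconstructor = Callable[[Vector], Vector]


def reconstruction_error(x: Vector, reconstruct: Reconstructor) -> float:
    """Mean squared error between an instance and its reconstruction."""
    x_hat = reconstruct(x)
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(reference: List[Vector], reconstruct: Reconstructor,
                  k: float = 3.0) -> float:
    """Assumed rule: mean + k * std of the errors on reference data."""
    errors = [reconstruction_error(x, reconstruct) for x in reference]
    return statistics.mean(errors) + k * statistics.pstdev(errors)


def drift_detected(batch: List[Vector], reconstruct: Reconstructor,
                   threshold: float, max_fraction: float = 0.2) -> bool:
    """Signal drift when too many instances exceed the error threshold."""
    exceed = sum(reconstruction_error(x, reconstruct) > threshold for x in batch)
    return exceed / len(batch) > max_fraction


def toy_reconstruct(x: Vector) -> Vector:
    """Hypothetical stand-in for a trained autoencoder (projection onto x0 == x1)."""
    m = (x[0] + x[1]) / 2.0
    return [m, m]


def make_batch(rng: random.Random, n: int, drifted: bool) -> List[Vector]:
    """Synthetic data: near the line x1 = x0, or near x1 = 1 - x0 after drift."""
    batch = []
    for _ in range(n):
        t = rng.uniform(0.0, 1.0)
        noise = rng.gauss(0.0, 0.05)
        batch.append([t, (1.0 - t if drifted else t) + noise])
    return batch


rng = random.Random(0)
threshold = fit_threshold(make_batch(rng, 500, False), toy_reconstruct)
stable_flag = drift_detected(make_batch(rng, 200, False), toy_reconstruct, threshold)
drift_flag = drift_detected(make_batch(rng, 200, True), toy_reconstruct, threshold)
print(stable_flag, drift_flag)
```

In practice the `reconstruct` callable would wrap a trained neural autoencoder, and the threshold and fraction would be tuned on held-out data; the sketch only shows how reconstruction error on new batches can be compared against a reference-derived threshold without access to labels.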
Indexing Information
HJRS - W Category, Scopus, Web of Science - Emerging Sources Citation Index (ESCI), Web of Science - Science Citation Index Expanded (SCI)
Journal Quality Ranking
2.7, Q2, Indexed
Recommended Citation
Ali, U., & Mahmood, T. (2024). A novel framework for concept drift detection using autoencoders for classification problems in data streams. International Journal of Machine Learning and Cybernetics, 16, 397-418. Retrieved from https://ir.iba.edu.pk/faculty-research-articles/253
Publication Status
Published
Rights Information
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
