A novel framework for concept drift detection using autoencoders for classification problems in data streams

Author Affiliation

  • Usman Ali is a PhD Scholar at IBA Karachi
  • Dr. Tariq Mahmood is a Professor at the Department of Computer Science, IBA Karachi

Faculty / School

School of Mathematics and Computer Science (SMCS)

Department

Department of Computer Science

Was this content written or created while at IBA?

Yes

Document Type

Article

Source Publication

International Journal of Machine Learning and Cybernetics

ISSN

1868-8071

Keywords

Concept drift, Machine learning, Autoencoder, Data stream, Deep learning

Disciplines

Artificial Intelligence and Robotics | Data Science | Other Computer Sciences

Abstract

In streaming data environments, data characteristics and probability distributions are likely to change over time, causing a phenomenon called concept drift, which makes it difficult for machine learning models to predict accurately. In such non-stationary environments, concept drift must be detected and the model updated to maintain acceptable predictive performance. Existing approaches to drift detection have inherent problems: supervised detection methods require truth labels, and unsupervised drift detection suffers from a high false positive rate. In this paper, we propose a semi-supervised Autoencoder-based Drift Detection Method (AEDDM) aimed at detecting drift without the need for truth labels, yet with high confidence that the detected drift is real. In a binary classification setting, AEDDM uses two autoencoders in a layered architecture, trained on labelled data, and applies a thresholding mechanism based on reconstruction error to signal the presence of drift. The proposed method has been evaluated on four synthetic and four real-world datasets with different drifting scenarios. For the real-world datasets, the induced and detected drifts have been evaluated from the viewpoint of classifier performance using seven widely used batch classifiers, as well as from an adaptation perspective in an online learning environment using a Hoeffding Tree classifier. The results show that AEDDM effectively detects the distributional changes in data that are most likely to impact the classifier’s performance (real drift) while ignoring virtual drift, thus considerably reducing false alarms while retaining the ability to adapt in terms of classification performance.
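
The thresholding idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the AEDDM implementation: the abstract does not specify the two-autoencoder architecture, the threshold rule, or the batch decision rule, so everything below is an assumption made for illustration. In particular, `toy_reconstruct` is a hypothetical stand-in for a trained autoencoder, and the "mean plus k standard deviations" threshold and the `max_fraction` rule are illustrative choices only.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence

Vector = Sequence[float]


def reconstruction_error(x: Vector, reconstruct: Callable[[Vector], Vector]) -> float:
    """Mean squared error between an instance and its reconstruction."""
    x_hat = reconstruct(x)
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(reference_errors: Sequence[float], k: float = 3.0) -> float:
    """Assumed rule: mean + k standard deviations of the reference errors."""
    return mean(reference_errors) + k * pstdev(reference_errors)


def drift_signalled(batch_errors: Sequence[float], threshold: float,
                    max_fraction: float = 0.5) -> bool:
    """Assumed rule: signal drift when more than `max_fraction` of the
    batch has a reconstruction error above the threshold."""
    exceed = sum(e > threshold for e in batch_errors)
    return exceed / len(batch_errors) > max_fraction


def toy_reconstruct(x: Vector) -> List[float]:
    """Hypothetical stand-in for a trained autoencoder: it can only
    reproduce points on the line x0 == x1 (a one-dimensional bottleneck)."""
    m = (x[0] + x[1]) / 2
    return [m, m]


# Reference data resembles what the stand-in model was "trained" on.
reference = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.1], [4.0, 4.0]]
# A later batch in which most points no longer follow x0 == x1.
drifted = [[1.0, 3.0], [2.0, 5.0], [0.0, 4.0], [3.0, 3.1]]

ref_errors = [reconstruction_error(x, toy_reconstruct) for x in reference]
drift_errors = [reconstruction_error(x, toy_reconstruct) for x in drifted]
threshold = fit_threshold(ref_errors)

print(drift_signalled(ref_errors, threshold))    # reference batch
print(drift_signalled(drift_errors, threshold))  # drifted batch
```

In this toy setup the reference batch stays below the threshold while three of the four drifted points exceed it, so only the second batch is flagged. A real detector would replace `toy_reconstruct` with a trained model and tune the threshold on held-out data.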

Indexing Information

HJRS - W Category, Scopus, Web of Science - Emerging Sources Citation Index (ESCI), Web of Science - Science Citation Index Expanded (SCI)

Journal Quality Ranking

2.7, Q2, Indexed

Publication Status

Published

Rights Information

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

This document is currently not available here.
