Degree

Doctor of Philosophy in Computer Science

Faculty / School

School of Mathematics and Computer Science (SMCS)

Department

Department of Computer Science

Date of Award

Spring 2026

Advisor

Dr. Tariq Mahmood, Professor, School of Mathematics and Computer Science (SMCS)

Committee Member 1

Dr. Sohail Asghar, Examiner – I, COMSATS Islamabad

Committee Member 2

Dr. Ahmer Rashid, Examiner – II, GIKI

Committee Member 3

Dr. Muhammad Atif Tahir, Program Coordinator, Graduate & Postgraduate Programs (CS), IBA Karachi

Project Type

Dissertation

Access Type

Restricted Access

Document Version

Final

Pages

xiv, 188

Keywords

Concept Drift, Machine Learning, Autoencoder, Data Streams, Deep Learning, Drift Detection, Drift Adaptation

Subjects

Artificial Intelligence, Computer Science, Data Science

Abstract

This research addresses the problem of unsupervised concept drift detection, i.e., detecting drift without ground-truth labels while keeping false alarms low. To this end, we developed an autoencoder-based drift detection framework (which can be followed by any standard drift adaptation mechanism) for machine-learning-based classification problems in data streams. In streaming data environments, data characteristics and probability distributions are likely to change over time, a phenomenon called concept drift that makes it difficult for machine learning models to predict accurately. In such non-stationary environments, concept drift must be detected and the model updated to maintain acceptable predictive performance. Existing approaches to drift detection have inherent problems: supervised detection methods require ground-truth labels, and unsupervised detection methods suffer from a high false positive rate. This research presents a novel semi-supervised Autoencoder-based Drift Detection Method (AEDDM) that detects drift without ground-truth labels and with fewer false alarms.

The AEDDM method works in batch mode and has three architectural components: an offline component (training phase), in which two autoencoders are trained on labelled data to learn the data distribution of each class, and two thresholds, the batch threshold and the count threshold, are computed from the reconstruction errors on the validation data; an ensemble component, which defines the sequential order of the autoencoders; and an online component, in which data arrives in batches and drift detection is performed for each batch as a whole by comparing changes in reconstruction loss with the thresholds learned in the offline training phase.
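The thresholding idea described above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the abstract does not say how the batch and count thresholds are derived or combined, so the mean-plus-k-standard-deviations rule, the per-instance cutoff, the "both conditions must hold" decision, and all function and parameter names below are assumptions made for illustration. The reconstruction errors are taken as given (for example, produced by already trained autoencoders).

```python
from statistics import mean, stdev


def fit_thresholds(validation_batches, k=3.0):
    """Offline phase (hypothetical rule): derive thresholds from the
    reconstruction errors of drift-free validation batches.

    validation_batches: list of lists of per-instance reconstruction errors.
    Assumption: thresholds are mean + k * standard deviation; the source
    does not specify the actual formulas.
    """
    all_errors = [e for batch in validation_batches for e in batch]
    # Per-instance cutoff used only to count "poorly reconstructed" instances.
    instance_cutoff = mean(all_errors) + k * stdev(all_errors)

    # Batch threshold: upper bound on the average error of a normal batch.
    batch_means = [mean(batch) for batch in validation_batches]
    batch_threshold = mean(batch_means) + k * stdev(batch_means)

    # Count threshold: most cutoff violations seen in any validation batch.
    count_threshold = max(
        sum(e > instance_cutoff for e in batch) for batch in validation_batches
    )
    return batch_threshold, instance_cutoff, count_threshold


def detect_drift(batch_errors, batch_threshold, instance_cutoff, count_threshold):
    """Online phase (hypothetical rule): flag drift for a whole unlabelled
    batch only when both the batch-level and the count-level checks fire."""
    exceed_count = sum(e > instance_cutoff for e in batch_errors)
    return mean(batch_errors) > batch_threshold and exceed_count > count_threshold


# Made-up reconstruction errors, for illustration only.
validation = [
    [0.10, 0.12, 0.11, 0.09],
    [0.11, 0.10, 0.13, 0.12],
    [0.09, 0.10, 0.11, 0.10],
]
thresholds = fit_thresholds(validation)
stable = detect_drift([0.10, 0.11, 0.12, 0.10], *thresholds)
drifted = detect_drift([0.40, 0.35, 0.50, 0.12], *thresholds)
```

Requiring both checks to fire is one plausible way to trade sensitivity for fewer false alarms; a different combination rule would change that trade-off.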

AEDDM is considered a semi-supervised drift detection method because its complete framework leverages both labelled and unlabelled data. Labelled training data is required for the initial training of the autoencoders during the offline phase, but no class labels are needed in the online detection phase. Thus, although drift is detected in a completely unsupervised way online, the framework as a whole is semi-supervised.

The AEDDM method was assessed on a combination of four synthetic and four real-world datasets, which exhibited both sudden and gradual changes in the data distribution. To evaluate the method's effectiveness, it was tested with seven popular batch classifiers and a Hoeffding Tree classifier in an online learning setting. The results indicate that AEDDM accurately identifies distributional changes that are likely to degrade classifier performance (real drift), while disregarding irrelevant changes (virtual drift). AEDDM detected drift with zero delay in 6 out of 8 datasets and with a delay of up to 4 batches in the other 2. For real-world datasets with real drift (both sudden and gradual), AEDDM detected drift in all four datasets, while for virtual drift (both sudden and gradual) it ignored the drift in 6 out of 8 cases. In an operational scenario, AEDDM outperformed four methods, namely ADD, DD-SAPH, Prequential HT and No-Update, on all four real-world datasets; it outperformed KS-Test on three of the four datasets and shared the best rank on one. This ability to detect both sudden and gradual drifts and to distinguish between real and virtual drifts, coupled with its adaptability to changing data distributions (through adaptation), makes AEDDM a valuable tool for maintaining classifier performance in dynamic environments.

Within the field of drift detection, AEDDM is a novel and comprehensive work that leverages the power of deep learning, specifically autoencoders. It was designed around the characteristics of an ideal drift detector after a careful review of supervised, semi-supervised, unsupervised, and deep learning-based techniques. It is probably the first method that integrates the best part of each approach: the drift detected by AEDDM is real, as in supervised drift detection methods; available labelled data is fully leveraged for autoencoder training and threshold computation, as in semi-supervised drift detection methods; drift detection is performed in a completely unsupervised way, as in unsupervised drift detection methods; and the power of deep learning is harnessed to process multidimensional data, eliminating the need for feature selection or dimensionality reduction.

Recommended Citation

Ali, U. (2026). A Framework for Concept Drift Detection and Adaptation for Classification Problems in Data Streams (Unpublished doctoral dissertation). Institute of Business Administration, Pakistan. Retrieved from https://ir.iba.edu.pk/etd/98

Download

The full text of this document is only accessible to authorized users.
