Date of Submission
Fall 2023
Supervisor
Dr. Sajjad Haider, Professor, Department of Computer Science, Institute of Business Administration, Karachi
Co-Supervisor
Dr. Ramla Shahid, Professor, Department of Biochemistry, Kohsar University Murree
Committee Member 1
Dr. Tariq Mehmood, Examiner - I, Institute of Business Administration (IBA), Karachi
Committee Member 2
Dr. Imran Rauf, Examiner - II, Institute of Business Administration (IBA), Karachi
Degree
Master of Science in Data Science
Department
Department of Computer Science
Faculty/School
School of Mathematics and Computer Science (SMCS)
Keywords
Process Characterization, Quality by Design, Machine Learning, Design of Experiments
Abstract
This research highlights the importance of process characterization and quality by design (QbD) in developing bioprocesses. It evaluates the use of classical and intensified design of experiments (cDoE and iDoE) data with conventional machine learning models and sequential methods. The objective is to improve process understanding, optimize protocols, and ensure the production of high-quality biopharmaceutical products. The experiments use two datasets: a static dataset and a dataset that includes intensified fed-batch fermentations. Data preprocessing involves standardization and handling of missing values using various techniques. Seven machine learning models and three sequential models are trained on the preprocessed cDoE and iDoE datasets and assessed using the cDoE dataset. Key findings include the superior performance of conventional models, particularly the Gradient Boosting Regressor, in predicting cell dry weight (CDW). For Product Titer predictions, the Simple RNN emerged as the more effective model, underscoring the importance of capturing temporal dynamics in bioprocess data. Models trained on cDoE data showed higher accuracy in predicting CDW, while the difference was less marked for Product Titer predictions. Although cDoE datasets require more experiments, they provide richer information for model training, particularly for CDW predictions. In contrast, iDoE datasets require fewer experiments but present a trade-off in predictive accuracy for certain critical quality attributes (CQAs).
Document Type
Restricted Access
Submission Type
Thesis
Recommended Citation
Khan, M. (2023). Developing a Predictive System to model Cell Growth in Biopharmaceutical Production (Unpublished graduate thesis). Retrieved from https://ir.iba.edu.pk/etd-ms-ds/2