Student Name

Rabiya Owais


Degree

Master of Science in Data Science


Department

Department of Computer Science

Faculty/School

School of Mathematics and Computer Science (SMCS)

Date of Submission

Winter 2022


Supervisor

Dr. Tariq Mahmood, Professor, Department of Computer Science, School of Mathematics and Computer Science (SMCS)


Abstract

Data analytics is growing rapidly. Organizations generate vast amounts of data every day and need it analyzed to make business decisions. The spread of digitized solutions has led to a rapid increase in diverse datasets, which in turn creates challenges in data ingestion, storage, and structuring. The project discussed in this report is a proof of concept for real-time data ingestion with metadata monitoring using Apache Kafka and Lyft’s Amundsen. The code is written in Python and SQL. The project aims to give data scientists, data analysts, and business users information about raw data that is ingested in real time, then stored and formatted in a structured form in on-premises databases.

Document Type

Restricted Access

Submission Type

Research Project

Available for download on Saturday, October 31, 2026

The full text of this document is only accessible to authorized users.