Master of Science in Computer Science
Department of Computer Science
School of Mathematics and Computer Science (SMCS)
Date of Submission
Dr. Sajjad Haider, Professor, Department of Computer Science, School of Mathematics and Computer Science (SMCS)
Natural Language Processing, Text Summarization, Urdu Language Processing, Sentence Weight Algorithm, Weighted Term Frequency, Term Frequency Inverse Document Frequency (TF-IDF), mT5-multilingual-XLSum
Text summarization is a formidable challenge in Natural Language Processing (NLP) because producing a good summary requires precise analysis of the text, including semantic and lexical analysis. A good summary must retain the valuable information and remain concise while accounting for non-redundancy, relevance, coverage, coherence, and readability.
Considerable research, time, effort, and funding have been invested in the English language, as it is the global language of communication, but far less in low-resource languages such as Urdu. This project intends to develop an application that addresses this gap. It also provides Part-of-Speech (POS) tagging, which would help users understand the language better. Additionally, it has applications in several industries, for example, summarization of newspapers, microblogs and tweets, books, and biomedical, legal, and research documents.
Hassan, Nabeel. "Urdu Text Summarization using Machine Learning." Unpublished graduate research project. Institute of Business Administration. 2022. https://ir.iba.edu.pk/research-projects-mscs/10