Degree
Master of Science in Computer Science
Faculty / School
Faculty of Computer Sciences (FCS)
Department
Department of Computer Science
Date of Submission
2020-06-30
Advisor
Dr. Sajjad Haider, Professor and Chairperson, Department of Computer Science, Institute of Business Administration (IBA), Karachi
Project Type
MSCS Survey Report
Keywords
Named Entity Recognition, Semantic, Maximum Entropy, Computational linguistics, Personally identifiable information
Abstract
Named Entity Recognition (NER) aims to find mentions in text that belong to predefined semantic types such as person, location, organization and others. NER not only acts as a standalone tool for information extraction but also plays a vital role in natural language processing applications including, but not limited to, text understanding, information retrieval, automatic text summarization, question answering, computational linguistics and personally identifiable information (PII) discovery.
This survey summarizes some of the popular techniques developed for Named Entity Recognition. Chapter 1 gives a brief introduction to NER, Chapter 2 explores Maximum Entropy (ME) models and their use in NER, and Chapter 3 focuses on Hidden Markov Models (HMM). Chapters 4 and 5 describe neural model-based techniques, which are the most recent advancement in this field. The work in Chapters 2 to 4 focuses on the English language, while Chapter 5 discusses some of the research done on Urdu NER.
Notes
Named Entity Recognition (NER) is used to find names belonging to predefined semantic types such as person, location, organization, or custom domain-specific entities.
In this survey, we went through multiple approaches to performing NER on the English and Urdu languages. We started with the ME-based approach, explored the HMM-based approach, and finally looked at using deep learning to solve this problem. For English, existing work is abundant, but for Urdu both the prior work and the training data are limited. The gradual shift from rule-based NER approaches towards machine-learned approaches has improved both the efficacy and efficiency of NER, but some challenges remain for languages like Urdu.
Link to Catalog Record
https://ils.iba.edu.pk/cgi-bin/koha/opac-detail.pl?biblionumber=111679
iRepository Citation
Khan, Muhammad Y. "A survey on named entity recognition." Unpublished graduate research project. Institute of Business Administration. 2020. https://ir.iba.edu.pk/research-projects-mscs/21
The full text of this document is only accessible to authorized users.