Book Chapter or Conference Paper Title
Modeling POS Tagging for the Urdu Language
Faculty / School
Faculty of Computer Sciences (FCS)
Department of Computer Science
2020 International Conference on Emerging Trends in Smart Technologies, ICETST 2020
26-27 March 2020
Institute of Electrical and Electronics Engineers (IEEE)
Abstract / Description
This paper presents a part-of-speech (POS) tagger for Urdu, a low-resource language. POS tagging is a primary preprocessing step in many natural language processing tasks, such as sentiment classification, syntactic parsing, and named-entity recognition. The proposed taggers use two state-of-the-art models widely used for sequence tagging: the Conditional Random Field (CRF) and the bidirectional long short-term memory CRF (BiLSTM-CRF). This work is the first to apply a BiLSTM-CRF model to POS tagging for the Urdu language. Both models achieved an F1 score of 96% on the test data, outperforming the existing Urdu POS tagger by a significant margin.
Nasim, Z., Abidi, S., & Haider, S. (2020). Modeling POS Tagging for the Urdu Language. 2020 International Conference on Emerging Trends in Smart Technologies (ICETST 2020). IEEE. https://doi.org/10.1109/ICETST49965.2020.9080721