Ontology based semantic annotation of Urdu language web documents
Faculty / School
Faculty of Computer Sciences (FCS)
Department of Computer Science
18th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems - KES2014
15-17 September 2014
Procedia Computer Science - Elsevier BV
Abstract / Description
The proliferation of multilingual text on the Internet has increased the demand for efficient, language-independent information retrieval. Among the many languages in use, Urdu is one of the most commonly spoken and written languages in South Asia. However, because the text is in an unstructured format, accessing relevant information is still a major challenge. Semantic web technologies enable advances in information retrieval systems by assigning semantics to information. This paper presents a semantic annotation framework that can annotate documents written in the Urdu language. The framework uses a domain-specific ontology and context keywords instead of natural language processing (NLP) techniques. An experiment was conducted to evaluate the framework, using a corpus of classified ads posted in online Urdu newspapers. The purpose of this research is to identify the challenges involved in the semantic annotation of Urdu-language web documents.
Rajput, Q. (2014). Ontology based semantic annotation of Urdu language web documents. Procedia Computer Science, 35, 662-670. https://doi.org/10.1016/j.procs.2014.08.148