Ontology based semantic annotation of Urdu language web documents

Faculty / School

Faculty of Computer Sciences (FCS)

Department

Department of Computer Science

Was this content written or created while at IBA?

Yes

Document Type

Conference Paper

Publication Date

2014

Conference Name

18th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems - KES2014

Conference Location

Gdynia, Poland

Conference Dates

15-17 September 2014

ISBN/ISSN

1877-0509 (ISSN); 84924203419 (Scopus)

Volume

35

First Page

662

Last Page

670

Publisher

Procedia Computer Science - Elsevier BV

Abstract / Description

The proliferation of multilingual text on the Internet has increased the demand for efficient, language-independent information retrieval. Urdu is one of the most commonly spoken and written languages in South Asia. However, because Urdu web content is unstructured, accessing relevant information remains a major challenge. Semantic web technologies advance information retrieval systems by assigning semantics to information. This paper presents a semantic annotation framework that can annotate documents written in Urdu. The framework uses a domain-specific ontology and context keywords instead of natural language processing (NLP) techniques. An experiment was conducted to evaluate the framework, using a corpus of classified ads posted in online Urdu newspapers. The purpose of this research is to identify the challenges involved in the semantic annotation of Urdu-language web documents.

Citation/Publisher Attribution

Rajput, Q. (2014). Ontology based semantic annotation of Urdu language web documents. Procedia Computer Science, 35, 662-670.
