Technical Papers Parallel Session-V: Semantic indexing for XML documents using RDBMS

Abstract/Description

Indexing is a common technique that search engines use for fast and efficient search and retrieval, and XML search engines are no different. However, these search engines treat an XML file as a single unit, ignoring the fact that an XML document contains records in the form of semi-structured data. The hierarchical structure of XML carries a parent/child relationship, which makes two tags semantically related if they share the same parent. Indexing the document as a whole results in low precision. This paper proposes an indexing scheme that preserves the parent/child relationship information using the document structure. This information is then used to identify the semantic relations between items. Semantic-based search and retrieval on XML documents can provide more accurate results. The paper uses an RDBMS to store these indices in the form of a table. To check the accuracy of the proposed scheme, a case study is also performed using two queries on a sample XML file that share semantically related data.
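
The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's actual schema or algorithm: the table name `node_index`, its columns, the sample XML, and the function names are all assumptions made for illustration, and SQLite stands in for the RDBMS. Each term is stored with the id of its element's parent, so a query can require that two terms occur under the same parent rather than merely somewhere in the file.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample file: two records whose fields are related
# only within the same <book> parent.
SAMPLE_XML = """
<library>
  <book>
    <title>Database Systems</title>
    <author>Ali Khan</author>
  </book>
  <book>
    <title>Compilers</title>
    <author>Sara Ahmed</author>
  </book>
</library>
"""


def build_index(xml_text, conn):
    """Store one row per term, keeping the id of the element's parent."""
    conn.execute(
        "CREATE TABLE node_index ("
        " node_id INTEGER, parent_id INTEGER, tag TEXT, term TEXT)"
    )
    root = ET.fromstring(xml_text)
    next_id = 0
    stack = [(root, None)]  # (element, id of its parent)
    while stack:
        elem, parent_id = stack.pop()
        next_id += 1
        node_id = next_id
        # Elements holding only whitespace (pure containers) add no rows,
        # but their ids still serve as parent_id for their children.
        for term in (elem.text or "").lower().split():
            conn.execute(
                "INSERT INTO node_index VALUES (?, ?, ?, ?)",
                (node_id, parent_id, elem.tag, term),
            )
        for child in elem:
            stack.append((child, node_id))
    conn.commit()


def search_whole_document(conn, terms):
    """Document-level baseline: True if every term occurs anywhere in the file."""
    return all(
        conn.execute(
            "SELECT 1 FROM node_index WHERE term = ? LIMIT 1", (t,)
        ).fetchone()
        for t in terms
    )


def search_siblings(conn, term_a, term_b):
    """Return ids of parents that have both terms among their children."""
    rows = conn.execute(
        "SELECT DISTINCT a.parent_id FROM node_index a"
        " JOIN node_index b ON a.parent_id = b.parent_id"
        " WHERE a.term = ? AND b.term = ?"
        " ORDER BY a.parent_id",
        (term_a, term_b),
    ).fetchall()
    return [r[0] for r in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_index(SAMPLE_XML, conn)
    # "compilers" and "khan" both occur in the file, but in different records.
    print(search_whole_document(conn, ["compilers", "khan"]))
    print(search_siblings(conn, "compilers", "khan"))
    print(search_siblings(conn, "database", "khan"))
```

In this sketch the whole-document check reports a match for "compilers" and "khan" even though they belong to different books, while the parent-aware query returns no record for that pair and one record for "database" and "khan". That difference is the kind of precision gain the abstract attributes to preserving parent/child information.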

Location

C-10, AMAN CED

Session Theme

Technical Papers Parallel Session-V (Information Retrieval)

Session Type

Parallel Technical Session

Session Chair

Dr. Imran Hayee

Start Date

13-12-2015 3:30 PM

End Date

13-12-2015 3:50 PM
