Discovering irrelevance in the blogosphere through blog search

Faculty / School

Faculty of Computer Sciences (FCS)

Department

Department of Computer Science

Was this content written or created while at IBA?

Yes

Document Type

Conference Paper

Publication Date

9-19-2011

Conference Name

2011 International Conference on Advances in Social Networks Analysis and Mining

Conference Location

Kaohsiung, Taiwan

Conference Dates

25-27 July 2011

ISBN/ISSN

80052742654 (Scopus)

First Page

457

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Abstract / Description

Web 2.0 technologies have given rise to the blogosphere, an information-sharing medium by the users, for the users. These technologies have also extended the search problem to a new form known as blog search. Like Web search, blog search is affected by spam, which degrades the quality of search results. This paper addresses the problem of blog relevance in the top search results for general topic queries. It studies irrelevant blogs appearing in the top search results of Google Blog Search for blogspot domains. We define metrics for irrelevant blogs by observing the qualitative relevance of their content and by analyzing their link structure. Our preliminary results show an overall recall of 0.875 with a precision of 1.0 for finding irrelevant blogs in the top 15 search results for six general topic queries on Google Blog Search.
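For readers unfamiliar with the two figures reported above, the following minimal Python sketch shows how precision and recall are conventionally computed for a detector of irrelevant blogs. The function name `precision_recall`, the blog identifiers, and the counts (8 truly irrelevant blogs, 7 of them flagged, no false flags) are hypothetical; they were chosen only because they reproduce the reported values and are not taken from the paper's data.

```python
def precision_recall(flagged, truly_irrelevant):
    """Return (precision, recall) for flagged items against a ground-truth set."""
    flagged = set(flagged)
    truly_irrelevant = set(truly_irrelevant)
    # Blogs that were flagged and really are irrelevant.
    true_positives = len(flagged & truly_irrelevant)
    # Precision: fraction of flagged blogs that are truly irrelevant.
    precision = true_positives / len(flagged) if flagged else 0.0
    # Recall: fraction of truly irrelevant blogs that were flagged.
    recall = true_positives / len(truly_irrelevant) if truly_irrelevant else 0.0
    return precision, recall


# Hypothetical counts, NOT the paper's data: 8 irrelevant blogs exist,
# the detector flags 7 of them and flags nothing else.
truth = {f"blog{i}" for i in range(1, 9)}
flagged = {f"blog{i}" for i in range(1, 8)}

precision, recall = precision_recall(flagged, truth)
print(precision, recall)  # prints: 1.0 0.875
```

A precision of 1.0 means that every flagged blog was truly irrelevant, while a recall below 1.0 means that some irrelevant blogs were missed.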
