Purpose – The purpose of this paper is to assign topic‐specific ratings to web pages.
Design/methodology/approach – The paper uses power iteration to assign topic‐specific rating values (called relevance) to web pages, creating a ranking or partial order among these pages for each topic. This approach depends on a set of pages that are initially assumed to be relevant for a specific topic; the spatial link structure of the web pages; and a net‐specific decay factor designated ξ.
Findings – The paper finds that this approach exhibits three desirable properties: fast convergence, stability, and relevant answer sets. The first property is shown using theoretical proofs, while the others are evaluated through stability experiments and assessments of real‐world data in comparison with already established algorithms.
Research limitations/implications – The assessment used all pages that a web spider was able to find in the Nordic countries. Entities that use domains outside the Nordic countries (e.g. .com or .org) are not present in the paper's datasets, even though they reside logically within one or more of the Nordic countries. This is quite a large dataset, but still small in comparison with the entire worldwide web. Moreover, the execution speed of some of the algorithms prohibited the use of a large test dataset in the stability tests.
Practical implications – It is not only possible, but also reasonable, to rank web pages without using Markov chain approaches. This means that the work of generating answer sets for complex questions could (at least in theory) be divided into smaller parts that are later summed to give the final answer.
Originality/value – This paper contributes to the research on internet search engines.
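The abstract does not give the update rule, so the following is only a minimal sketch of how a topic‐specific relevance value could be computed by power iteration from the three inputs named above: a seed set, the link structure, and a decay factor ξ. The function name `propagate_relevance`, the out‐degree normalisation, and the update r ← s + ξ·(propagated relevance) are illustrative assumptions, not the authors' algorithm.

```python
def propagate_relevance(links, seeds, xi=0.5, tol=1e-10, max_iter=1000):
    """Hypothetical relevance propagation (an assumption, not the paper's method).

    links: dict mapping each page to the list of pages it links to
    seeds: dict mapping pages assumed relevant to their initial relevance
    xi:    decay factor, assumed to satisfy 0 < xi < 1
    """
    # Collect every page mentioned as a source, a target, or a seed.
    pages = set(links) | set(seeds)
    for targets in links.values():
        pages.update(targets)

    relevance = {p: seeds.get(p, 0.0) for p in pages}
    for _ in range(max_iter):
        # Each round restarts from the seed values ...
        updated = {p: seeds.get(p, 0.0) for p in pages}
        # ... and adds relevance passed along links, damped by xi and
        # split evenly among a page's out-links (assumed normalisation).
        for page, targets in links.items():
            if not targets:
                continue
            share = xi * relevance[page] / len(targets)
            for target in targets:
                updated[target] += share
        delta = max((abs(updated[p] - relevance[p]) for p in pages), default=0.0)
        relevance = updated
        if delta < tol:
            break
    return relevance


# Example on a tiny made-up graph: rank pages for a topic seeded at "a".
example_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = propagate_relevance(example_links, {"a": 1.0}, xi=0.5)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Under these assumptions each round shrinks the error by at least a factor of ξ, so the iteration converges geometrically. The update is also linear in the seed values, which means the result for a combined seed set equals the sum of the results for its parts. That is one way to read the remark that answer sets could be computed in smaller parts and summed.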
International Journal of Web Information Systems – Emerald Publishing
Published: Nov 21, 2008
Keywords: Worldwide web; Information retrieval; Spatial data structures; Search engines