Metasearch: Rank vs. Score Based Rank List Fusion Methods (without Training Data)

M. Elena Renda and Umberto Straccia


Istituto di Elaborazione della Informazione - C.N.R.
Via G. Moruzzi, 1
I-56124 Pisa (PI), Italy

Technical Report Number: 2002-TR-07


Abstract. Given a set of rankings (a ranking is a linear ordering of a set of items), the task of ranking fusion is the problem of combining these lists in such a way as to optimize the performance of the combination. The ranking fusion problem arises in many situations, a prominent one being metasearch: combining the result lists returned by multiple search engines in response to a given query, where the items in each result list are ordered according to a search-engine- and query-dependent relevance score. Several ranking fusion methods have been proposed in the literature. They can be classified according to whether: (i) they rely on ranks; (ii) they rely on scores; and (iii) they require training data. Preliminary experimental results seem to indicate that score-based methods outperform rank-based methods, and that methods relying on training data outperform those that do not. In this paper we compare rank-based and score-based methods that do not require training data, in the context of metasearch. The paper makes the following contributions: (i) we report experimental results for the Markov chain rank-based methods, for which no large-scale experimental evaluation has been carried out so far; (ii) while the rank-based method known as Borda Count is believed to be competitive with score-based methods, we show that this does not hold for metasearch; and (iii) we show that Markov chain based methods do compete with score-based methods. This is especially important in the context of metasearch, as relevance scores are usually not made available by the search engines.
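
To make the two families of rank-based methods named in the abstract concrete, the following is a minimal Python sketch of Borda Count fusion and of one Markov chain variant (the one usually called MC4 in Dwork et al., WWW 2001). The function names, the treatment of items missing from some input lists, and the plain power-iteration shortcut are illustrative assumptions, not the exact formulations evaluated in the paper.

from collections import defaultdict

def borda_fuse(rankings):
    """Borda Count: in a list of n items, the item at position p earns
    n - p points; points are summed over all input rankings. Items
    absent from a list earn no points from it (one common convention;
    alternatives split the leftover points among the missing items)."""
    scores = defaultdict(float)
    for ranking in rankings:
        n = len(ranking)
        for p, item in enumerate(ranking):
            scores[item] += n - p
    return sorted(scores, key=scores.get, reverse=True)

def mc4_fuse(rankings, steps=200):
    """MC4-style fusion: from the current item i, pick an item j
    uniformly at random and move to j iff a majority of the rankings
    containing both items place j above i. The stationary distribution
    of the chain (approximated here by plain power iteration; real
    implementations add a smoothing step to guarantee ergodicity)
    orders the items."""
    items = sorted({x for r in rankings for x in r})
    idx = {x: k for k, x in enumerate(items)}
    pos = [{x: p for p, x in enumerate(r)} for r in rankings]
    n = len(items)

    def majority_prefers(j, i):
        wins = both = 0
        for p in pos:
            if i in p and j in p:
                both += 1
                if p[j] < p[i]:
                    wins += 1
        return both > 0 and 2 * wins > both

    # Row-stochastic transition matrix: equal mass to every item
    # preferred by a majority, remaining mass on the self loop.
    P = [[0.0] * n for _ in range(n)]
    for i in items:
        for j in items:
            if i != j and majority_prefers(j, i):
                P[idx[i]][idx[j]] = 1.0 / n
        P[idx[i]][idx[i]] = 1.0 - sum(P[idx[i]])

    # Power iteration to approximate the stationary distribution.
    pi = [1.0 / n] * n
    for _ in range(steps):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return sorted(items, key=lambda x: pi[idx[x]], reverse=True)

if __name__ == "__main__":
    lists = [["a", "b", "c", "d"],
             ["b", "a", "d", "c"],
             ["a", "c", "b", "d"]]
    print("Borda:", borda_fuse(lists))   # ['a', 'b', 'c', 'd']
    print("MC4:  ", mc4_fuse(lists))

Note that neither method needs the relevance scores of the underlying engines, only the positions of the items in each list, which is precisely why such methods matter for metasearch.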


For the full paper, please contact Elena Renda (Elena.Renda_AT_iit.cnr.it).