Evaluating XML Retrieval Effectiveness at INEX
Mounia Lalmas and Anastasios Tombros
Queen Mary, University of London,
Mile End Road, London, UK
The INitiative for the Evaluation of XML retrieval (INEX) was set up in 2002 to establish an
infrastructure and provide means, in the form of large test collections and appropriate scoring methods,
for evaluating the effectiveness of content-oriented XML retrieval systems. This report provides an
overview of the evaluation methodology developed in INEX from 2002 to 2006.
1 Introduction
XML retrieval systems have been, and continue to be, developed to implement content-oriented retrieval
approaches to XML documents. A common feature of these systems, which distinguishes them from
traditional document retrieval systems, is that, instead of retrieving whole documents, they
aim to retrieve document components, i.e. XML elements of varying granularity that fulfill
the user’s query. As the number of XML retrieval systems increases, so does the need to evaluate their
benefit to users.
The predominant approach to evaluating a system’s retrieval effectiveness is the use of test
collections and effectiveness scoring methods. The INitiative for the Evaluation of XML retrieval (INEX)
[3, 4, 6, 5, 7] was set up in 2002 to establish an infrastructure and provide means, in the form of large
test collections and appropriate scoring methods, for evaluating content-oriented XML retrieval systems.
This report provides an overview of the evaluation methodology developed in INEX from 2002 to 2006.
Section 2 describes the INEX test collections and Section 3 describes the scoring methods used to
measure effectiveness. We conclude in Section 4. We only report the evaluation of the ad hoc track.
2 The INEX test collections
XML documents organize their content into small, nested structural elements. Each of these elements
in the document’s hierarchy, along with the document itself (the root of the hierarchy), represents a
retrievable unit. With the use of XML query languages, users of an XML IR system can express their
information need as a combination of content and structural conditions. Consequently, the relevance
Some of the INEX tracks are described in separate reports of this forum.
ACM SIGIR Forum 40 Vol.41 No.1 June 2007