
DOCTORAL ABSTRACT

Effective Focused Retrieval by Exploiting Query Context and Document Structure

Rianne Kaptein
University of Amsterdam
amkaptein@hotmail.com
October 6, 2011

Abstract

The classic IR model of the search process consists of three elements: query, documents and search results. A user looking to fulfil an information need formulates a query, usually consisting of a small set of keywords summarising the information need. The goal of an IR system is to retrieve documents containing information which might be useful or relevant to the user. Throughout the search process there is a loss of focus, because keyword queries entered by users often do not suitably summarise their complex information needs, and IR systems do not sufficiently interpret the contents of documents, leading to result lists containing irrelevant and redundant information. The main research objective of this thesis is to exploit query context and document structure to provide for more focused retrieval.

The short keyword query used as input to the retrieval system can be supplemented with topic categories from structured Web resources such as DMOZ and Wikipedia. Topic categories can be used as query context to retrieve documents that are not only relevant to the query but also belong to a relevant topic category. Category information is especially useful for the task of entity ranking, where the user is searching for a certain type of entity such as companies or persons. Category information can help to improve the search results by promoting in the ranking pages belonging to relevant topic categories, or categories similar to the relevant categories. By following external links and searching for the retrieved Wikipedia entities in a general Web collection, we can also exploit the structure of Wikipedia to rank entities on the general Web. Wikipedia, in contrast to the general Web, does not contain much redundant information. This absence of redundant information can be exploited by using Wikipedia as a pivot to search the general Web.

A typical query returns thousands or millions of documents, but searchers hardly ever look beyond the first result page. Since space on the result page is limited, we can show only a few documents in the result list. Word clouds can be used to summarise groups of documents into a set of keywords, which allows users to quickly get a grasp on the underlying data. Instead of using user-assigned tags, we generate word clouds from the textual contents of documents themselves as well as the anchor text of Web documents. Improvements over word clouds that are created using simple term frequency counting include using a parsimonious term weighting scheme, including bigrams, and biasing the word cloud towards the query. We find that word clouds can, to a certain degree, quickly convey the topic and relevance of a set of search results.

Available online at: http://dare.uva.nl/record/395691

ACM SIGIR Forum Vol. 45 No. 2 December 2011
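The abstract names a parsimonious term weighting scheme for word clouds but does not give its formula. The sketch below shows one common way such a scheme is estimated, an EM-style re-estimation that shifts probability mass away from terms already well explained by a background collection. It is an illustration under stated assumptions, not the thesis's implementation: the function names, the mixing weight `lam`, the pruning `threshold`, and the iteration count are all hypothetical choices, and only unigrams are handled.

```python
from collections import Counter


def parsimonious_weights(doc_tokens, background_tokens,
                         lam=0.1, iterations=20, threshold=1e-4):
    """Return term -> P(t|D), re-estimated against a background model.

    lam, iterations and threshold are assumed values for illustration.
    """
    tf = Counter(doc_tokens)
    bg = Counter(background_tokens)
    bg_total = sum(bg.values())
    doc_total = sum(tf.values())
    if doc_total == 0:
        return {}

    # Start from the maximum-likelihood estimate (plain term frequency).
    p_doc = {t: c / doc_total for t, c in tf.items()}

    for _ in range(iterations):
        # E-step: expected count of each term attributed to the document
        # model rather than to the background model.
        expected = {}
        for t, p in p_doc.items():
            p_bg = bg[t] / bg_total if bg_total else 0.0
            expected[t] = tf[t] * (lam * p) / ((1 - lam) * p_bg + lam * p)

        # M-step: renormalise, then prune terms with negligible mass.
        total = sum(expected.values())
        kept = {t: e / total for t, e in expected.items()
                if e / total >= threshold}
        if not kept:
            break  # keep the previous estimate rather than return nothing
        norm = sum(kept.values())
        p_doc = {t: p / norm for t, p in kept.items()}

    return p_doc


def word_cloud_terms(doc_tokens, background_tokens, k=10):
    """Pick the k highest-weighted terms as word cloud candidates."""
    weights = parsimonious_weights(doc_tokens, background_tokens)
    return sorted(weights, key=weights.get, reverse=True)[:k]
```

With plain term frequency, a frequent function word such as "the" would dominate the cloud. Under this re-estimation, terms that are common in the background collection lose weight, so the remaining terms are those that distinguish the document set. Bigrams and query bias, which the abstract also mentions, are not covered by this sketch.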


Effective focused retrieval by exploiting query context and document structure

Kaptein, Rianne
ACM SIGIR Forum, Volume 45 (2)
Association for Computing Machinery, Jan 9, 2012

