Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 14-Day Trial for You or Your Team.

Learn More →

A STATISTICAL INTERPRETATION OF TERM SPECIFICITY AND ITS APPLICATION IN RETRIEVAL

A STATISTICAL INTERPRETATION OF TERM SPECIFICITY AND ITS APPLICATION IN RETRIEVAL The exhaustivity of document descriptions and the specificity of index terms are usually regarded as independent. It is suggested that specificity should be interpreted statistically, as a function of term use rather than of term meaning. The effects on retrieval of variations in term specificity are examined, experiments with three test collections showing in particular that frequentlyoccurring terms are required for good overall performance. It is argued that terms should be weighted according to collection frequency, so that matches on less frequent, more specific, terms are of greater value than matches on frequent terms. Results for the test collections show that considerable improvements in performance are obtained with this very simple procedure. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Journal of Documentation Emerald Publishing

A STATISTICAL INTERPRETATION OF TERM SPECIFICITY AND ITS APPLICATION IN RETRIEVAL

Journal of Documentation , Volume 28 (1): 11 – Jan 1, 1972

Loading next page...
 
/lp/emerald-publishing/a-statistical-interpretation-of-term-specificity-and-its-application-HIiyyk03Wd

References (13)

Publisher
Emerald Publishing
Copyright
Copyright © Emerald Group Publishing Limited
ISSN
0022-0418
DOI
10.1108/eb026526
Publisher site
See Article on Publisher Site

Abstract

The exhaustivity of document descriptions and the specificity of index terms are usually regarded as independent. It is suggested that specificity should be interpreted statistically, as a function of term use rather than of term meaning. The effects on retrieval of variations in term specificity are examined, experiments with three test collections showing in particular that frequentlyoccurring terms are required for good overall performance. It is argued that terms should be weighted according to collection frequency, so that matches on less frequent, more specific, terms are of greater value than matches on frequent terms. Results for the test collections show that considerable improvements in performance are obtained with this very simple procedure.

Journal

Journal of DocumentationEmerald Publishing

Published: Jan 1, 1972

There are no references for this article.