Term conflation methods in information retrieval: Non‐linguistic and linguistic approaches




Publisher: Emerald Publishing
Copyright: © 2005 Emerald Group Publishing Limited. All rights reserved.
ISSN: 0022-0418
DOI: 10.1108/00220410510607507

Abstract

Purpose – To propose a categorization of the different conflation procedures under the two basic approaches, non‐linguistic and linguistic techniques, and to justify the application of normalization methods within the framework of linguistic techniques.

Design/methodology/approach – Presents a range of term conflation methods that can be used in information retrieval. Uniterm and multiterm variants can be considered equivalent units for the purposes of automatic indexing. Stemming algorithms, segmentation rules, association measures and clustering techniques are well‐evaluated non‐linguistic methods, and experiments with these techniques show a wide variety of results. Alternatively, lemmatisation and syntactic pattern‐matching, through equivalence relations represented in finite‐state transducers (FSTs), are emerging methods for the recognition and standardization of terms.

Findings – The survey attempts to point out the positive and negative effects of the linguistic approach and its potential as a term conflation method.

Originality/value – Outlines the importance of FSTs for the normalization of term variants.
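As a rough illustration of the contrast the abstract draws, the sketch below conflates a few uniterm variants in two ways: with a toy suffix-stripping stemmer (non-linguistic) and with a dictionary lookup standing in for lemmatisation (linguistic). The suffix list, the lemma dictionary and the function names `toy_stem`, `toy_lemma` and `conflate` are all assumptions made for this example. They are not taken from the article, they do not reproduce any published stemming algorithm, and the lookup table is only a stand-in for a real lexicon or finite-state transducer.

```python
from collections import defaultdict

# Hypothetical suffix list, checked in this order (longer suffixes first).
# This is an illustrative toy, not a published stemming algorithm.
SUFFIXES = ("ations", "ation", "ings", "ing", "als", "al", "es", "s")


def toy_stem(word):
    """Strip the first matching suffix, keeping a stem of at least 3 letters."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


# Hypothetical lemma dictionary standing in for a real lexicon or FST.
LEMMAS = {
    "indexes": "index",
    "indices": "index",
    "indexing": "index",
}


def toy_lemma(word):
    """Look the word up in the toy lexicon; unknown words map to themselves."""
    w = word.lower()
    return LEMMAS.get(w, w)


def conflate(terms, normalise):
    """Group term variants under the canonical form produced by `normalise`."""
    groups = defaultdict(list)
    for term in terms:
        groups[normalise(term)].append(term)
    return dict(groups)


variants = ["indexes", "indices", "indexing"]
by_stem = conflate(variants, toy_stem)
by_lemma = conflate(variants, toy_lemma)
print(by_stem)   # {'index': ['indexes', 'indexing'], 'indic': ['indices']}
print(by_lemma)  # {'index': ['indexes', 'indices', 'indexing']}
```

In this toy run the suffix stripper leaves the irregular plural "indices" in its own group, while the dictionary-based normaliser merges all three variants. That is only a small example of the kind of difference between the two approaches, not a measurement of either.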

Journal: Journal of Documentation (Emerald Publishing)

Published: Aug 1, 2005

Keywords: Information retrieval; Document management; Indexing; Variance reduction
