PAT-Tree-Based Keyword Extraction for Chinese Information Retrieval

Lee-Feng Chien
Institute of Information Science, Academia Sinica, Taipei, Taiwan, R.O.C.
Tel: 886-2-788-3799 ext. 1801
E-mail: lfchien@iis.sinica.edu.tw
Fax: 886-2-782-4814

Abstract

Considering the urgent need to promote Chinese information retrieval, this paper highlights the significance of keyword extraction and presents a new PAT-tree-based approach that efficiently extracts keywords automatically from a set of relevant Chinese documents. This approach has been successfully applied to several IR research tasks, such as document classification, book indexing and relevance feedback. Many Chinese language processing applications can therefore step ahead from the character level to the word/phrase level.

... written sentences. Automatic word extraction from Chinese texts is quite difficult, especially for unknown words such as names, locations, translated terms, technical terms, abbreviations, etc. [Chen 92]. So far, there have not been many successful works on Chinese keyword extraction. However, without efficient keyword extraction, many information retrieval applications, for instance full-text searching [Faloutsos 85], document classification [Croft 87], information filtering [Belkins 92] and text summary [Lewis 96], cannot obtain satisfactory results. Therefore, a new efficient keyword extraction approach that is especially useful in Chinese information retrieval applications is presented in this paper. Traditionally, there are two types of