Indexing Arabic texts using association rule data mining

Library Hi Tech, Volume 37 (1): 17 – Mar 7, 2019

Publisher: Emerald Publishing
Copyright: © Emerald Publishing Limited
ISSN: 0737-8831
DOI: 10.1108/lht-07-2017-0147

Abstract

Purpose: The purpose of this paper is to propose a new model to enhance auto-indexing of Arabic texts. The model extracts new relevant words by relating the words chosen by previous classical methods to new words using data mining rules.

Design/methodology/approach: The proposed model uses an association rule algorithm, which extracts frequent sets of related items, to find relationships between words in the texts to be indexed and words from texts that belong to the same category. The extracted word associations are represented as sets of words that frequently appear together.

Findings: The proposed methodology shows significant enhancement in terms of accuracy, efficiency and reliability when compared to previous works.

Research limitations/implications: The stemming algorithm can be further enhanced. The Arabic language has many grammatical rules, and the more of these rules are integrated into the stemming algorithm, the better the stemming will be. The stop-list can also be enhanced by adding more words that should not be taken into consideration in the indexing mechanism. Numbers should be added to the list as well, and a thesaurus system should be used because it links different phrases or words with the same meaning to each other, which improves the indexing mechanism. The authors also invite researchers to add more pre-requisite texts to obtain better results.

Originality/value: In this paper, the authors present a full text-based auto-indexing method for Arabic text documents. The auto-indexing method extracts new relevant words by using data mining rules, which has not been investigated before. The method uses an association rule mining algorithm for extracting frequent sets containing related items to extract relationships between words in the texts to be indexed with words from texts that belong to the same category. The benefits of the method are demonstrated using empirical work involving several Arabic texts.
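The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea: mining frequent word pairs from texts in one category and using them to extend a set of index terms. It is not the authors' implementation. The function names `frequent_pairs` and `expand_index`, the `min_support` threshold, and the English placeholder tokens (standing in for stemmed, stop-list-filtered Arabic words) are all assumptions made for illustration, and the sketch stops at pairs rather than larger frequent sets.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(docs, min_support):
    """Return {(word_a, word_b): support} for pairs that co-occur in at
    least `min_support` (a fraction between 0 and 1) of the documents."""
    n = len(docs)
    word_sets = [set(doc) for doc in docs]

    # Apriori-style pruning: a pair can only be frequent if both of its
    # words are frequent on their own.
    word_counts = Counter(w for s in word_sets for w in s)
    frequent_words = {w for w, c in word_counts.items() if c / n >= min_support}

    pair_counts = Counter()
    for s in word_sets:
        kept = sorted(s & frequent_words)  # sorted so each pair has one canonical order
        pair_counts.update(combinations(kept, 2))

    return {p: c / n for p, c in pair_counts.items() if c / n >= min_support}


def expand_index(index_terms, pairs):
    """Add every word that forms a frequent pair with an existing index term."""
    expanded = set(index_terms)
    for a, b in pairs:
        if a in index_terms:
            expanded.add(b)
        if b in index_terms:
            expanded.add(a)
    return expanded


# Hypothetical category corpus: each document is a list of preprocessed tokens.
category_docs = [
    ["bank", "market", "loan"],
    ["bank", "market", "price"],
    ["bank", "loan"],
    ["market", "price", "bank"],
]

pairs = frequent_pairs(category_docs, min_support=0.5)
print(expand_index({"price"}, pairs))
```

In this sketch a text whose classical index contains only "price" would also be indexed under "bank" and "market", because those words co-occur with "price" in at least half of the category's texts. Raising `min_support` makes the expansion more conservative.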

Journal: Library Hi Tech, Emerald Publishing

Published: Mar 7, 2019

Keywords: Precision; Recall; Arabic text; Auto-indexing; Frequent sets; Rule-based data mining
