An effective short text conceptualization based on new short text similarity

Publisher
Springer Journals
Copyright
Copyright © 2018 by Springer-Verlag GmbH Austria, part of Springer Nature
Subject
Computer Science; Data Mining and Knowledge Discovery; Applications of Graph Theory and Complex Networks; Game Theory, Economics, Social and Behav. Sciences; Statistics for Social Sciences, Humanities, Law; Methodology of the Social Sciences
ISSN
1869-5450
eISSN
1869-5469
DOI
10.1007/s13278-018-0544-8

Abstract

Short texts such as messages, tweets, and comments now make up a large portion of online text data. They are limited in length and differ from traditional documents in their shortness and sparseness. As a result, short text tends to be ambiguous, and the degree of ambiguity varies across languages. Because Arabic is a highly inflectional language in which a single word can have multiple meanings, the short text representation plays a vital role in any text mining task. To address these issues, we propose an efficient representation for short text based on concepts instead of terms, using BabelNet as an external knowledge source. However, during conceptualization, searching for the concepts that correspond to a polysemous term returns multiple matches. Assigning a term to a concept is therefore a crucial step, and we believe that short text similarity can help map each term to its corresponding concept. In this paper, we reintroduce the Web-based Kernel function for measuring the semantic relatedness between concepts, and use it to disambiguate an expression against multiple candidate concepts. The proposed method has been evaluated using an Arabic short text categorization system, and the obtained results illustrate the value of our contribution.
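The disambiguation step described in the abstract (choosing one concept for a polysemous term by comparing the short text against each candidate) can be illustrated with a minimal sketch. It is not the authors' implementation. A plain bag-of-words cosine similarity stands in for the Web-based Kernel function, whose details are not given here, and the candidate concepts are a hand-written dictionary instead of a BabelNet lookup. The names `bag_of_words`, `cosine`, `pick_concept`, the concept identifiers, and the glosses are all hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_concept(short_text, candidates):
    """Return the candidate concept whose gloss is most similar to the text.

    candidates maps a (hypothetical) concept id to a gloss string. In the
    paper's setting these would come from an external resource such as
    BabelNet; here they are toy data.
    """
    context = bag_of_words(short_text)
    scores = {
        concept_id: cosine(context, bag_of_words(gloss))
        for concept_id, gloss in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy candidates for the ambiguous term "bank" (assumed data, not BabelNet output).
candidates = {
    "bank#finance": "a financial institution that accepts deposits and lends money",
    "bank#river": "sloping land beside a body of water such as a river",
}

best, scores = pick_concept("she sat on the bank of the river fishing", candidates)
print(best, scores)
```

Swapping `cosine` for a richer similarity, for example one that first expands each short text with retrieved context, would leave `pick_concept` unchanged, which is the reason the similarity is kept as a separate function in this sketch.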

Journal

Social Network Analysis and Mining (Springer Journals)

Published: Dec 3, 2018
