Two phase estimation method for multi-classifying real life tweets

Publisher
Emerald Publishing
Copyright
Copyright © Emerald Group Publishing Limited
ISSN
1744-0084
DOI
10.1108/IJWIS-04-2014-0013

Abstract

Purpose – This paper proposes a multi-label method that estimates the appropriate aspects of previously unseen tweets using two-phase estimation. Many Twitter users share daily events and opinions, and some of these posts carry useful comments on real-life aspects such as eating, traffic and weather. A post such as “The train is not coming” belongs to the Traffic aspect, while a tweet such as “The train is delayed by heavy rain” belongs to both the Traffic and Weather aspects.

Design/methodology/approach – The proposed method consists of two phases. In the first, many topics are extracted from a large collection of tweets using Latent Dirichlet Allocation (LDA). In the second, associations between the many topics and a smaller number of aspects are built using a small set of labeled tweets. The aspect scores of a tweet are calculated from these associations, based on the terms extracted from the tweet. Unknown tweets are then labeled with the appropriate aspects by averaging the aspect scores.

Findings – Experimental evaluations on a large number of actual tweets demonstrate the high efficiency of the proposed multi-label classification method. Aspects with a high F-measure are strongly associated with highly relevant topics, whereas aspects with a low F-measure are associated with topics that are also connected to many other aspects.

Originality/value – The proposed method features two-phase semi-supervised learning: many topics are extracted using the unsupervised learning model LDA, and associations between these topics and the smaller set of aspects are built using labeled tweets.
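The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the association weights, the per-term scoring, the averaging rule and the 0.35 threshold are all assumptions made for illustration. The table `TERM_TOPICS` is a hypothetical stand-in for the output of phase one (an LDA model fitted on unlabeled tweets), and the labeled tweets are toy data.

```python
from collections import defaultdict

# Phase 1 stand-in (hypothetical values): P(topic | term), which would
# normally be derived from an LDA model fitted on a large unlabeled corpus.
TERM_TOPICS = {
    "train":   {0: 0.9, 1: 0.1},
    "delayed": {0: 0.8, 1: 0.2},
    "rain":    {1: 0.9, 2: 0.1},
    "heavy":   {1: 0.7, 2: 0.3},
    "lunch":   {2: 1.0},
    "ramen":   {2: 1.0},
}

# Small labeled set (toy data): tokenized tweet -> set of aspects.
LABELED = [
    (["train", "delayed"], {"Traffic"}),
    (["heavy", "rain"], {"Weather"}),
    (["lunch", "ramen"], {"Eating"}),
    (["train", "delayed", "heavy", "rain"], {"Traffic", "Weather"}),
]


def tweet_topics(terms):
    """Average the topic distributions of the terms known to the model."""
    known = [TERM_TOPICS[t] for t in terms if t in TERM_TOPICS]
    mix = defaultdict(float)
    for dist in known:
        for topic, p in dist.items():
            mix[topic] += p / len(known)
    return dict(mix)


def build_associations(labeled):
    """Phase 2 (assumed form): for each topic, the share of its weight
    that falls on each aspect across the labeled tweets."""
    raw = defaultdict(lambda: defaultdict(float))
    for terms, aspects in labeled:
        for topic, weight in tweet_topics(terms).items():
            for aspect in aspects:
                raw[topic][aspect] += weight
    assoc = {}
    for topic, per_aspect in raw.items():
        total = sum(per_aspect.values())
        assoc[topic] = {a: w / total for a, w in per_aspect.items()}
    return assoc


def aspect_scores(terms, assoc):
    """Score each known term via topic-aspect associations, then average
    the per-term scores over the tweet."""
    known = [t for t in terms if t in TERM_TOPICS]
    scores = defaultdict(float)
    for term in known:
        for topic, p in TERM_TOPICS[term].items():
            for aspect, a in assoc.get(topic, {}).items():
                scores[aspect] += p * a / len(known)
    return dict(scores)


def predict(terms, assoc, threshold=0.35):
    """Multi-label decision: keep every aspect whose averaged score
    reaches the (assumed) threshold."""
    return {a for a, s in aspect_scores(terms, assoc).items() if s >= threshold}


ASSOC = build_associations(LABELED)
print(predict(["train", "delayed", "rain"], ASSOC))
```

With these toy numbers, a tweet containing "train", "delayed" and "rain" receives both the Traffic and Weather aspects, mirroring the "delayed by heavy rain" example in the abstract. Terms absent from the table are ignored, so a tweet with no known terms receives no aspect.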

Journal

International Journal of Web Information Systems, Emerald Publishing

Published: Nov 11, 2014
