Dimensionality reduction for blog tag mining

Publisher
Inderscience Publishers
Copyright
Copyright © Inderscience Enterprises Ltd. All rights reserved
ISSN
1476-1289
eISSN
1741-9212
DOI
10.1504/IJWET.2011.040726

Abstract

Blog tags are labels that classify blog documents into categories. Most tags are user-generated, which creates problems such as inconsistent tags across users, untagged blogs, insufficiently descriptive tags, and tags that lack semantic distinction. In this paper, we use dimensionality reduction techniques to reduce the inherent noise in blog tags. A tag-topic model is combined with dimensionality reduction and evaluated on real-world blog data. Reducing the document-tag space yielded better classification results. This indicates that tag noise can be effectively reduced by representing the original set of tags with a smaller number of latent tags, which can lead to more accurate real-time categorisation of blog documents.
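
The idea of replacing the original tags with a smaller set of latent tags can be illustrated with a minimal sketch. The abstract does not say which reduction technique the authors use, so the example below assumes a truncated SVD (in the style of latent semantic analysis) applied to a toy document-tag matrix. The tag names, the matrix values, and the helper names `reduce_tags` and `cosine` are hypothetical and do not come from the paper.

```python
import numpy as np


def reduce_tags(doc_tag, k):
    """Project a document-by-tag matrix onto k latent tags using truncated SVD.

    This is an illustrative choice of technique, not the paper's stated method.
    """
    u, s, vt = np.linalg.svd(doc_tag, full_matrices=False)
    doc_latent = u[:, :k] * s[:k]  # each document expressed in k latent tags
    tag_latent = vt[:k, :]         # each latent tag as a mix of original tags
    return doc_latent, tag_latent


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical data: rows are blog posts, columns are user-assigned tags.
# "python"/"py" and "travel"/"trip" mimic inconsistent tagging across users.
tags = ["python", "py", "coding", "travel", "trip"]
doc_tag = np.array([
    [1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 1],
], dtype=float)

doc_latent, tag_latent = reduce_tags(doc_tag, k=2)

# Posts 0 and 1 use different tags for the same topic; compare their
# similarity in the original tag space and in the reduced latent space.
print("original:", cosine(doc_tag[0], doc_tag[1]))
print("latent:  ", cosine(doc_latent[0], doc_latent[1]))
```

In this toy setting the two programming posts share only one original tag but land close together in the latent space, which is the kind of noise reduction the abstract describes. A classifier would then be trained on `doc_latent` rather than on the raw tag columns.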

Journal

International Journal of Web Engineering and Technology, Inderscience Publishers

Published: Jan 1, 2011
