
A Combination Approach to Web User Profiling

Jie Tang (Tsinghua University), Limin Yao (University of Massachusetts Amherst), Duo Zhang (University of Illinois at Urbana-Champaign), and Jing Zhang (Tsinghua University)



Publisher: Association for Computing Machinery
Copyright: © 2010 by ACM Inc.
ISSN: 1556-4681
DOI: 10.1145/1870096.1870098

Abstract

A Combination Approach to Web User Pro ling JIE TANG Tsinghua University LIMIN YAO University of Massachusetts Amherst DUO ZHANG University of Illinois at Urbana-Champaign and JING ZHANG Tsinghua University In this article, we study the problem of Web user pro ling, which is aimed at nding, extracting, and fusing the œsemantic -based user pro le from the Web. Previously, Web user pro ling was often undertaken by creating a list of keywords for the user, which is (sometimes even highly) insuf cient for main applications. This article formalizes the pro ling problem as several subtasks: pro le extraction, pro le integration, and user interest discovery. We propose a combination approach to deal with the pro ling tasks. Speci cally, we employ a classi cation model to identify relevant documents for a user from the Web and propose a Tree-Structured Conditional Random Fields (TCRF) to extract the pro le information from the identi ed documents; we propose a uni ed probabilistic model to deal with the name ambiguity problem (several users with the same name) when integrating the pro le information extracted from different sources; nally, we use a probabilistic topic model to model the extracted user pro

Journal

ACM Transactions on Knowledge Discovery from Data (TKDD), Association for Computing Machinery

Published: Dec 1, 2010
