MARES: multitask learning algorithm for Web-scale real-time event summarization


Publisher: Springer Journals
Copyright: © 2018 by Springer Science+Business Media, LLC, part of Springer Nature
Subject: Computer Science; Information Systems Applications (incl. Internet); Database Management; Operating Systems
ISSN: 1386-145X
eISSN: 1573-1413
DOI: 10.1007/s11280-018-0597-7

Abstract

Automatic real-time summarization of massive document streams on the Web has become an important tool for quickly transforming an overwhelming volume of documents into a novel, comprehensive and concise overview of an event for users. Significant progress has been made in static text summarization. However, most previous work does not consider the temporal features of document streams, which are valuable in real-time event summarization. In this paper, we propose a novel Multitask learning Algorithm for Web-scale Real-time Event Summarization (MARES), which leverages the benefits of supervised deep neural networks as well as a reinforcement learning algorithm to strengthen the representation learning of documents. Specifically, MARES consists of two key components: (i) a relevance prediction classifier, in which a hierarchical LSTM model is used to learn the representations of queries and documents; (ii) a document filtering model that learns to maximize long-term rewards with a reinforcement learning algorithm, working on a document encoding layer shared with the relevance prediction component. To verify the effectiveness of the proposed model, extensive experiments are conducted on two real-life document stream datasets: TREC Real-Time Summarization Track data and TREC Temporal Summarization Track data. The experimental results demonstrate that our model achieves significantly better results than state-of-the-art baseline methods.
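
The two-component design described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the encoder details, the reinforcement learning algorithm, or how the two objectives are combined. The sketch below assumes a mean-pooled bag-of-words encoder as a stand-in for the hierarchical LSTM, a REINFORCE-style policy-gradient loss for the filtering model, and a hypothetical weight `alpha` that mixes the two losses. All function and parameter names are illustrative.

```python
import math
import random

DIM = 8  # hypothetical embedding size


def embed(token):
    # Deterministic pseudo-embedding per token (stand-in for learned word vectors).
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


def pool(text):
    # Mean-pool token vectors; a simple stand-in for the hierarchical LSTM.
    vecs = [embed(t) for t in text.lower().split()]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def shared_encode(query, document):
    # Shared encoding layer: both task heads read this same representation.
    return pool(query) + pool(document)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(w, h):
    return sum(a * b for a, b in zip(w, h))


def relevance_loss(w_rel, h, label):
    # Supervised head: binary cross-entropy on a relevance label (0 or 1).
    p = sigmoid(dot(w_rel, h))
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def filtering_loss(w_pol, h, action, ret):
    # Policy head (assumed REINFORCE-style): action 1 = push, 0 = skip;
    # `ret` is the long-term return observed after taking the action.
    p_push = sigmoid(dot(w_pol, h))
    p_action = p_push if action == 1 else 1.0 - p_push
    return -ret * math.log(p_action)


def multitask_loss(w_rel, w_pol, query, document, label, action, ret, alpha=0.5):
    # Assumed combination: a weighted sum of the two task losses.
    h = shared_encode(query, document)
    return (alpha * relevance_loss(w_rel, h, label)
            + (1.0 - alpha) * filtering_loss(w_pol, h, action, ret))


# Usage with untrained (all-zero) head weights on a toy query/document pair.
w_rel = [0.0] * (2 * DIM)
w_pol = [0.0] * (2 * DIM)
loss = multitask_loss(w_rel, w_pol, "earthquake japan",
                      "strong earthquake hits northern japan",
                      label=1, action=1, ret=1.0)
```

Because both losses are computed from the output of `shared_encode`, gradients from either task would update the same document representation in a trainable version, which is the multitask idea the abstract describes.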

Journal

World Wide Web (Springer Journals)

Published: Jun 2, 2018
