Saliency-Maximized Audio Visualization and Efficient Audio-Visual Browsing for Faster-Than-Real-Time Human Acoustic Event Detection

KAI-HSIANG LIN, XIAODAN ZHUANG, CAMILLE GOUDESEUNE, SARAH KING, MARK HASEGAWA-JOHNSON, and THOMAS S. HUANG, University of Illinois at Urbana-Champaign

Browsing large audio archives is challenging because of the limitations of human audition and attention. However, this task becomes easier with a suitable visualization of the audio signal, such as a spectrogram transformed to make unusual audio events salient. This transformation maximizes the mutual information between an isolated event's spectrogram and an estimate of how salient the event appears in its surrounding context. When such spectrograms are computed and displayed with fluid zooming over many temporal orders of magnitude, sparse events in long audio recordings can be detected more quickly and more easily. In particular, in a 1/10-real-time acoustic event detection task, subjects who were shown saliency-maximized rather than conventional spectrograms performed significantly better. Saliency maximization also improves the mutual information between the ground truth of nonbackground sounds and visual saliency, more than other common enhancements to visualization.

Categories and Subject Descriptors: H.5.2 [Information Interfaces and Presentation]: User Interfaces--Theory and methods, Evaluation/methodology; H.1.2 [Models and Principles]: User/Machine Systems--Human Information Processing; H.5.1 [Information Interfaces and Presentation]:
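The abstract evaluates visualizations by the mutual information between the ground truth of nonbackground sounds and visual saliency. As a rough illustration of that quantity only, the sketch below estimates mutual information in bits between two discrete label sequences from their empirical joint counts. It is not the authors' method: the function name `mutual_information`, the per-frame binary labels, and the idea of thresholding saliency into 0/1 are all assumptions made for this example.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in bits, of two discrete sequences.

    Hypothetical helper for illustration: xs could be per-frame ground truth
    (1 = nonbackground sound) and ys a thresholded saliency label per frame.
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n == 0:
        raise ValueError("sequences must be non-empty")

    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    count_x = Counter(xs)         # marginal counts of x
    count_y = Counter(ys)         # marginal counts of y

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written in terms of counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Assumed toy data: saliency labels that match the ground truth exactly,
# and labels that are statistically independent of it.
truth = [0, 0, 1, 1]
matching_saliency = [0, 0, 1, 1]
unrelated_saliency = [0, 1, 0, 1]

print(mutual_information(truth, matching_saliency))
print(mutual_information(truth, unrelated_saliency))
```

In this toy setting, a saliency labeling that tracks the ground truth perfectly yields one bit for balanced binary labels, while an independent labeling yields zero; a higher value means the saliency carries more information about where events are.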
ACM Transactions on Applied Perception (TAP) – Association for Computing Machinery
Published: Oct 1, 2013