Robust Event Detection From Spoken Content In Consumer Domain Videos

Authors
Tsakalidis, Stavros [1]
Zhuang, Xiaodan [1]
Hsiao, Roger [1]
Wu, Shuang [1]
Natarajan, Pradeep [1]
Prasad, Rohit [1]
Natarajan, Prem [1]
Affiliation
[1] Raytheon BBN Technologies, Cambridge, MA 02138 USA
Keywords
multimedia event detection; keyword selection; keyword expansion; keyword spotting; speech recognition
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose an innovative integrated approach to leverage available spoken content while detecting events in consumer-generated multimedia data (e.g., YouTube videos). Spoken content in consumer videos presents several challenges. For example, unlike Broadcast News, the spoken audio is typically not labeled. Also, the audio track in consumer videos tends to be noisy and the spoken content is often sporadic. Here, we describe three recent improvements that are specifically targeted at overcoming the challenges in consumer videos: robust data-driven keyword selection, automatic discovery of word-classes for keyword expansion, and a keyword spotting approach for improving recall in noisy conditions. These improvements are integrated into the audio analysis component of the BBN VISER system. The VISER system embodies a state-of-the-art approach as substantiated by its performance on the 2011 TRECVID MED task. Experimental results on the 2011 TRECVID MED task clearly demonstrate the effectiveness of the three improvements.
Pages: 2099-2102
Number of pages: 4
Related papers
  • [1] Robust Audio-Codebooks for Large-Scale Event Detection in Consumer Videos
    Rawat, Shourabh
    Schulam, Peter F.
    Burger, Susanne
    Ding, Duo
    Wang, Yipei
    Metze, Florian
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2928 - 2932
  • [2] Retrieval of Player Event in Golf Videos Using Spoken Content Analysis
    Kim, Hyoung-Gook
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2009, 28 (07): : 674 - 679
  • [3] Event detection in consumer videos using GMM supervectors and SVMs
    Kamishima, Yusuke
    Inoue, Nakamasa
    Shinoda, Koichi
    [J]. EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2013,
  • [4] Event detection in consumer videos using GMM supervectors and SVMs
    Yusuke Kamishima
    Nakamasa Inoue
    Koichi Shinoda
    [J]. EURASIP Journal on Image and Video Processing, 2013
  • [5] Multimodal Feature Fusion for Robust Event Detection in Web Videos
    Natarajan, Pradeep
    Wu, Shuang
    Vitaladevuni, Shiv
    Zhuang, Xiaodan
    Tsakalidis, Stavros
    Park, Unsang
    Prasad, Rohit
    Natarajan, Premkumar
    [J]. 2012 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2012, : 1298 - 1305
  • [6] Robust Candidate Frame Detection in Videos using Semantic Content Modeling
    Manonmani, T.
    Mala, K.
    [J]. 2014 INTERNATIONAL CONFERENCE ON COMMUNICATION AND NETWORK TECHNOLOGIES (ICCNT), 2014, : 281 - 285
  • [7] Exploiting Web Images for Event Recognition in Consumer Videos: A Multiple Source Domain Adaptation Approach
    Duan, Lixin
    Xu, Dong
    Chang, Shih-Fu
    [J]. 2012 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2012, : 1338 - 1345
  • [8] Event detection in crowded videos
    Ke, Yan
    Sukthankar, Rahul
    Hebert, Martial
    [J]. 2007 IEEE 11TH INTERNATIONAL CONFERENCE ON COMPUTER VISION, VOLS 1-6, 2007, : 1424 - 1431
  • [9] Unsupervised event detection in videos
    Mustafa, Ali
    Sethi, Ishwar
    [J]. 19TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, VOL II, PROCEEDINGS, 2007, : 179 - +
  • [10] Affective content analysis in comedy and horror videos by audio emotional event detection
    Xu, M
    Chia, LT
    Jin, J
    [J]. 2005 IEEE International Conference on Multimedia and Expo (ICME), Vols 1 and 2, 2005, : 622 - 625