Evaluating multimedia features and fusion for example-based event detection

Cited by: 0
Authors
Gregory K. Myers
Ramesh Nallapati
Julien van Hout
Stephanie Pancoast
Ramakant Nevatia
Chen Sun
Amirhossein Habibian
Dennis C. Koelma
Koen E. A. van de Sande
Arnold W. M. Smeulders
Cees G. M. Snoek
Affiliations
[1] SRI International (SRI)
[2] University of Southern California (USC), Institute for Robotics and Intelligent Systems
[3] University of Amsterdam (UvA)
[4] IBM Thomas J Watson Research Center
Source
Keywords
Multimedia event detection; Video retrieval; Content extraction; Difference coding; Late fusion;
DOI
Not available
Abstract
Multimedia event detection (MED) is a challenging problem because of the heterogeneous content and variable quality found in large collections of Internet videos. To study the value of multimedia features and fusion for representing and learning events from a set of example video clips, we created SESAME, a system for video SEarch with Speed and Accuracy for Multimedia Events. SESAME includes multiple bag-of-words event classifiers based on single data types: low-level visual, motion, and audio features; high-level semantic visual concepts; and automatic speech recognition. Event detection performance was evaluated for each event classifier. The performance of low-level visual and motion features was improved by the use of difference coding. The accuracy of the visual concepts was nearly as strong as that of the low-level visual features. Experiments with a number of fusion methods for combining the event detection scores from these classifiers revealed that simple fusion methods, such as arithmetic mean, perform as well as or better than other, more complex fusion methods. SESAME’s performance in the 2012 TRECVID MED evaluation was one of the best reported.
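The abstract's finding that simple late fusion, such as the arithmetic mean of per-classifier detection scores, performs as well as more complex methods can be sketched as follows. This is an illustrative example only; the function and score values are hypothetical and not taken from the paper.

```python
def fuse_scores(score_lists):
    """Late fusion: average each video's detection scores across classifiers."""
    if not score_lists:
        raise ValueError("need at least one classifier's scores")
    n_videos = len(score_lists[0])
    if any(len(s) != n_videos for s in score_lists):
        raise ValueError("all classifiers must score the same videos")
    return [sum(scores[i] for scores in score_lists) / len(score_lists)
            for i in range(n_videos)]

# Hypothetical scores for three videos from three single-modality classifiers.
visual = [0.90, 0.20, 0.55]
motion = [0.70, 0.30, 0.45]
audio  = [0.80, 0.10, 0.50]

fused = fuse_scores([visual, motion, audio])
# e.g. fused[0] = (0.90 + 0.70 + 0.80) / 3 = 0.80
```

Each modality-specific classifier produces one score per video, and the fused score is simply their mean, with no learned fusion weights.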
Pages: 17–32 (15 pages)
Published in: Machine Vision and Applications, 2014, 25(1): 17–32