Object category detection using audio-visual cues

Authors
Luo, Jie [1, 2]
Caputo, Barbara [1, 2]
Zweig, Alon [3]
Bach, Joerg-Hendrik [4]
Anemueller, Joern [4]
Affiliations
[1] IDIAP Res Inst, Ctr Parc, CH-1920 Martigny, Switzerland
[2] Swiss Fed Inst Technol, Lausanne, Switzerland
[3] Hebrew Univ Jerusalem, Jerusalem, Israel
[4] Carl von Ossietzky Univ Oldenburg, Oldenburg, Germany
Keywords
object categorization; multimodal recognition; audio-visual fusion
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Categorization is one of the fundamental building blocks of cognitive systems. Object categorization has traditionally been addressed in the vision domain, even though cognitive agents are intrinsically multimodal. Indeed, biological systems combine several modalities in order to achieve robust categorization. In this paper we propose a multimodal approach to object category detection, using audio and visual information. The auditory channel is modeled on biologically motivated spectral features via a discriminative classifier. The visual channel is modeled by a state-of-the-art part-based model. Multimodality is achieved using two fusion schemes, one high-level and the other low-level. Experiments on six different object categories, under increasingly difficult conditions, show strengths and weaknesses of the two approaches, and clearly underline the open challenges for multimodal category detection.
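The abstract does not say how the two fusion schemes are implemented. The following is only a generic sketch of the usual distinction, under the assumption that high-level fusion combines per-modality classifier scores and low-level fusion concatenates feature vectors before a single classifier. All function names, weights, and values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def high_level_fusion(audio_score: float, visual_score: float,
                      audio_weight: float = 0.5) -> float:
    """Assumed high-level (late) fusion: weighted sum of the two
    per-modality classifier scores."""
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    return audio_weight * audio_score + (1.0 - audio_weight) * visual_score


def low_level_fusion(audio_features: Sequence[float],
                     visual_features: Sequence[float]) -> List[float]:
    """Assumed low-level (early) fusion: concatenate the feature vectors
    so that one classifier sees both modalities at once."""
    return list(audio_features) + list(visual_features)


def linear_score(features: Sequence[float], weights: Sequence[float],
                 bias: float = 0.0) -> float:
    """Hypothetical stand-in for a discriminative classifier:
    a plain linear decision function."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(f * w for f, w in zip(features, weights)) + bias


# High-level: each modality is scored separately, then the scores are merged.
fused_score = high_level_fusion(audio_score=0.8, visual_score=0.4)

# Low-level: features are merged first, then scored by a single classifier.
joint_features = low_level_fusion([1.0, 2.0], [3.0])
joint_score = linear_score(joint_features, weights=[1.0, 0.0, -1.0])
```

In this sketch the high-level scheme can still produce a score when one modality is unreliable, simply by lowering that modality's weight, while the low-level scheme lets a single classifier exploit correlations between audio and visual features.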
Pages: 539-548
Number of pages: 10
Related papers
50 in total
  • [1] Vehicle Detection and Classification using Audio-Visual cues
    Piyush, P.
    Rajan, Rajeev
    Mary, Leena
    Koshy, Bino I.
    2016 3RD INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND INTEGRATED NETWORKS (SPIN), 2016, : 732 - 736
  • [2] Human interaction categorization by using audio-visual cues
    Marin-Jimenez, M. J.
    Munoz-Salinas, R.
    Yeguas-Bolivar, E.
    Perez de la Blanca, N.
    MACHINE VISION AND APPLICATIONS, 2014, 25 (01) : 71 - 84
  • [3] Human interaction categorization by using audio-visual cues
    M. J. Marín-Jiménez
    R. Muñoz-Salinas
    E. Yeguas-Bolivar
    N. Pérez de la Blanca
    Machine Vision and Applications, 2014, 25 : 71 - 84
  • [4] EEG Guided Multimodal Lie Detection with Audio-Visual Cues
    Javaid, Hamza
    Dilawari, Aniqa
    Khan, Usman Ghani
    Wajid, Bilal
    PROCEEDINGS OF 2ND IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (ICAI 2022), 2022, : 71 - 78
  • [5] The impact of auditory, visual, and audio-visual sensory cues on multiple object tracking in children
    Atkins, Polly L.
    Hodgson, Timothy
    Dickinson, Patrick
    Hicks, Kieran
    Focker, Julia
    PERCEPTION, 2023, 52 (05) : 346 - 346
  • [6] A Robust Audio-visual Speech Recognition Using Audio-visual Voice Activity Detection
    Tamura, Satoshi
    Ishikawa, Masato
    Hashiba, Takashi
    Takeuchi, Shin'ichi
    Hayamizu, Satoru
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2702 - +
  • [7] Identifying Human Behaviors Using Synchronized Audio-Visual Cues
    Vrigkas, Michalis
    Nikou, Christophoros
    Kakadiaris, Ioannis A.
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2017, 8 (01) : 54 - 66
  • [8] Exploring the effectiveness of auditory, visual, and audio-visual sensory cues in a multiple object tracking environment
    Julia Föcker
    Polly Atkins
    Foivos-Christos Vantzos
    Maximilian Wilhelm
    Thomas Schenk
    Hauke S. Meyerhoff
    Attention, Perception, & Psychophysics, 2022, 84 : 1611 - 1624
  • [9] The Impact of Audio-Visual, Visual and Auditory Cues on Multiple Object Tracking Performance in Children with Autism
    Hughes, Lily
    Kargas, Niko
    Wilhelm, Maximilian
    Meyerhoff, Hauke S. S.
    Foecker, Julia
    PERCEPTUAL AND MOTOR SKILLS, 2023, 130 (05) : 2047 - 2068
  • [10] Exploring the effectiveness of auditory, visual, and audio-visual sensory cues in a multiple object tracking environment
    Foecker, Julia
    Atkins, Polly
    Vantzos, Foivos-Christos
    Wilhelm, Maximilian
    Schenk, Thomas
    Meyerhoff, Hauke S.
    ATTENTION PERCEPTION & PSYCHOPHYSICS, 2022, 84 (05) : 1611 - 1624