Informative subspaces for audio-visual processing: High-level function from low-level fusion

Authors
Fisher, JW [1]
Darrell, T [1]
Affiliation
[1] MIT, Cambridge, MA 02139 USA
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
We propose a new probabilistic model of single-source multi-modal generation, and show algorithms for maximizing mutual information which find correspondences between signal components. We show a nonparametric method for finding informative subspaces that captures complex statistical relationships between different modalities. We extend a previous subspace method to include new priors on the projection weights, yielding more robust results. Applied to human speakers, our model finds a relationship between audio speech and video of facial motion, and partially segments background events in both channels. We present new results on the problem of audio-visual verification, and show how the audio and video of a speaker can be matched without a prior model of the speaker's voice or appearance.
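The idea of an informative subspace can be illustrated with a small sketch. It is not the authors' algorithm: the abstract describes a nonparametric method that learns projection weights under priors, whereas the code below swaps in a simple histogram (plug-in) mutual information estimate and only compares two hand-picked pairs of projections. The synthetic "audio" and "video" features, the shared `source` signal, the function `histogram_mi`, and all weight vectors are hypothetical names introduced for illustration.

```python
import numpy as np


def histogram_mi(a, b, bins=16):
    """Plug-in mutual information estimate (in nats) from a 2-D histogram.

    This is a stand-in estimator chosen for brevity, not the
    nonparametric estimator referred to in the abstract.
    """
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()                 # joint probability table
    p_a = p_ab.sum(axis=1, keepdims=True)      # marginal of a, shape (bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)      # marginal of b, shape (1, bins)
    mask = p_ab > 0                            # skip empty cells (0 * log 0 = 0)
    return float(np.sum(p_ab[mask] * np.log(p_ab[mask] / (p_a @ p_b)[mask])))


# Synthetic data (assumption): one shared source drives a single feature
# in each modality; the other feature in each modality is unrelated noise.
rng = np.random.default_rng(0)
n = 2000
source = rng.standard_normal(n)
audio = np.stack(
    [source + 0.3 * rng.standard_normal(n), rng.standard_normal(n)], axis=1
)
video = np.stack(
    [rng.standard_normal(n), np.tanh(source) + 0.3 * rng.standard_normal(n)], axis=1
)

# Hand-picked projection weights (assumption): one pair selects the
# source-driven features, the other pair selects the noise features.
w_audio_informative = np.array([1.0, 0.0])
w_video_informative = np.array([0.0, 1.0])
w_audio_noise = np.array([0.0, 1.0])
w_video_noise = np.array([1.0, 0.0])

mi_informative = histogram_mi(audio @ w_audio_informative, video @ w_video_informative)
mi_noise = histogram_mi(audio @ w_audio_noise, video @ w_video_noise)

print(f"MI of source-driven projections: {mi_informative:.3f} nats")
print(f"MI of noise projections:         {mi_noise:.3f} nats")
```

In this toy setting the projections that pick out the source-driven features score a higher mutual information than the projections onto noise, even though the video feature depends on the source nonlinearly. A learning procedure of the kind the abstract describes would search over the weights rather than compare fixed candidates.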
Pages: 4104-4107
Number of pages: 2