Speech signals separation: A new approach exploiting the coherence of audio and visual speech

Cited by: 1
Authors
Girin, L [1]
Allard, A [1]
Schwartz, JL [1]
Affiliations
[1] Univ Grenoble 3, Inst Commun Parlee, Speech Commun Lab, INPG,CNRS,UMR 5009, F-38040 Grenoble 9, France
DOI
10.1109/MMSP.2001.962803
Chinese Library Classification number
TP31 [Computer software];
Subject classification code
081202 ; 0835 ;
Abstract
In this paper, we present a new approach to the source separation problem in the case of multiple speech signals. The method is based on the use of automatic lipreading: the objective is to extract an acoustic speech signal from other acoustic signals by exploiting its coherence with the speaker's lip movements. For this aim, a statistical model is used to quantify this coherence. The results, while very preliminary, are encouraging. They show that this method can achieve a good separation of a speech source in the case of simple additive mixtures. Moreover, it presents some interesting complementarity with traditional pure audio techniques.
Pages: 631-636
Number of pages: 6
Related papers
50 in total
  • [1] Separation of audio-visual speech sources: A new approach exploiting the audio-visual coherence of speech stimuli
    Sodoyer, D
    Schwartz, JL
    Girin, L
    Klinkisch, J
    Jutten, C
    [J]. EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2002, 2002 (11) : 1165 - 1173
  • [2] Separation of Audio-Visual Speech Sources: A New Approach Exploiting the Audio-Visual Coherence of Speech Stimuli
    David Sodoyer
    Jean-Luc Schwartz
    Laurent Girin
    Jacob Klinkisch
    Christian Jutten
    [J]. EURASIP Journal on Advances in Signal Processing, 2002
  • [3] A new approach to integrate audio and visual features of speech
    Pan, H
    Liang, ZP
    Huang, TS
    [J]. 2000 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, PROCEEDINGS VOLS I-III, 2000, : 1093 - 1096
  • [4] RAVSSNet: Recurrent Audio Visual Speech Separation
    Shankar, M. Chandan
    Nag, Hemanth
    Tripathi, Shikha
    [J]. INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 2, 2023, 543 : 557 - 567
  • [5] TIME DOMAIN AUDIO VISUAL SPEECH SEPARATION
    Wu, Jian
    Xu, Yong
    Zhang, Shi-Xiong
    Chen, Lian-Wu
    Yu, Meng
    Xie, Lei
    Yu, Dong
    [J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 667 - 673
  • [6] Audio-Visual Deep Clustering for Speech Separation
    Lu, Rui
    Duan, Zhiyao
    Zhang, Changshui
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (11) : 1697 - 1712
  • [7] Bayesian separation of audio-visual speech sources
    Rajaram, S
    Nefian, AV
    Huang, TS
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL V, PROCEEDINGS: DESIGN AND IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS INDUSTRY TECHNOLOGY TRACKS MACHINE LEARNING FOR SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING SIGNAL PROCESSING FOR EDUCATION, 2004, : 657 - 660
  • [8] Speech extraction based on ica and audio-visual coherence
    Sodoyer, D
    Girin, L
    Jutten, C
    Schwartz, JL
    [J]. SEVENTH INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND ITS APPLICATIONS, VOL 2, PROCEEDINGS, 2003, : 65 - 68
  • [9] DEEP AUDIO-VISUAL SPEECH SEPARATION WITH ATTENTION MECHANISM
    Li, Chenda
    Qian, Yanmin
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7314 - 7318
  • [10] Developing an audio-visual speech source separation algorithm
    Sodoyer, D
    Girin, L
    Jutten, C
    Schwartz, JL
    [J]. SPEECH COMMUNICATION, 2004, 44 (1-4) : 113 - 125