Speech signals separation: A new approach exploiting the coherence of audio and visual speech

Cited by: 1

Authors:

Girin, L [1]

Allard, A [1]

Schwartz, JL [1]

Affiliations:

[1] Univ Grenoble 3, Inst Commun Parlee, Speech Commun Lab, INPG,CNRS,UMR 5009, F-38040 Grenoble 9, France

Source:

2001 IEEE FOURTH WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING | 2001

Keywords:

DOI:

10.1109/MMSP.2001.962803

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202; 0835

Abstract:

In this paper, we present a new approach to the source separation problem in the case of multiple speech signals. The method is based on the use of automatic lipreading: the objective is to extract an acoustic speech signal from other acoustic signals by exploiting its coherence with the speaker's lip movements. To this end, a statistical model is used to quantify this coherence. The results, while very preliminary, are encouraging. They show that this method can achieve a good separation of a speech source in the case of simple additive mixtures. Moreover, it presents some interesting complementarity with traditional pure audio techniques.

Citation

Pages: 631-636

Number of pages: 6

50 items in total

[1] Separation of audio-visual speech sources: A new approach exploiting the audio-visual coherence of speech stimuli
Sodoyer, D
Schwartz, JL
Girin, L
Klinkisch, J
Jutten, C
[J]. EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2002, 2002 (11) : 1165 - 1173
[2] Separation of Audio-Visual Speech Sources: A New Approach Exploiting the Audio-Visual Coherence of Speech Stimuli
David Sodoyer
Jean-Luc Schwartz
Laurent Girin
Jacob Klinkisch
Christian Jutten
[J]. EURASIP Journal on Advances in Signal Processing, 2002
[3] A new approach to integrate audio and visual features of speech
Pan, H
Liang, ZP
Huang, TS
[J]. 2000 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, PROCEEDINGS VOLS I-III, 2000, : 1093 - 1096
[4] RAVSSNet: Recurrent Audio Visual Speech Separation
Shankar, M. Chandan
Nag, Hemanth
Tripathi, Shikha
[J]. INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 2, 2023, 543 : 557 - 567
[5] TIME DOMAIN AUDIO VISUAL SPEECH SEPARATION
Wu, Jian
Xu, Yong
Zhang, Shi-Xiong
Chen, Lian-Wu
Yu, Meng
Xie, Lei
Yu, Dong
[J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 667 - 673
[6] Audio-Visual Deep Clustering for Speech Separation
Lu, Rui
Duan, Zhiyao
Zhang, Changshui
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (11) : 1697 - 1712
[7] Bayesian separation of audio-visual speech sources
Rajaram, S
Nefian, AV
Huang, TS
[J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL V, PROCEEDINGS: DESIGN AND IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS INDUSTRY TECHNOLOGY TRACKS MACHINE LEARNING FOR SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING SIGNAL PROCESSING FOR EDUCATION, 2004, : 657 - 660
[8] Speech extraction based on ica and audio-visual coherence
Sodoyer, D
Girin, L
Jutten, C
Schwartz, JL
[J]. SEVENTH INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND ITS APPLICATIONS, VOL 2, PROCEEDINGS, 2003, : 65 - 68
[9] DEEP AUDIO-VISUAL SPEECH SEPARATION WITH ATTENTION MECHANISM
Li, Chenda
Qian, Yanmin
[J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7314 - 7318
[10] Developing an audio-visual speech source separation algorithm
Sodoyer, D
Girin, L
Jutten, C
Schwartz, JL
[J]. SPEECH COMMUNICATION, 2004, 44 (1-4) : 113 - 125