Speaker Separation Using Visual Speech Features and Single-channel Audio

Cited by: 0

Authors:

Khan, Faheem [1]

Milner, Ben [1]

Affiliations:

[1] Univ East Anglia, Sch Comp Sci, Norwich, Norfolk, England

Source:

14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013

Keywords:

Speaker separation; Wiener filter; visual features; audio-visual correlation; RECOGNITION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This work proposes a method of single-channel speaker separation that uses visual speech information to extract a target speaker's speech from a mixture of speakers. The method requires a single audio input and visual features extracted from the mouth region of each speaker in the mixture. The visual information from speakers is used to create a visually-derived Wiener filter. The Wiener filter gains are then non-linearly adjusted by a perceptual gain transform to improve the quality and intelligibility of the target speech. Experimental results are presented that estimate the quality and intelligibility of the extracted target speaker and a comparison is made of different perceptual gain transforms. These show that significant gains are achieved by the application of the perceptual gain function.
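
The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the target and interferer power spectra have already been estimated (in the paper they are derived from visual features, which is not modelled here), and the power-law `perceptual_gain_transform` with its `exponent` parameter is a hypothetical stand-in for the perceptual gain transforms the paper compares. All function names are illustrative.

```python
import numpy as np

def wiener_gain(target_psd, interferer_psd, eps=1e-12):
    """Wiener gain per time-frequency bin: S_target / (S_target + S_interferer)."""
    return target_psd / (target_psd + interferer_psd + eps)

def perceptual_gain_transform(gain, exponent=2.0):
    """Hypothetical non-linear adjustment (assumption, not from the paper):
    a power law that suppresses bins with low Wiener gain more strongly."""
    return np.clip(gain, 0.0, 1.0) ** exponent

def separate(mixture_stft, target_psd, interferer_psd, exponent=2.0):
    """Apply the transformed gain to the mixture spectrum (phase is left unchanged)."""
    gain = perceptual_gain_transform(wiener_gain(target_psd, interferer_psd), exponent)
    return gain * mixture_stft

# Toy example: one frame, two frequency bins.
target_psd = np.array([[1.0, 3.0]])
interferer_psd = np.array([[1.0, 1.0]])
mixture_stft = np.array([[2.0 + 0j, 4.0 + 0j]])
target_estimate = separate(mixture_stft, target_psd, interferer_psd)
```

With `exponent=1.0` the sketch reduces to a plain Wiener filter; larger exponents attenuate interferer-dominated bins more aggressively at the cost of more distortion in the target.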

Pages: 3263-3267

Number of pages: 5
