Speaker Separation Using Visual Speech Features and Single-channel Audio

被引:0
|
作者
Khan, Faheem [1 ]
Milner, Ben [1 ]
机构
[1] Univ East Anglia, Sch Comp Sci, Norwich, Norfolk, England
关键词
Speaker separation; Wiener filter; visual features; audio-visual correlation; RECOGNITION;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This work proposes a method of single-channel speaker separation that uses visual speech information to extract a target speaker's speech from a mixture of speakers. The method requires a single audio input and visual features extracted from the mouth region of each speaker in the mixture. The visual information from speakers is used to create a visually-derived Wiener filter. The Wiener filter gains are then non-linearly adjusted by a perceptual gain transform to improve the quality and intelligibility of the target speech. Experimental results are presented that estimate the quality and intelligibility of the extracted target speaker and a comparison is made of different perceptual gain transforms. These show that significant gains are achieved by the application of the perceptual gain function.
引用
收藏
页码:3263 / 3267
页数:5
相关论文
共 50 条
  • [41] Robust Speaker Recognition Based on Single-Channel and Multi-Channel Speech Enhancement
    Taherian, Hassan
    Wang, Zhong-Qiu
    Chang, Jorge
    Wang, DeLiang
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 1293 - 1302
  • [42] Deep Clustering in Complex Domain for Single-Channel Speech Separation
    Liu, Runling
    Tang, Yu
    Mang, Hongwei
    2022 IEEE 17TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA), 2022, : 1463 - 1468
  • [43] Single-channel Speech Separation based on Gaussian Process Regression
    Le Dinh Nguyen
    Chen, Sih-Huei
    Tai, Tzu-Chiang
    Wang, Jia-Ching
    2018 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM 2018), 2018, : 275 - 278
  • [44] Ensemble System of Deep Neural Networks for Single-Channel Audio Separation
    Al-Kaltakchi, Musab T. S.
    Mohammad, Ahmad Saeed
    Woo, Wai Lok
    INFORMATION, 2023, 14 (07)
  • [45] A PITCH-AWARE APPROACH TO SINGLE-CHANNEL SPEECH SEPARATION
    Wang, Ke
    Soong, Frank
    Xie, Lei
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 296 - 300
  • [46] Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training
    Zhang, Peng
    Xu, Jiaming
    Shi, Jing
    Hao, Yunzhe
    Qin, Lei
    Xu, Bo
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [47] A Single Channel Audio-Visual Fusion Speech Separation Method Based on DCNN and BiLSTM
    Lan C.-F.
    Wang S.-B.
    Guo X.-X.
    Han Y.-L.
    Kang S.-Q.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2023, 51 (04): : 914 - 921
  • [48] Phase estimation for signal reconstruction in single-channel speech separation
    Mowlaee, Pejman
    Saeidi, Rahim
    Martin, Rainer
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1546 - 1549
  • [49] Sinusoidal Approach for the Single-Channel Speech Separation and Recognition Challenge
    Mowlaee, P.
    Saeidi, R.
    Tan, Z. -H.
    Christensen, M. G.
    Kinnunen, T.
    Franti, P.
    Jensen, S. H.
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 684 - +
  • [50] Single-Channel Audio Sources Separation via Optimum Mask Filter
    Fallah, Mahdi
    Asgari, Meysam
    UKSIM 2009: ELEVENTH INTERNATIONAL CONFERENCE ON COMPUTER MODELLING AND SIMULATION, 2009, : 228 - 232