Speaker Separation Using Visual Speech Features and Single-channel Audio

被引:0
|
作者
Khan, Faheem [1 ]
Milner, Ben [1 ]
机构
[1] Univ East Anglia, Sch Comp Sci, Norwich, Norfolk, England
关键词
Speaker separation; Wiener filter; visual features; audio-visual correlation; RECOGNITION;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This work proposes a method of single-channel speaker separation that uses visual speech information to extract a target speaker's speech from a mixture of speakers. The method requires a single audio input and visual features extracted from the mouth region of each speaker in the mixture. The visual information from speakers is used to create a visually-derived Wiener filter. The Wiener filter gains are then non-linearly adjusted by a perceptual gain transform to improve the quality and intelligibility of the target speech. Experimental results are presented that estimate the quality and intelligibility of the extracted target speaker and a comparison is made of different perceptual gain transforms. These show that significant gains are achieved by the application of the perceptual gain function.
引用
收藏
页码:3263 / 3267
页数:5
相关论文
共 50 条
  • [31] WHAMR!: NOISY AND REVERBERANT SINGLE-CHANNEL SPEECH SEPARATION
    Maciejewski, Matthew
    Wichern, Gordon
    McQuinn, Emmett
    Le Roux, Jonathan
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 696 - 700
  • [32] Speaker Independent Single Channel Source Separation Using Sinusoidal Features
    Ranjan, Shivesh
    Payton, Karen L.
    Mowlaee, Pejman
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1522 - 1525
  • [33] LEARNING A HIERARCHICAL DICTIONARY FOR SINGLE-CHANNEL SPEECH SEPARATION
    Bao, Guangzhao
    Xu, Yangfei
    Xu, Xu
    Ye, Zhongfu
    2014 IEEE WORKSHOP ON STATISTICAL SIGNAL PROCESSING (SSP), 2014, : 476 - 479
  • [34] Learning a Discriminative Dictionary for Single-Channel Speech Separation
    Bao, Guangzhao
    Xu, Yangfei
    Ye, Zhongfu
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (07) : 1130 - 1138
  • [35] Improved Phase Reconstruction in Single-Channel Speech Separation
    Mayer, Florian
    Mowlaee, Pejman
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1795 - 1799
  • [36] Single-Channel Speech Separation Focusing on Attention DE
    Li, Xinshu
    Tan, Zhenhua
    Xia, Zhenche
    Wu, Danke
    Zhang, Bin
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 3204 - 3209
  • [37] Optimum Mixture Estimator for single-channel Speech Separation
    Mowlaee, Pejman
    Sayadiyan, Abolghassem
    Sheikhan, Mansour
    2008 INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS, VOLS 1 AND 2, 2008, : 543 - +
  • [38] Single-channel speech separation based on modulation frequency
    Gu, Lingyun
    Stern, Richard M.
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 25 - 28
  • [39] Multiband audio modeling for single-channel acoustic source separation
    Reyes-Gomez, MJ
    Ellis, DPW
    Jojic, N
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL V, PROCEEDINGS: DESIGN AND IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS INDUSTRY TECHNOLOGY TRACKS MACHINE LEARNING FOR SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING SIGNAL PROCESSING FOR EDUCATION, 2004, : 641 - 644
  • [40] Single-channel Music/Speech Separation Using Non-linear Masks
    Mowlaee, P.
    Sayadian, A.
    Sheikhan, M.
    Fallah, M.
    2008 INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS, VOLS 1 AND 2, 2008, : 782 - +