Speaker Separation Using Visual Speech Features and Single-channel Audio

Times cited: 0

Authors:

Khan, Faheem [1]

Milner, Ben [1]

Affiliations:

[1] Univ East Anglia, Sch Comp Sci, Norwich, Norfolk, England

Source:

14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013

Keywords:

Speaker separation; Wiener filter; visual features; audio-visual correlation; RECOGNITION

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

This work proposes a method of single-channel speaker separation that uses visual speech information to extract a target speaker's speech from a mixture of speakers. The method requires a single audio input and visual features extracted from the mouth region of each speaker in the mixture. The visual information from the speakers is used to create a visually-derived Wiener filter. The Wiener filter gains are then non-linearly adjusted by a perceptual gain transform to improve the quality and intelligibility of the target speech. Experimental results estimate the quality and intelligibility of the extracted target speech and compare different perceptual gain transforms. They show that applying the perceptual gain function yields significant gains.
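
As a rough illustration of the two stages named in the abstract, the sketch below computes a standard Wiener gain from per-bin power estimates of the target and the interfering speaker, then applies a non-linear adjustment to that gain. It is a minimal sketch under stated assumptions, not the authors' implementation: the power values are toy numbers (in the paper they would be derived from visual features), and `power_law_transform` with its `exponent` parameter is a hypothetical stand-in, since the abstract does not specify the perceptual gain transforms that were compared.

```python
import math


def wiener_gain(target_psd, interferer_psd, eps=1e-12):
    """Standard Wiener gain per frequency bin: S_target / (S_target + S_interferer)."""
    return [t / (t + i + eps) for t, i in zip(target_psd, interferer_psd)]


def power_law_transform(gains, exponent=2.0):
    """Hypothetical non-linear gain adjustment (an assumption, not the paper's transform).

    Raising a gain in [0, 1] to a power above 1 attenuates low-gain bins more
    strongly than high-gain bins.
    """
    return [min(max(g, 0.0), 1.0) ** exponent for g in gains]


# Toy per-bin power estimates (assumed values, one frame with four bins).
target_psd = [4.0, 1.0, 0.0, 9.0]
interferer_psd = [1.0, 1.0, 2.0, 1.0]

# Mixture magnitude, assuming the two sources are uncorrelated so powers add.
mixture_mag = [math.sqrt(t + i) for t, i in zip(target_psd, interferer_psd)]

raw_gains = wiener_gain(target_psd, interferer_psd)
adjusted_gains = power_law_transform(raw_gains)

# Estimated target magnitude: adjusted gain applied to the mixture magnitude.
target_mag_estimate = [g * m for g, m in zip(adjusted_gains, mixture_mag)]
```

In a full system the gains would be applied to each short-time spectral frame of the mixture and the result resynthesised with the mixture phase; that step is omitted here.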

Pages: 3263-3267

Page count: 5

