Latent Dirichlet decomposition for single channel speaker separation

Cited by: 0

Authors:

Raj, Bhiksha [1]

Shashanka, Madhusudana V. S. [1]

Smaragdis, Paris [1]

Affiliations:

[1] Mitsubishi Elect Res Labs, Cambridge, MA 02139 USA

Source:

2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13 | 2006

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

We present an algorithm for the separation of multiple speakers from mixed single-channel recordings by latent variable decomposition of the speech spectrogram. We model each magnitude spectral vector in the short-time Fourier transform of a speech signal as the outcome of a discrete random process that generates frequency bin indices. The distribution of the process is modeled as a mixture of multinomial distributions, such that the mixture weights of the component multinomials vary from analysis window to analysis window. The component multinomials are assumed to be speaker specific and are learned from training signals for each speaker. We model the prior distribution of the mixture weights for each speaker as a Dirichlet distribution. The distributions representing magnitude spectral vectors for the mixed signal are decomposed into mixtures of the multinomials for all component speakers. The frequency distribution, i.e., the spectrum, for each speaker is reconstructed from this decomposition.
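The decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes two speakers, speaker-specific multinomials that have already been learned and are stored as matrix columns, and a single symmetric Dirichlet parameter `alpha` shared by all components (the abstract describes a per-speaker Dirichlet prior whose exact form is not given here). The function name `separate_frame` and all parameter names are hypothetical.

```python
import numpy as np

def separate_frame(v, bases_a, bases_b, alpha=1.0, n_iter=100):
    """Split one mixed magnitude spectrum between two speakers.

    v        : mixed magnitude spectrum of one analysis window, shape (F,)
    bases_a  : (F, Ka) array; each column is a multinomial over frequency
               bins (sums to 1), assumed pre-trained for speaker A
    bases_b  : (F, Kb) array, same convention for speaker B
    alpha    : symmetric Dirichlet parameter (an assumption of this sketch);
               alpha = 1 reduces to plain maximum likelihood
    """
    eps = 1e-12
    bases = np.hstack([bases_a, bases_b])          # (F, Ka + Kb)
    n_comp = bases.shape[1]
    w = np.full(n_comp, 1.0 / n_comp)              # mixture weights for this frame

    for _ in range(n_iter):
        # E-step: posterior probability of each component given a frequency bin
        joint = bases * w                          # (F, Ka + Kb)
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)
        # M-step: spectrum-weighted expected counts plus Dirichlet pseudo-counts
        counts = post.T @ v + (alpha - 1.0)
        counts = np.maximum(counts, eps)           # keep weights non-negative
        w = counts / counts.sum()

    # Final posteriors decide how much of each bin belongs to speaker A
    joint = bases * w
    post = joint / (joint.sum(axis=1, keepdims=True) + eps)
    share_a = post[:, : bases_a.shape[1]].sum(axis=1)
    return v * share_a, v * (1.0 - share_a)
```

Applied to every column of a mixed spectrogram, this yields one estimated magnitude spectrogram per speaker. The two estimates sum to the mixture by construction, which is a simplification chosen for this sketch.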


Pages: 5679-5682

Page count: 4

50 items in total

[1] Latent variable decomposition of spectrograms for single channel speaker separation
Raj, B
Smaragdis, P
[J]. 2005 WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2005, : 17 - 20
[2] Sparse overcomplete decomposition for single channel speaker separation
Shashanka, Madhusudana V. S.
Raj, Bhiksha
Smaragdis, Paris
[J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PTS 1-3, 2007, : 641 - +
[3] Speaker Verification Based on Single Channel Speech Separation
Jin, Rong
Ablimit, Mijit
Hamdulla, Askar
[J]. IEEE ACCESS, 2023, 11 : 112631 - 112638
[4] Soft mask methods for single-channel speaker separation
Reddy, Aarthi M.
Raj, Bhiksha
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (06): : 1766 - 1776
[5] JOINT SINGLE-CHANNEL SPEECH SEPARATION AND SPEAKER IDENTIFICATION
Mowlaee, P.
Saeidi, R.
Tan, Z. -H.
Christensen, M. G.
Franti, P.
Jensen, S. H.
[J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4430 - 4433
[6] Using audio and visual information for single channel speaker separation
Khan, Faheem
Milner, Ben
[J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1517 - 1521
[7] UNIVERSAL SPEECH MODELS FOR SPEAKER INDEPENDENT SINGLE CHANNEL SOURCE SEPARATION
Sun, Dennis L.
Mysore, Gautham J.
[J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 141 - 145
[8] Speaker Counting and Separation From Single-Channel Noisy Mixtures
Chetupalli, Srikanth Raj
Habets, Emanuel A. P.
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 1681 - 1692
[9] Feasibility of single channel speaker separation based on modulation frequency analysis
Schimmel, Steven M.
Atlas, Les E.
Nie, Kaibao
[J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 605 - +
[10] Speaker Independent Single Channel Source Separation Using Sinusoidal Features
Ranjan, Shivesh
Payton, Karen L.
Mowlaee, Pejman
[J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1522 - 1525
