Latent Dirichlet decomposition for single channel speaker separation

Authors
Raj, Bhiksha [1 ]
Shashanka, Madhusudana V. S. [1 ]
Smaragdis, Paris [1 ]
Affiliations
[1] Mitsubishi Elect Res Labs, Cambridge, MA 02139 USA
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
We present an algorithm for the separation of multiple speakers from mixed single-channel recordings by latent variable decomposition of the speech spectrogram. We model each magnitude spectral vector in the short-time Fourier transform of a speech signal as the outcome of a discrete random process that generates frequency bin indices. The distribution of the process is modeled as a mixture of multinomial distributions, such that the mixture weights of the component multinomials vary from analysis window to analysis window. The component multinomials are assumed to be speaker specific and are learned from training signals for each speaker. We model the prior distribution of the mixture weights for each speaker as a Dirichlet distribution. The distributions representing magnitude spectral vectors for the mixed signal are decomposed into mixtures of the multinomials for all component speakers. The frequency distribution, i.e., the spectrum, for each speaker is reconstructed from this decomposition.
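The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes an EM procedure of the kind commonly used for mixture-multinomial spectrogram models, and it replaces the per-speaker Dirichlet prior with a single symmetric Dirichlet (parameter `alpha`) over all mixture weights, applied as a MAP adjustment to the expected counts. The function names `learn_bases` and `separate`, the parameter `alpha`, and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def learn_bases(V, n_components, n_iter=100, seed=0):
    """Learn speaker-specific multinomials P(f|z) from a training
    magnitude spectrogram V (n_freq x n_frames) by EM."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    B = rng.random((n_freq, n_components)) + 1e-3       # P(f|z)
    B /= B.sum(axis=0, keepdims=True)
    W = np.full((n_components, n_frames), 1.0 / n_components)  # P_t(z)
    for _ in range(n_iter):
        R = V / (B @ W + EPS)        # observed / modeled, per (f, t)
        B_new = B * (R @ W.T)        # expected counts per (f, z)
        W_new = W * (B.T @ R)        # expected counts per (z, t)
        B = B_new / (B_new.sum(axis=0, keepdims=True) + EPS)
        W = W_new / (W_new.sum(axis=0, keepdims=True) + EPS)
    return B


def separate(V_mix, bases, alpha=1.0, n_iter=100):
    """Decompose a mixture spectrogram using fixed per-speaker bases.

    alpha is an ASSUMED symmetric Dirichlet parameter (alpha=1 is a
    flat prior); the per-speaker prior in the abstract is not modeled.
    Returns one magnitude-spectrogram estimate per speaker.
    """
    B = np.hstack(bases)                      # all speakers' multinomials
    n_total = B.shape[1]
    W = np.full((n_total, V_mix.shape[1]), 1.0 / n_total)
    for _ in range(n_iter):
        R = V_mix / (B @ W + EPS)
        counts = W * (B.T @ R)                # expected counts per (z, t)
        counts = np.maximum(counts + (alpha - 1.0), 0.0)  # MAP adjustment
        W = counts / (counts.sum(axis=0, keepdims=True) + EPS)
    model = B @ W + EPS
    estimates, start = [], 0
    for Bs in bases:
        k = Bs.shape[1]
        share = (Bs @ W[start:start + k]) / model  # speaker's share per bin
        estimates.append(share * V_mix)
        start += k
    return estimates


if __name__ == "__main__":
    # Synthetic stand-ins for two speakers' magnitude spectrograms.
    rng = np.random.default_rng(42)
    spk1 = rng.random((32, 4)) @ rng.random((4, 50))
    spk2 = rng.random((32, 4)) @ rng.random((4, 50))
    bases = [learn_bases(spk1, 4), learn_bases(spk2, 4, seed=1)]
    est1, est2 = separate(spk1 + spk2, bases)
    print(est1.shape, est2.shape)
```

In this sketch each speaker's estimate is the mixture magnitude scaled by that speaker's share of the modeled distribution in every time-frequency bin, so the estimates sum back to the mixture. Because the expected counts scale with frame energy, the effect of `alpha` depends on how the spectrogram is scaled.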
Pages: 5679-5682
Page count: 4