Model-based monaural source separation using a vector-quantized phase-vocoder representation

Cited by: 0
Authors
Ellis, Daniel P. W. [1]
Weiss, Ron J. [1]
Affiliations
[1] Columbia Univ, LabROSA, Dept Elect Engn, New York, NY 10027 USA
Keywords
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
A vector quantizer (VQ) trained on short-time frames of a particular source can form an accurate non-parametric model of that source. This principle has been used in several previous source separation and enhancement schemes as a basis for filtering the original mixture. In this paper, we propose the "projection" of a corrupted target signal onto the constrained space represented by the model as a viable approach to source separation. We investigate some parameters of VQ encoding, including a more perceptually motivated distance measure and an encoding of phase derivatives that supports reconstruction directly from quantizer output alone. For the problem of separating speech from noise, we highlight some problems with this approach, including the need for sequential constraints (which we introduce with a simple hidden Markov model) and the choice of the best quantization for overlapping sources.
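The "projection" idea in the abstract can be illustrated with a minimal sketch: learn a codebook from clean frames of one source, then replace each frame of a corrupted signal with its nearest codeword. This is not the authors' implementation. It assumes generic real-valued feature frames, plain k-means training, and squared Euclidean distance, whereas the paper studies a perceptually motivated distance, a phase-derivative encoding, and HMM sequential constraints, none of which are reproduced here. The function names `train_codebook`, `quantize`, and `project` are hypothetical.

```python
import numpy as np


def quantize(frames, codebook):
    """Return, for each frame (row), the index of its nearest codeword.

    Assumption: squared Euclidean distance, not the paper's perceptual measure.
    """
    diffs = frames[:, None, :] - codebook[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def train_codebook(frames, n_codewords, n_iter=20, seed=0):
    """Fit a VQ codebook with plain k-means (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    # Initialise codewords from randomly chosen training frames.
    idx = rng.choice(len(frames), size=n_codewords, replace=False)
    codebook = frames[idx].astype(float)
    for _ in range(n_iter):
        labels = quantize(frames, codebook)
        for k in range(n_codewords):
            members = frames[labels == k]
            if len(members) > 0:  # an empty cluster keeps its old codeword
                codebook[k] = members.mean(axis=0)
    return codebook


def project(frames, codebook):
    """Project each frame onto the codebook: swap it for its nearest codeword."""
    return codebook[quantize(frames, codebook)]
```

In use, `train_codebook` would be run on frames of the clean target source and `project` on frames of the mixture, so that every output frame is constrained to lie in the set of learned codewords.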
Pages: 5815-5818
Number of pages: 4