Reconstruction techniques for improving the perceptual quality of binary masked speech

Cited by: 29
Authors
Williamson, Donald S. [1 ]
Wang, Yuxuan [1 ]
Wang, DeLiang [1 ,2 ]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Ohio State Univ, Ctr Cognit & Brain Sci, Columbus, OH 43210 USA
Source
Keywords
NONNEGATIVE MATRIX FACTORIZATION; SPARSE REPRESENTATION; INTELLIGIBILITY; NOISE; ALGORITHM; FEATURES;
DOI
10.1121/1.4884759
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This study proposes an approach to improve the perceptual quality of speech separated by binary masking through the use of reconstruction in the time-frequency domain. Non-negative matrix factorization and sparse reconstruction approaches are investigated, both using a linear combination of basis vectors to represent a signal. In this approach, the short-time Fourier transform (STFT) of separated speech is represented as a linear combination of STFTs from a clean speech dictionary. Binary masking for separation is performed using deep neural networks or Bayesian classifiers. The perceptual evaluation of speech quality, which is a standard objective speech quality measure, is used to evaluate the performance of the proposed approach. The results show that the proposed techniques improve the perceptual quality of binary masked speech, and outperform traditional time-frequency reconstruction approaches. © 2014 Acoustical Society of America.
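The dictionary-based reconstruction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the dictionary and the masked signal are available as non-negative STFT magnitude matrices, and it uses the standard multiplicative update for the Euclidean cost with the dictionary held fixed, which may differ from the cost function used in the paper. The function name `nmf_reconstruct` and the random toy data are hypothetical, chosen only for illustration.

```python
import numpy as np


def nmf_reconstruct(V, W, n_iter=200, eps=1e-12, seed=0):
    """Approximate V as W @ H with a fixed dictionary W.

    V : (n_freq, n_frames) non-negative magnitudes of binary-masked speech.
    W : (n_freq, n_atoms) non-negative clean-speech dictionary (assumed given).
    Returns the reconstruction W @ H and the activations H.
    """
    rng = np.random.default_rng(seed)
    # Random positive start so the multiplicative update can move every entry.
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V    # constant across iterations because W is fixed
    WtW = W.T @ W
    for _ in range(n_iter):
        # Multiplicative update for the Euclidean cost; H stays non-negative.
        H *= WtV / (WtW @ H + eps)
    return W @ H, H


# Toy data standing in for real spectrograms (purely illustrative).
rng = np.random.default_rng(42)
dictionary = rng.random((64, 10))              # hypothetical clean-speech atoms
clean = dictionary @ rng.random((10, 20))      # synthetic "clean" magnitudes
binary_mask = (rng.random(clean.shape) > 0.3).astype(float)
masked = clean * binary_mask                   # binary-masked magnitudes

reconstructed, activations = nmf_reconstruct(masked, dictionary)
print(reconstructed.shape, activations.shape)
```

Because the masked magnitudes are explained only through atoms drawn from clean speech, time-frequency units zeroed by the mask can receive non-zero energy in the reconstruction. A sparse reconstruction variant would additionally penalize or limit the number of active atoms per frame.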
Pages: 892-902
Number of pages: 11