Advances in phase-aware signal processing in speech communication

Cited by: 100
Authors
Mowlaee, Pejman [1]
Saeidi, Rahim [2]
Stylianou, Yannis [3]
Affiliations
[1] Graz Univ Technol, Signal Proc & Speech Commun Lab, Graz, Austria
[2] Aalto Univ, Dept Signal Proc & Acoust, Espoo, Finland
[3] Univ Crete, Dept Comp Sci, Iraklion, Greece
Funding
Austrian Science Fund; Academy of Finland
Keywords
Phase-aware speech processing; Phase-based features; Signal enhancement; Automatic speech recognition; Speaker recognition; Speech synthesis; Speech coding; Speech analysis; GROUP DELAY FUNCTIONS; SPECTRAL MAGNITUDE ESTIMATION; INTELLIGIBILITY PREDICTION; INSTANTANEOUS FREQUENCY; SOURCE SEPARATION; FOURIER SPECTRUM; MFCC FEATURES; ENHANCEMENT; RECONSTRUCTION; INFORMATION;
DOI
10.1016/j.specom.2016.04.002
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206 ; 082403 ;
Abstract
During the past three decades, the issue of processing spectral phase has been largely neglected in speech applications. There is no doubt that the interest of the speech processing community in the use of phase information across a wide spectrum of speech technologies, from automatic speech and speaker recognition to speech synthesis, and from speech enhancement and source separation to speech coding, is constantly increasing. In this paper, we elaborate on why phase was believed to be unimportant in each application. We provide an overview of advancements in phase-aware signal processing with applications to speech, showing that phase-aware speech processing can be beneficial in many cases and can complement the solutions that magnitude-only methods suggest. Our goal is to show that phase-aware signal processing is an important emerging field with high potential in current speech communication applications. The paper provides an extended and up-to-date bibliography on phase-aware speech processing, aiming to give interested readers the background necessary to follow recent advancements in the area. Our review expands the step initiated by our organized special session and exemplifies the usefulness of spectral phase information in a wide range of speech processing applications. Finally, the overview provides some directions for future work. (C) 2016 Elsevier B.V. All rights reserved.
Pages: 1-29
Page count: 29
Related papers
50 in total
  • [1] Phase-Aware Signal Processing for Automatic Speech Recognition
    Fahringer, Johannes
    Schrank, Tobias
    Stahl, Johannes
    Mowlaee, Pejman
    Pernkopf, Franz
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016 : 3374 - 3378
  • [2] Investigation on the Band Importance of Phase-aware Speech Enhancement
    Zhang, Zhuohuang
    Williamson, Donald S.
    Shen, Yi
    INTERSPEECH 2022, 2022 : 4651 - 4655
  • [3] Phase-Aware Single-channel Speech Enhancement
    Mowlaee, Pejman
    Watanabe, Mario Kaoru
    Saeidi, Rahim
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013 : 1871 - 1873
  • [4] Phase-Aware Speech Enhancement With Complex Wiener Filter
    Nguyen, Huy
    Ho, Tuan Vu
    Akagi, Masato
    Unoki, Masashi
    IEEE ACCESS, 2023, 11 : 141573 - 141584
  • [5] Improved single channel phase-aware speech enhancement technique for low signal-to-noise ratio signal
    Samui, Suman
    Chakrabarti, Indrajit
    Ghosh, Soumya Kanti
    IET SIGNAL PROCESSING, 2016, 10 (06) : 641 - 650
  • [6] On Speech Intelligibility Estimation of Phase-Aware Single-Channel Speech Enhancement
    Gaich, Andreas
    Mowlaee, Pejman
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015 : 2553 - 2557
  • [7] Phase-Aware Speech Enhancement Based on Deep Neural Networks
    Zheng, Naijun
    Zhang, Xiao-Lei
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (01) : 63 - 76
  • [8] Phase-aware subspace decomposition for single channel speech separation
    Wiem, Belhedi
    Mohamed Anouar, Ben Messaoud
    Aicha, Bouzid
    IET SIGNAL PROCESSING, 2020, 14 (04) : 214 - 222
  • [9] PACDNN: A phase-aware composite deep neural network for speech enhancement
    Hasannezhad, Mojtaba
    Yu, Hongjiang
    Zhu, Wei-Ping
    Champagne, Benoit
    SPEECH COMMUNICATION, 2022, 136 : 1 - 13
  • [10] Vector-quantized Variational Autoencoder for Phase-aware Speech Enhancement
    Tuan Vu Ho
    Quoc Huy Nguyen
    Akagi, Masato
    Unoki, Masashi
    INTERSPEECH 2022, 2022 : 176 - 180