Advances in phase-aware signal processing in speech communication

Cited by: 100

Authors:

Mowlaee, Pejman [1]

Saeidi, Rahim [2]

Stylianou, Yannis [3]

Affiliations:

[1] Graz Univ Technol, Signal Proc & Speech Commun Lab, Graz, Austria

[2] Aalto Univ, Dept Signal Proc & Acoust, Espoo, Finland

[3] Univ Crete, Dept Comp Sci, Iraklion, Greece

Source:

SPEECH COMMUNICATION | 2016 / Vol. 81

Funding:

Austrian Science Fund; Academy of Finland;

Keywords:

Phase-aware speech processing; Phase-based features; Signal enhancement; Automatic speech recognition; Speaker recognition; Speech synthesis; Speech coding; Speech analysis; GROUP DELAY FUNCTIONS; SPECTRAL MAGNITUDE ESTIMATION; INTELLIGIBILITY PREDICTION; INSTANTANEOUS FREQUENCY; SOURCE SEPARATION; FOURIER SPECTRUM; MFCC FEATURES; ENHANCEMENT; RECONSTRUCTION; INFORMATION;

DOI:

10.1016/j.specom.2016.04.002

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

During the past three decades, the issue of processing spectral phase has been largely neglected in speech applications. There is no doubt that the interest of the speech processing community in the use of phase information across a broad spectrum of speech technologies, from automatic speech and speaker recognition to speech synthesis, and from speech enhancement and source separation to speech coding, is constantly increasing. In this paper, we elaborate on why phase was believed to be unimportant in each application. We provide an overview of advancements in phase-aware signal processing with applications to speech, showing that phase-aware speech processing can be beneficial in many cases and can complement the solutions that magnitude-only methods suggest. Our goal is to show that phase-aware signal processing is an important emerging field with high potential in current speech communication applications. The paper provides an extended and up-to-date bibliography on phase-aware speech processing, aiming to give interested readers the background necessary for following recent advancements in the area. Our review expands the step initiated by our organized special session and exemplifies the usefulness of spectral phase information in a wide range of speech processing applications. Finally, the overview provides some directions for future work. (C) 2016 Elsevier B.V. All rights reserved.


Pages: 1-29

Page count: 29
