A CNN-based approach to identification of degradations in speech signals

Cited by: 0

Authors:

Yuki Saishu

Amir Hossein Poorjam

Mads Græsbøll Christensen

Affiliations:

[1] Audio Analysis Lab

[2] CREATE

[3] Aalborg University

[4] Verisk Analytics

Source:

EURASIP Journal on Audio, Speech, and Music Processing, vol. 2021

Keywords:

Signal enhancement; Convolutional neural network; Identification of degradation; Quality control; Visualization

DOI:

Not available

Abstract:

The presence of degradations in speech signals causes an acoustic mismatch between training and operating conditions, which deteriorates the performance of many speech-based systems. A variety of enhancement techniques have been developed to compensate for this mismatch in speech-based applications. To apply these signal enhancement techniques, however, prior information about the presence and the type of degradations in the speech signal is required. In this paper, we propose a new convolutional neural network (CNN)-based approach to automatically identify the major types of degradation commonly encountered in speech-based applications, namely additive noise, nonlinear distortion, and reverberation. In this approach, a set of parallel CNNs, each detecting a certain degradation type, is applied to the log-mel spectrogram of the audio signal. Experimental results on two different speech types, namely pathological voice and normal running speech, show that the proposed method is effective in detecting the presence and the type of degradations in speech signals and that it outperforms the state-of-the-art method. Using score-weighted class activation mapping, we provide a visual analysis of how the network makes its decisions when identifying different types of degradation, highlighting the regions of the log-mel spectrogram that are most influential for the target degradation.
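
The parallel-detector structure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function `identify_degradations`, the `Spectrogram` and `Detector` type aliases, the 0.5 threshold, the fixed stub scores, and the 100 x 40 input size are all assumptions made for the example. It only shows how several independent detectors, each responsible for one degradation type, might be run on the same log-mel input so that more than one degradation can be reported at once.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical layout: a log-mel spectrogram as a list of frames,
# each frame being a list of log-mel band energies.
Spectrogram = Sequence[Sequence[float]]
# Hypothetical detector interface: maps a spectrogram to a score in [0, 1].
Detector = Callable[[Spectrogram], float]


def identify_degradations(
    log_mel: Spectrogram,
    detectors: Dict[str, Detector],
    threshold: float = 0.5,  # assumed decision threshold, not from the paper
) -> List[str]:
    """Run every detector on the same input and return the labels
    whose score reaches the threshold (several labels may be returned)."""
    return [
        name
        for name, detect in detectors.items()
        if detect(log_mel) >= threshold
    ]


# Placeholder detectors standing in for trained CNNs.
# The fixed scores are for illustration only.
stub_detectors: Dict[str, Detector] = {
    "additive noise": lambda spec: 0.9,
    "nonlinear distortion": lambda spec: 0.2,
    "reverberation": lambda spec: 0.7,
}

# Dummy input: 100 frames x 40 mel bands (arbitrary sizes).
dummy_input = [[0.0] * 40 for _ in range(100)]
detected = identify_degradations(dummy_input, stub_detectors)
```

With the stub scores above, `detected` contains the labels for additive noise and reverberation. In a real system each stub would be replaced by a trained network operating on an actual log-mel spectrogram.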
