The regularized SNN-TA model for recognition of noisy speech

Cited by: 2

Authors:

Trentin, E [1]

Matassoni, M [1]

Affiliations:

[1] ITC Irst, Trent, Italy

Source:

IJCNN 2000: PROCEEDINGS OF THE IEEE-INNS-ENNS INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOL V | 2000

DOI:

10.1109/IJCNN.2000.861441

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The Segmental Neural Network (SNN) architecture was introduced at BBN by Zavaliagkos et al. for rescoring the N-best hypotheses yielded by a standard continuous-density hidden Markov model (CDHMM) applied to automatic speech recognition. An enhanced connectionist model, the SNN with trainable amplitude of activation functions (SNN-TA), is first used in this paper in place of the CDHMM to recognize isolated words. Viterbi-based segmentation, relying on the level-building algorithm, is then introduced; it can be combined with the SNN-TA to obtain a hybrid framework for continuous speech recognition. The paradigm is applied to the recognition of isolated digits collected in a real car environment under several noisy conditions (traffic, speed, road conditions, etc.) using a microphone placed far from the talker. We stress that robustness to noise can be increased by improving the generalization capabilities of the speech recognizer. In this perspective, while CDHMMs completely lack a proper regularization theory, a regularized SNN-TA model is discussed, which yields effective generalization and noise tolerance, outperforming the CDHMM on the noisy task under consideration.

Pages: 97-102

Number of pages: 6
