Application of Inverse Filtering in Enhancement of Whisper Recognition

Cited by: 0
Authors
Grozdic, Dorde T. [1 ,2 ]
Jovicic, Slobodan T. [1 ,2 ]
Galic, Jovan [3 ]
Markovic, Branko [4 ]
Affiliations
[1] Univ Belgrade, Sch Elect Engn, Bulevar Kralja Aleksandra 73, Belgrade 11000, Serbia
[2] Life Act Adv Ctr, Lab Forens Acoust & Phonet, Belgrade 11000, Serbia
[3] Univ Banja Luka, Fac Elect Engn, Banja Luka, Bosnia & Herceg
[4] Cacak Tech Coll, Cacak, Serbia
Keywords
ANN; Inverse filtering; MPL; Speech recognition; Whisper;
DOI
Not available
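Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Subject classification code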
0808 ; 0809 ;
Abstract
The differences between normal speech and whisper, particularly in their acoustic characteristics, pose a serious problem for ASR (Automatic Speech Recognition) systems. This paper presents preliminary results for a new method of speech signal pre-processing based on inverse filtering. This pre-processing improves whisper recognition with ANNs (Artificial Neural Networks). The ANNs showed high capability in speech and whisper recognition in matched train/test scenarios, with an average recognition accuracy of 99.8%. However, recognition scores in mismatched train/test scenarios were highly degraded. Because of their practical significance, the mismatched train/test scenarios were analyzed in detail in this research. The speech/whisper scenario is particularly important: it corresponds to the real-life situation in which a speaker in front of an ASR system switches from speech to whisper. The use of the inverse filter improved whisper recognition by 9.48%, to 70.25% in this scenario.
Pages: 157-161
Page count: 5
Related papers
50 in total
  • [31] Whisper Intelligibility Enhancement Using a Supervised Learning Approach
    Zhou, Jian
    Liang, Ruiyu
    Zhao, Li
    Zou, Cairong
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2012, 31 (06) : 2061 - 2074
  • [32] Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation
    Rouditchenko, Andrew
    Gong, Yuan
    Thomas, Samuel
    Karlinsky, Leonid
    Kuehne, Hilde
    Feris, Rogerio
    Glass, James
    INTERSPEECH 2024, 2024, : 2420 - 2424
  • [33] Search organization in the whisper continuous speech recognition system
    Alleva, F
    1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, PROCEEDINGS, 1997, : 295 - 302
  • [34] Analysis and Calibration of Lombard Effect and Whisper for Speaker Recognition
    Kelly, Finnian
    Hansen, John H. L.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 927 - 942
  • [35] A POLYNOMIAL APPROACH TO OPTIMAL AND ADAPTIVE FILTERING WITH APPLICATION TO SPEECH ENHANCEMENT
    MOIR, TJ
    CAMPBELL, DR
    DABIS, HS
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1991, 39 (05) : 1221 - 1224
  • [36] Inverse filtering of radar signals using compressed sensing with application to meteors
    Volz, Ryan
    Close, Sigrid
    RADIO SCIENCE, 2012, 47
  • [37] APPLICATION OF HARMONIC-ANALYSIS AND STANDARD INVERSE FILTERING TO A DTA SYSTEM
    FONT, J
    MUNTASELL, J
    NAVARRO, J
    TAMARIT, JL
    CESARI, E
    THERMOCHIMICA ACTA, 1986, 108 : 337 - 343
  • [38] The Enhancement and Application of Collaborative Filtering in e-Learning System
    Song, Bo
    Gao, Jie
    ADVANCES IN SWARM INTELLIGENCE, ICSI 2014, PT II, 2014, 8795 : 188 - 195
  • [39] Application of adaptive inverse filtering approach in weigh-in-motion of vehicles
    Yu Jinsong
    Wu Jie
    Wan Jiuqing
    Li Xingshan
    SENSORS, AUTOMATIC MEASUREMENT, CONTROL, AND COMPUTER SIMULATION, PTS 1 AND 2, 2006, 6358
  • [40] APPLICATION OF DIGITAL INVERSE FILTERING TECHNIQUES TO SOUND FIELD SPECTRAL FLATTENING
    TALKIN, D
    OPAUSKI, M
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1977, 62 : S12 - S12