Towards an intelligent acoustic front-end for automatic speech recognition: built-in speaker normalization (BISN)

Cited by: 0
Authors
Yapanel, UH [1 ]
Hansen, JHL [1 ]
Affiliation
[1] Univ Colorado, Ctr Spoken Language Res, Robust Speech Proc Grp, Boulder, CO 80309 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Much effort over the past three decades has gone into formulating "ideal" acoustic features that represent the speech signal in a discriminative and compact manner while being robust to adverse conditions and invariant to speaker differences. A good way of making ASR systems invariant to speaker differences is to perform speaker normalization on the input features. The most popular speaker normalization technique is vocal tract length normalization (VTLN). However, its implementation requires immense computational resources and is not practically applicable in real-time/embedded ASR systems. In this paper, we propose a new speaker normalization algorithm entitled Built-in Speaker Normalization (BISN), which is performed on-the-fly within the newly proposed PMVDR acoustic front-end and significantly reduces computational resources, enabling its use within contemporary ASR systems. Evaluations using an in-car extended digit recognition task showed that the on-the-fly implementation of the BISN algorithm produced a relative word error rate (WER) reduction of 24% compared to a baseline with no speaker normalization.
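As a reading aid for the "relative WER reduction" figure quoted in the abstract, the short sketch below shows how such a figure is conventionally computed from two absolute WERs. The function name `relative_wer_reduction` and the two absolute WER values are illustrative assumptions; the abstract reports only the relative figure, not the underlying absolute error rates.

```python
def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - system_wer) / baseline_wer


# Hypothetical absolute WERs in percent, chosen only for illustration.
# They are NOT taken from the paper.
baseline = 10.0   # assumed WER without speaker normalization
normalized = 7.6  # assumed WER with speaker normalization

reduction = relative_wer_reduction(baseline, normalized)
print(f"Relative WER reduction: {reduction:.0%}")
```

A relative reduction is measured against the baseline error rate, so the same relative figure can correspond to very different absolute improvements depending on how high the baseline WER is.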
Pages: 949-952
Page count: 4
Related papers
50 in total
  • [1] Towards an Intelligent Acoustic Front End for Automatic Speech Recognition: Built-in Speaker Normalization
    Umit H. Yapanel
    John H.L. Hansen
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2008
  • [2] Towards an Intelligent Acoustic Front End for Automatic Speech Recognition: Built-in Speaker Normalization
    Yapanel, Umit H.
    Hansen, John H. L.
    [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2008, 2008 (1)
  • [3] An efficient front-end for automatic speech recognition
    Ahadi, SM
    Sheikhzadeh, H
    Brennan, RL
    Freeman, GH
    [J]. ICECS 2003: PROCEEDINGS OF THE 2003 10TH IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS AND SYSTEMS, VOLS 1-3, 2003, : 128 - 131
  • [4] Automatic Speech Recognition with a Cochlear Implant Front-End
    Nogueira, Waldo
    Harczos, Tamas
    Edler, Bernd
    Ostermann, Joern
    Buechner, Andreas
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1993 - +
  • [5] A Front-End Technique for Automatic Noisy Speech Recognition
    Naing, Hay Mar Soe
    Hidayat, Risanuri
    Hartanto, Rudy
    Miyanaga, Yoshikazu
    [J]. PROCEEDINGS OF 2020 23RD CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (ORIENTAL-COCOSDA 2020), 2020, : 49 - 54
  • [6] Perceptual MVDR-based Unsupervised Built-in Speaker Normalization for Kazakh Speech Recognition
    Yessenbayev, Zhandos
    Yapanel, Umit
    [J]. 2014 IEEE 8th International Conference on Application of Information and Communication Technologies (AICT), 2014, : 87 - 91
  • [7] An investigation into front-end signal processing for speaker normalization
    Umesh, S
    Sinha, R
    Kumar, SVB
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 345 - 348
  • [8] Auditory masking based acoustic front-end for robust speech recognition
    Paliwal, KK
    Lilly, BT
    [J]. IEEE TENCON'97 - IEEE REGIONAL 10 ANNUAL CONFERENCE, PROCEEDINGS, VOLS 1 AND 2: SPEECH AND IMAGE TECHNOLOGIES FOR COMPUTING AND TELECOMMUNICATIONS, 1997, : 165 - 168
  • [9] A Reassigned Front-End for Speech Recognition
    Tryfou, Georgina
    Omologo, Maurizio
    [J]. 2017 25TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2017, : 553 - 557
  • [10] RF Front-End Test Using Built-in Sensors
    Abdallah, Louay
    Stratigopoulos, Haralampos-G.
    Mir, Salvador
    Kelma, Christophe
    [J]. IEEE DESIGN & TEST OF COMPUTERS, 2011, 28 (06): : 76 - 84