Exemplar-Based Processing for Speech Recognition

Cited by: 35
Authors
Sainath, Tara N. [1]
Ramabhadran, Bhuvana [2, 3]
Nahamoo, David
Kanevsky, Dimitri [4, 5]
Van Compernolle, Dirk [6, 7]
Demuynck, Kris [8]
Gemmeke, Jort Florent
Bellegarda, Jerome R.
Sundaram, Shiva [9]
Affiliations
[1] IBM TJ Watson Ctr, Speech & Language Algorithms Grp, Yorktown Hts, NY USA
[2] IBM TJ Watson Ctr, Speech Transcript & Synth Res Grp, Yorktown Hts, NY USA
[3] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA
[4] IBM TJ Watson Ctr, Dept Speech & Language Algorithms, Yorktown Hts, NY USA
[5] Inst Adv Studies, Princeton, NJ USA
[6] Katholieke Univ Leuven, Dept Elect Engn, Louvain, Belgium
[7] INTERSPEECH, Antwerp, Belgium
[8] Katholieke Univ Leuven, Dept Elect Engn ESAT, Louvain, Belgium
[9] Tech Univ Berlin, Berlin, Germany
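Keywords
SPARSE IMPUTATION; FACE RECOGNITION; CLASSIFICATION; RETRIEVAL; ENTROPY
DOI
10.1109/MSP.2012.2208663
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract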
Solving real-world classification and recognition problems requires a principled way of modeling the physical phenomena generating the observed data and the uncertainty in it. The uncertainty originates from the fact that many data generation aspects are influenced by nondirectly measurable variables or are too complex to model and hence are treated as random fluctuations. For example, in speech production, uncertainty could arise from vocal tract variations among different people or corruption by noise. The goal of modeling is to establish a generalization from the set of observed data such that accurate inference (classification, decision, recognition) can be made about the data yet to be observed, which we refer to as unseen data. © 2012 IEEE.
Pages: 98-113
Page count: 16
Related papers
50 in total
  • [1] Enhancing Exemplar-Based Posteriors for Speech Recognition Tasks
    Sainath, Tara N.
    Nahamoo, David
    Kanevsky, Dimitri
    Ramabhadran, Bhuvana
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2127 - 2130
  • [2] Coupled Dictionaries for Exemplar-Based Speech Enhancement and Automatic Speech Recognition
    Baby, Deepak
    Virtanen, Tuomas
    Gemmeke, Jort F.
    van Hamme, Hugo
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (11) : 1788 - 1799
  • [3] EXEMPLAR-BASED SPEECH ENHANCEMENT FOR DEEP NEURAL NETWORK BASED AUTOMATIC SPEECH RECOGNITION
    Baby, Deepak
    Gemmeke, Jort F.
    Virtanen, Tuomas
    Van hamme, Hugo
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4485 - 4489
  • [4] Exemplar-Based Sparse Representations for Noise Robust Automatic Speech Recognition
    Gemmeke, Jort F.
    Virtanen, Tuomas
    Hurmalainen, Antti
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (07) : 2067 - 2080
  • [5] Exemplar-Based Emotive Speech Synthesis
    Wu, Xixin
    Cao, Yuewen
    Lu, Hui
    Liu, Songxiang
    Kang, Shiyin
    Wu, Zhiyong
    Liu, Xunying
    Meng, Helen
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 874 - 886
  • [6] Exemplar-based speech waveform generation
    Watts, Oliver
    Valentini-Botinhao, Cassia
    Espic, Felipe
    King, Simon
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2022 - 2026
  • [7] Exemplar-based logo and trademark recognition
    Farajzadeh, Nacer
    MACHINE VISION AND APPLICATIONS, 2015, 26 (06) : 791 - 805
  • [8] Exemplar-based facial expression recognition
    Farajzadeh, Nacer
    Hashemzadeh, Mahdi
    INFORMATION SCIENCES, 2018, 460 : 318 - 330
  • [9] Exemplar-based logo and trademark recognition
    Nacer Farajzadeh
    Machine Vision and Applications, 2015, 26 : 791 - 805
  • [10] Integrated exemplar-based template matching and statistical modeling for continuous speech recognition
    Xie Sun
    Yunxin Zhao
    EURASIP Journal on Audio, Speech, and Music Processing, 2014