N-BEST ENTROPY BASED DATA SELECTION FOR ACOUSTIC MODELING

Cited by: 0
Authors
Itoh, Nobuyasu [1]
Sainath, Tara N. [2]
Liang, Dan Ning [3]
Zhou, Lie [3]
Ramabhadran, Bhuvana [2]
Affiliations
[1] IBM Japan Ltd, IBM Res Tokyo, Yamato 2428502, Japan
[2] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
[3] IBM Res Corp, Beijing 100193, Peoples R China
Keywords
N-best entropy; Acoustic modeling; Active learning; Data selection; Speech recognition;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
This paper presents a strategy for efficiently selecting informative data from large corpora of untranscribed speech. Confidence-based selection methods (i.e., selecting the utterances we are least confident about) have been a popular approach, but they look only at the top hypothesis when selecting utterances and tend to select outliers, and therefore do not always improve overall recognition accuracy. As an alternative, we propose a method that selects data by looking at competing hypotheses, computing the entropy of the N-best hypotheses decoded by the baseline acoustic model. In addition, we address the issue of outliers by calculating, via a tf-idf score, how representative a specific utterance is of all other unselected utterances. Experiments show that N-best entropy based selection (5.8% relative in a 400-hour corpus) outperformed other conventional selection strategies, namely confidence based and lattice entropy based selection, and that tf-idf based representativeness improved the model further (6.2% relative). A comparison with random selection is also presented. Finally, the impact of model size is discussed.
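The N-best entropy criterion described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything below is an assumption: the hypothesis scores are treated as log-domain scores and normalized with a softmax over the N-best list, and the names `nbest_entropy`, `select_most_uncertain`, and the toy utterance IDs are hypothetical. The tf-idf representativeness term is not modeled here.

```python
import math

def nbest_entropy(log_scores):
    """Entropy (in nats) of a posterior over one utterance's N-best list.

    Assumption: log_scores are log-domain hypothesis scores; they are
    normalized with a softmax so the N hypotheses sum to probability 1.
    """
    m = max(log_scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in log_scores]
    total = sum(weights)
    probs = [w / total for w in weights]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_most_uncertain(nbest_by_utt, k):
    """Return the k utterance IDs with the highest N-best entropy."""
    ranked = sorted(
        nbest_by_utt,
        key=lambda utt: nbest_entropy(nbest_by_utt[utt]),
        reverse=True,
    )
    return ranked[:k]

# Toy data (hypothetical): utterance ID -> log scores of its 4-best list.
nbest = {
    "utt_a": [-10.0, -10.0, -10.0, -10.0],  # hypotheses tie: high entropy
    "utt_b": [-5.0, -25.0, -30.0, -40.0],   # one clear winner: low entropy
    "utt_c": [-8.0, -8.5, -12.0, -15.0],    # two close competitors
}
selected = select_most_uncertain(nbest, 2)
```

In this sketch an utterance whose competing hypotheses score almost equally gets high entropy and is picked first, whereas a confidence-based criterion would look only at the top hypothesis.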
Pages: 4133 - 4136
Page count: 4