Unsupervised and active learning in automatic speech recognition for call classification

Authors
Hakkani-Tür, D [1]
Tur, G [1]
Rahim, M [1]
Riccardi, G [1]
Affiliations
[1] AT&T Labs Res, Florham Pk, NJ 07932 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
A key challenge in rapidly building spoken natural language dialog applications is minimizing the manual effort required to transcribe and label speech data. This task is not only expensive but also time-consuming. In this paper, we present a novel approach that aims to reduce the amount of manually transcribed in-domain data required for building automatic speech recognition (ASR) models in spoken language dialog systems. Our method is based on mining relevant text from various conversational systems and web sites. An iterative process is employed in which the performance of the models can be improved through both unsupervised and active learning of the ASR models. We have evaluated the robustness of our approach on a call classification task selected from AT&T VoiceTone(SM) customer care. Our results indicate that with unsupervised learning it is possible to achieve a call classification performance that is only 1.5% lower than the upper bound set when using all available in-domain transcribed data.
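The abstract does not say how utterances are routed between unsupervised and active learning in each iteration. As a minimal sketch only, assuming a confidence-based split (a common choice, not confirmed by this record), one iteration might partition recognizer output as follows. The names `Hypothesis`, `split_by_confidence`, `threshold`, and `budget` are hypothetical and introduced here for illustration.

```python
from typing import List, NamedTuple, Tuple


class Hypothesis(NamedTuple):
    """One recognized utterance (hypothetical structure for illustration)."""
    utterance_id: str
    text: str          # 1-best ASR output
    confidence: float  # utterance-level confidence in [0, 1]


def split_by_confidence(
    hyps: List[Hypothesis], threshold: float, budget: int
) -> Tuple[List[Hypothesis], List[Hypothesis]]:
    """Partition ASR output for one iteration (assumed selection rule).

    Returns (to_transcribe, self_labeled):
      - to_transcribe: the `budget` least confident utterances below
        `threshold`, to be sent for manual transcription (active learning).
      - self_labeled: utterances at or above `threshold`, whose ASR output
        is reused as training text without manual work (unsupervised learning).
    """
    ranked = sorted(hyps, key=lambda h: h.confidence)
    low = [h for h in ranked if h.confidence < threshold]
    to_transcribe = low[:budget]
    self_labeled = [h for h in ranked if h.confidence >= threshold]
    return to_transcribe, self_labeled


if __name__ == "__main__":
    batch = [
        Hypothesis("u1", "i want to check my bill", 0.90),
        Hypothesis("u2", "uh the the thing", 0.20),
        Hypothesis("u3", "cancel my service", 0.50),
        Hypothesis("u4", "talk to an agent", 0.95),
    ]
    manual, automatic = split_by_confidence(batch, threshold=0.8, budget=1)
    print([h.utterance_id for h in manual])
    print([h.utterance_id for h in automatic])
```

In an iterative setup, the manually transcribed and self-labeled text would be added to the language model training data and the recognizer rerun on the remaining utterances; the threshold and budget here are placeholders, not values from the paper.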
Pages: 429-432
Page count: 4
Related papers
50 in total
  • [1] Application of automatic speech recognition in call classification
    Das, SS
    Chan, N
    Wages, D
    Hansen, JHL
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 3896 - 3899
  • [2] Active learning for automatic speech recognition
    Hakkani-Tür, D
    Riccardi, G
    Gorin, A
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 3904 - 3907
  • [3] SUPERVISED AND UNSUPERVISED ACTIVE LEARNING FOR AUTOMATIC SPEECH RECOGNITION OF LOW-RESOURCE LANGUAGES
    Syed, Ali Raza
    Rosenberg, Andrew
    Kislal, Ellen
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5320 - 5324
  • [4] Unsupervised Automatic Speech Recognition: A review
    Aldarmaki, Hanan
    Ullah, Asad
    Ram, Sreepratha
    Zaki, Nazar
    [J]. SPEECH COMMUNICATION, 2022, 139 : 76 - 91
  • [5] Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition
    Ni, Junrui
    Wang, Liming
    Gao, Heting
    Qian, Kaizhi
    Zhang, Yang
    Chang, Shiyu
    Hasegawa-Johnson, Mark
    [J]. INTERSPEECH 2022, 2022, : 461 - 465
  • [6] ACTIVE LEARNING FOR ACCENT ADAPTATION IN AUTOMATIC SPEECH RECOGNITION
    Nallasamy, Udhyakumar
    Metze, Florian
    Schultz, Tanja
    [J]. 2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012, : 360 - 365
  • [7] Active learning: Theory and applications to automatic speech recognition
    Riccardi, G
    Hakkani-Tür, D
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (04) : 504 - 511
  • [8] UNSUPERVISED LEARNING APPROACH TO FEATURE ANALYSIS FOR AUTOMATIC SPEECH EMOTION RECOGNITION
    Eskimez, Sefik Emre
    Duan, Zhiyao
    Heinzelman, Wendi
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5099 - 5103
  • [9] Almost Unsupervised Text to Speech and Automatic Speech Recognition
    Ren, Yi
    Tan, Xu
    Qin, Tao
    Zhao, Sheng
    Zhao, Zhou
    Liu, Tie-Yan
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [10] Detection of GSM speech coding for telephone call classification and automatic speaker recognition
    Dabrowski, Adam
    Drgas, Szymon
    Marciniak, Tomasz
    [J]. ICSES 2008 INTERNATIONAL CONFERENCE ON SIGNALS AND ELECTRONIC SYSTEMS, CONFERENCE PROCEEDINGS, 2008, : 415 - 418