End-to-End Topic Classification without ASR

Cited by: 1
Authors
Dong, Zexian [1]
Liu, Jia [1]
Zhang, Wei-Qiang [1]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing Natl Res Ctr Informat Sci & Technol, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
topic identification; end-to-end system; mel-frequency cepstrum coefficients; convolutional neural network
DOI
10.1109/isspit47144.2019.9001833
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper explores an end-to-end model for topic classification that does not rely on an automatic speech recognition (ASR) system. Typically, an ASR system converts the speech recording to text, and standard natural language processing (NLP) techniques are then applied to complete the topic identification task. However, for low-resource languages, the lack of transcribed text and of a good language model means that no practical speech recognition system is available. For this case, we propose an end-to-end system for topic modeling based on mel-frequency cepstral coefficient (MFCC) features. Compared with lexical discovery methods (such as segmental dynamic time warping (DTW)), our method can be applied to large-scale datasets and significantly reduces training time and model complexity.
Pages: 5