End-to-End Topic Classification without ASR

Times cited: 1
Authors
Dong, Zexian [1]
Liu, Jia [1]
Zhang, Wei-Qiang [1]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing Natl Res Ctr Informat Sci & Technol, Beijing 100084, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
topic identification; end-to-end system; mel-frequency cepstrum coefficients; convolutional neural network
DOI
10.1109/isspit47144.2019.9001833
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper explores an end-to-end model for topic classification that does not rely on an automatic speech recognition (ASR) system. Conventionally, an ASR system converts the speech recording to text, and standard natural language processing (NLP) techniques are then used to complete the topic identification task. For low-resource languages, however, the lack of transcribed text and of a good language model means that no practical speech recognition system is available. For this case, we propose an end-to-end system for topic modeling based on mel-frequency cepstral coefficient (MFCC) features. Compared with lexical discovery methods (such as segmental dynamic time warping (DTW)), our method can be applied to large-scale datasets and significantly reduces training time and model complexity.
Pages: 5