End-to-End Topic Classification without ASR

Cited by: 1
Authors
Dong, Zexian [1]
Liu, Jia [1]
Zhang, Wei-Qiang [1]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing Natl Res Ctr Informat Sci & Technol, Beijing 100084, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
topic identification; end-to-end system; mel-frequency cepstrum coefficients; convolutional neural network
DOI
10.1109/isspit47144.2019.9001833
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper explores an end-to-end model for topic classification without an automatic speech recognition (ASR) system. Typically, an ASR system converts the speech recording to text, and standard natural language processing (NLP) techniques then complete the topic identification task. For low-resource languages, however, the lack of transcribed text and of a good language model means that no practical speech recognition system is available. For this case, the paper proposes an end-to-end system for topic modeling based on mel-frequency cepstrum coefficient (MFCC) features. Compared with lexical discovery methods (such as segmental dynamic time warping (DTW)), the proposed method can be applied to large-scale datasets and significantly reduces training time and model complexity.
Pages: 5