SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge

Cited by: 0
Authors
Zhu, Jiaxu [1]
Song, Changhe [1, 2]
Wu, Zhiyong [1, 2, 3]
Meng, Helen [3]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen, Peoples R China
[2] Peng Cheng Lab, Shenzhen, Peoples R China
[3] Chinese Univ Hong Kong, Hong Kong, Peoples R China
Source
Keywords
speech recognition; sememe; long-tailed problem; domain generalization
DOI
10.21437/Interspeech.2023-1432
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speech recognition has made excellent progress in recent years. However, purely data-driven approaches still struggle with domain mismatch and long-tailed data. Because knowledge-driven approaches can help offset the weaknesses of data-driven ones, we introduce sememe-based semantic knowledge into speech recognition (SememeASR). A sememe, by its linguistic definition, is the minimum semantic unit of a language, and it represents the implicit semantic information behind each word well. Our experiments show that introducing sememe information improves the effectiveness of speech recognition. Further experiments show that sememe knowledge improves the model's recognition of long-tailed data and enhances its domain generalization ability.
Pages: 3272-3276
Number of pages: 5
Related papers
4 in total
  • [1] Semantic Data Augmentation for End-to-End Mandarin Speech Recognition
    Sun, Jianwei
    Tang, Zhiyuan
    Yin, Hengxin
    Wang, Wei
    Zhao, Xi
    Zhao, Shuaijiang
    Lei, Xiaoning
    Zou, Wei
    Li, Xiangang
    INTERSPEECH 2021, 2021: 1269-1273
  • [2] Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition
    Li, Qiujia
    Zhang, Yu
    Qiu, David
    He, Yanzhang
    Cao, Liangliang
    Woodland, Philip C.
    2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 6537-6541
  • [3] Text Only Domain Adaptation with Phoneme Guided Data Splicing for End-to-End Speech Recognition
    Wang, Wei
    Gong, Xun
    Shao, Hang
    Yang, Dongning
    Qian, Yanmin
    INTERSPEECH 2023, 2023: 3347-3351
  • [4] Integrating Knowledge Into End-to-End Speech Recognition From External Text-Only Data
    Bai, Ye
    Yi, Jiangyan
    Tao, Jianhua
    Wen, Zhengqi
    Tian, Zhengkun
    Zhang, Shuai
    IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, 29: 1340-1351