Attention Weight Smoothing Using Prior Distributions for Transformer-Based End-to-End ASR

Cited by: 0
Authors
Maekaku, Takashi [1]
Fujita, Yuya [1]
Peng, Yifan [2]
Watanabe, Shinji [2]
Affiliations
[1] Yahoo Japan Corporation, Tokyo, Japan
[2] Carnegie Mellon University, PA, United States
Keywords
751.5; Speech;
DOI
Not available
Pages: 1071-1075
Related papers
50 in total
  • [31] End-to-End ASR with Adaptive Span Self-Attention
    Chang, Xuankai
    Subramanian, Aswin Shanmugam
    Guo, Pengcheng
    Watanabe, Shinji
    Fujita, Yuya
    Omachi, Motoi
    INTERSPEECH 2020, 2020, : 3595 - 3599
  • [32] Transformer-Based End-to-End Classification of Variable-Length Volumetric Data
    Oghbaie, Marzieh
    Araujo, Teresa
    Emre, Taha
    Schmidt-Erfurth, Ursula
    Bogunovic, Hrvoje
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT VI, 2023, 14225 : 358 - 367
  • [33] TOD-Net: An end-to-end transformer-based object detection network
    Sirisha, Museboyina
    Sudha, S. V.
    COMPUTERS & ELECTRICAL ENGINEERING, 2023, 108
  • [34] TRANSFORMER-BASED STREAMING ASR WITH CUMULATIVE ATTENTION
    Li, Mohan
    Zhang, Shucong
    Zorila, Catalin
    Doddipatla, Rama
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8272 - 8276
  • [35] End-to-End Asbestos Roof Detection on Orthophotos Using Transformer-Based YOLO Deep Neural Network
    Pace, Cesare Davide
    Bria, Alessandro
    Focareta, Mariano
    Lozupone, Gabriele
    Marrocco, Claudio
    Meoli, Giuseppe
    Molinara, Mario
    IMAGE ANALYSIS AND PROCESSING, ICIAP 2023, PT I, 2023, 14233 : 232 - 244
  • [36] STREAMING BILINGUAL END-TO-END ASR MODEL USING ATTENTION OVER MULTIPLE SOFTMAX
    Patil, Aditya
    Joshi, Vikas
    Agrawal, Purvi
    Mehta, Rupesh
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 252 - 259
  • [37] Leveraging Text Data Using Hybrid Transformer-LSTM Based End-to-End ASR in Transfer Learning
    Zeng, Zhiping
    Pham, Van Tung
    Xu, Haihua
    Khassanov, Yerbolat
    Chng, Eng Siong
    Ni, Chongjia
    Ma, Bin
    2021 12TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2021,
  • [38] OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images
    Zhao, Jiaqi
    Ding, Zeyu
    Zhou, Yong
    Zhu, Hancheng
    Du, Wen-Liang
    Yao, Rui
    El Saddik, Abdulmotaleb
    IEEE Transactions on Geoscience and Remote Sensing, 2024, 62
  • [39] FiLM Conditioning with Enhanced Feature to the Transformer-based End-to-End Noisy Speech Recognition
    Yang, Da-Hee
    Chang, Joon-Hyuk
    INTERSPEECH 2022, 2022, : 4098 - 4102
  • [40] HyperSFormer: A Transformer-Based End-to-End Hyperspectral Image Classification Method for Crop Classification
    Xie, Jiaxing
    Hua, Jiajun
    Chen, Shaonan
    Wu, Peiwen
    Gao, Peng
    Sun, Daozong
    Lyu, Zhendong
    Lyu, Shilei
    Xue, Xiuyun
    Lu, Jianqiang
    REMOTE SENSING, 2023, 15 (14)