MAPGN: Masked Pointer-Generator Network for Sequence-to-Sequence Pre-training

Cited by: 2
Authors
Ihori, Mana [1 ]
Makishima, Naoki [1 ]
Tanaka, Tomohiro [1 ]
Takashima, Akihiko [1 ]
Orihashi, Shota [1 ]
Masumura, Ryo [1 ]
Affiliations
[1] NTT Corp, NTT Media Intelligence Labs, Tokyo, Japan
Keywords
sequence-to-sequence pre-training; pointer-generator networks; self-supervised learning; spoken-text normalization
DOI
10.1109/ICASSP39728.2021.9414738
CLC number
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
This paper presents a self-supervised learning method for pointer-generator networks to improve spoken-text normalization. Spoken-text normalization, which converts spoken-style text into style-normalized text, is becoming an important technology for improving subsequent processing such as machine translation and summarization. The most successful spoken-text normalization method to date is sequence-to-sequence (seq2seq) mapping with pointer-generator networks, which incorporate a copy mechanism over the input sequence. However, these models require a large amount of paired spoken-style and style-normalized text, and such data are difficult to prepare in volume. To construct a spoken-text normalization model from limited paired data, we focus on self-supervised learning, which can exploit unpaired text data to improve seq2seq models. Unfortunately, conventional self-supervised learning methods are not designed with pointer-generator networks in mind. We therefore propose a novel self-supervised learning method, MAsked Pointer-Generator Network (MAPGN). The proposed method effectively pre-trains the pointer-generator network by learning to fill masked tokens using the copy mechanism. Our experiments demonstrate that MAPGN is more effective for pointer-generator networks than conventional self-supervised learning methods on two spoken-text normalization tasks.
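As a concrete illustration of the mechanism the abstract describes, the sketch below mixes a decoder's generation distribution with a copy distribution scattered over source token ids, and computes a masked-token restoration loss in the spirit of MAPGN. This is a minimal PyTorch sketch under assumed shapes and names (pointer_generator_dist, masked_restoration_loss, p_gen, etc. are illustrative, not the authors' implementation).

```python
# Minimal, illustrative sketch (not the authors' code) of:
#   (1) the pointer-generator mixture of generation and copy distributions, and
#   (2) a masked-token restoration loss in the spirit of MAPGN, where the
#       model recovers masked tokens partly via the copy mechanism.
import torch
import torch.nn.functional as F

def pointer_generator_dist(gen_logits, attn_weights, src_ids, p_gen):
    """Final distribution: p_gen * P_vocab + (1 - p_gen) * P_copy.

    gen_logits:   (batch, vocab)   decoder logits over the vocabulary
    attn_weights: (batch, src_len) attention over source positions (rows sum to 1)
    src_ids:      (batch, src_len) source token ids
    p_gen:        (batch, 1)       generation probability in [0, 1]
    """
    gen_dist = F.softmax(gen_logits, dim=-1)
    copy_dist = torch.zeros_like(gen_dist)
    # Scatter attention mass onto the vocabulary ids of the source tokens.
    copy_dist.scatter_add_(1, src_ids, attn_weights)
    return p_gen * gen_dist + (1.0 - p_gen) * copy_dist

def masked_restoration_loss(final_dist, target_ids, loss_mask):
    """Negative log-likelihood of the true tokens, averaged over masked positions."""
    nll = -torch.log(final_dist.gather(1, target_ids.unsqueeze(1)).squeeze(1) + 1e-9)
    return (nll * loss_mask).sum() / loss_mask.sum().clamp(min=1.0)

# Toy usage with random tensors (one decoding step for a batch of 2).
B, S, V = 2, 5, 20
dist = pointer_generator_dist(
    gen_logits=torch.randn(B, V),
    attn_weights=F.softmax(torch.randn(B, S), dim=-1),
    src_ids=torch.randint(0, V, (B, S)),
    p_gen=torch.sigmoid(torch.randn(B, 1)),
)
loss = masked_restoration_loss(dist, torch.randint(0, V, (B,)), torch.ones(B))
```

In an actual MAPGN-style pre-training step, target_ids would be the original tokens at masked source positions, so the objective rewards the network both for copying unmasked context and for generating the masked tokens.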
Pages: 7563-7567
Number of pages: 5
Related papers
50 items in total
  • [21] JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation
    Mao, Zhuoyuan
    Cromieres, Fabien
    Dabre, Raj
    Song, Haiyue
    Kurohashi, Sadao
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 3683 - 3691
  • [22] Entity Relations Based Pointer-Generator Network for Abstractive Text Summarization
    Huang, Tiancheng
    Lu, Guangquan
    Li, Zexin
    Song, Jiagang
    Wu, Lijuan
    [J]. ADVANCED DATA MINING AND APPLICATIONS, ADMA 2021, PT II, 2022, 13088 : 219 - 236
  • [23] Improving Pointer-Generator Network with Keywords Information for Chinese Abstractive Summarization
    Jiang, Xiaoping
    Hu, Po
    Hou, Liwei
    Wang, Xia
    [J]. NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT I, 2018, 11108 : 464 - 474
  • [24] A Fuzzy Training Framework for Controllable Sequence-to-Sequence Generation
    Li, Jiajia
    Wang, Ping
    Li, Zuchao
    Liu, Xi
    Utiyama, Masao
    Sumita, Eiichiro
    Zhao, Hai
    Ai, Haojun
    [J]. IEEE ACCESS, 2022, 10 : 92467 - 92480
  • [25] Named Entity Transliteration with Sequence-to-Sequence Neural Network
    Li, Zhongwei
    Chng, Eng Siong
    Li, Haizhou
    [J]. 2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2017, : 374 - 378
  • [26] Automatic Generation of Source Code Comments Model Based on Pointer-generator Network
    Niu, Chang-An
    Ge, Ji-Dong
    Tang, Ze
    Li, Chuan-Yi
    Zhou, Yu
    Luo, Bin
    [J]. Ruan Jian Xue Bao/Journal of Software, 2021, 32 (07): 2142 - 2165
  • [27] BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
    Nguyen Luong Tran
    Duong Minh Le
    Dat Quoc Nguyen
    [J]. INTERSPEECH 2022, 2022, : 1751 - 1755
  • [28] Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation
    Guo, Junliang
    Xu, Linli
    Chen, Enhong
    [J]. 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 376 - 385
  • [29] COUPLED TRAINING OF SEQUENCE-TO-SEQUENCE MODELS FOR ACCENTED SPEECH RECOGNITION
    Unni, Vinit
    Joshi, Nitish
    Jyothi, Preethi
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8254 - 8258
  • [30] A Sequence-to-Sequence Framework Based on Transformer With Masked Language Model for Optical Music Recognition
    Wen, Cuihong
    Zhu, Longjiao
    [J]. IEEE ACCESS, 2022, 10 : 118243 - 118252