MAPGN: MASKED POINTER-GENERATOR NETWORK FOR SEQUENCE-TO-SEQUENCE PRE-TRAINING

Cited by: 2
Authors
Ihori, Mana [1]
Makishima, Naoki [1]
Tanaka, Tomohiro [1]
Takashima, Akihiko [1]
Orihashi, Shota [1]
Masumura, Ryo [1]
Affiliations
[1] NTT Corp, NTT Media Intelligence Labs, Tokyo, Japan
Keywords
sequence-to-sequence pre-training; pointer-generator networks; self-supervised learning; spoken-text normalization
DOI
10.1109/ICASSP39728.2021.9414738
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents a self-supervised learning method for pointer-generator networks to improve spoken-text normalization. Spoken-text normalization, which converts spoken-style text into style-normalized text, is becoming an important technology for improving subsequent processing such as machine translation and summarization. The most successful spoken-text normalization method to date is sequence-to-sequence (seq2seq) mapping using pointer-generator networks, which possess a mechanism for copying from the input sequence. However, these models require a large amount of paired data of spoken-style text and style-normalized text, and it is difficult to prepare such a volume of data. To construct a spoken-text normalization model from limited paired data, we focus on self-supervised learning, which can utilize unpaired text data to improve seq2seq models. Unfortunately, conventional self-supervised learning methods do not assume that pointer-generator networks are utilized. Therefore, we propose a novel self-supervised learning method, MAsked Pointer-Generator Network (MAPGN). The proposed method can effectively pre-train the pointer-generator network by learning to fill masked tokens using the copy mechanism. Our experiments demonstrate that MAPGN is more effective for pointer-generator networks than conventional self-supervised learning methods in two spoken-text normalization tasks.
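As a rough illustration of the two ingredients the abstract names, the sketch below shows (a) the standard pointer-generator output distribution, which mixes a vocabulary distribution with attention-weighted copying from the source, and (b) a simple span-masking helper of the kind used in masked seq2seq pre-training. It is not taken from the paper: the function names, the `[MASK]` token, and the contiguous-span masking scheme are assumptions made for illustration only, and the abstract does not specify how MAPGN combines masking with copying.

```python
from collections import defaultdict


def pointer_generator_distribution(p_gen, vocab_probs, attention, source_tokens):
    """Mix generation and copy distributions (standard pointer-generator form).

    p_gen: probability of generating from the vocabulary, in [0, 1].
    vocab_probs: dict mapping token -> probability from the decoder softmax.
    attention: attention weights over source positions (assumed to sum to 1).
    source_tokens: source tokens aligned position-by-position with `attention`.
    """
    final = defaultdict(float)
    # Generation path: scale the vocabulary distribution by p_gen.
    for token, prob in vocab_probs.items():
        final[token] += p_gen * prob
    # Copy path: add attention mass to whichever token sits at each source position.
    for weight, token in zip(attention, source_tokens):
        final[token] += (1.0 - p_gen) * weight
    return dict(final)


def mask_span(tokens, start, length, mask_token="[MASK]"):
    """Hypothetical helper: mask a contiguous span and return (masked_input, target).

    The masking scheme is an assumption, not the one described in the paper.
    """
    masked = list(tokens)
    target = list(tokens[start:start + length])
    for i in range(start, min(start + length, len(tokens))):
        masked[i] = mask_token
    return masked, target
```

Because the copy path can assign probability to a source token that is absent from the output vocabulary, this kind of model can reproduce rare or unseen words from the input, which is the property that makes pointer-generator networks attractive for text normalization.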
Pages: 7563-7567
Number of pages: 5