MAPGN: Masked Pointer-Generator Network for Sequence-to-Sequence Pre-training

Cited by: 2
Authors
Ihori, Mana [1 ]
Makishima, Naoki [1 ]
Tanaka, Tomohiro [1 ]
Takashima, Akihiko [1 ]
Orihashi, Shota [1 ]
Masumura, Ryo [1 ]
Affiliations
[1] NTT Corp, NTT Media Intelligence Labs, Tokyo, Japan
Keywords
sequence-to-sequence pre-training; pointer-generator networks; self-supervised learning; spoken-text normalization
DOI
10.1109/ICASSP39728.2021.9414738
CLC number
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
This paper presents a self-supervised learning method for pointer-generator networks to improve spoken-text normalization. Spoken-text normalization, which converts spoken-style text into style-normalized text, is becoming an important technology for improving subsequent processing such as machine translation and summarization. The most successful spoken-text normalization method to date is sequence-to-sequence (seq2seq) mapping with pointer-generator networks, which incorporate a copy mechanism over the input sequence. However, these models require a large amount of paired spoken-style and style-normalized text, and such data are difficult to prepare in volume. To construct a spoken-text normalization model from limited paired data, we focus on self-supervised learning, which can exploit unpaired text data to improve seq2seq models. Unfortunately, conventional self-supervised learning methods are not designed with pointer-generator networks in mind. We therefore propose a novel self-supervised learning method, MAsked Pointer-Generator Network (MAPGN). The proposed method effectively pre-trains the pointer-generator network by learning to fill masked tokens using the copy mechanism. Our experiments demonstrate that MAPGN is more effective for pointer-generator networks than conventional self-supervised learning methods on two spoken-text normalization tasks.
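As a concrete illustration of the mechanism the abstract describes, the sketch below mixes a decoder's generation distribution with a copy distribution scattered over source token ids, and computes a masked-token restoration loss in the spirit of MAPGN. This is a minimal PyTorch sketch under assumed shapes and names (pointer_generator_dist, masked_restoration_loss, p_gen, etc. are illustrative, not the authors' implementation).

```python
# Minimal, illustrative sketch (not the authors' code) of:
#   (1) the pointer-generator mixture of generation and copy distributions, and
#   (2) a masked-token restoration loss in the spirit of MAPGN, where the
#       model recovers masked tokens partly via the copy mechanism.
import torch
import torch.nn.functional as F

def pointer_generator_dist(gen_logits, attn_weights, src_ids, p_gen):
    """Final distribution: p_gen * P_vocab + (1 - p_gen) * P_copy.

    gen_logits:   (batch, vocab)   decoder logits over the vocabulary
    attn_weights: (batch, src_len) attention over source positions (rows sum to 1)
    src_ids:      (batch, src_len) source token ids
    p_gen:        (batch, 1)       generation probability in [0, 1]
    """
    gen_dist = F.softmax(gen_logits, dim=-1)
    copy_dist = torch.zeros_like(gen_dist)
    # Scatter attention mass onto the vocabulary ids of the source tokens.
    copy_dist.scatter_add_(1, src_ids, attn_weights)
    return p_gen * gen_dist + (1.0 - p_gen) * copy_dist

def masked_restoration_loss(final_dist, target_ids, loss_mask):
    """Negative log-likelihood of the true tokens, averaged over masked positions."""
    nll = -torch.log(final_dist.gather(1, target_ids.unsqueeze(1)).squeeze(1) + 1e-9)
    return (nll * loss_mask).sum() / loss_mask.sum().clamp(min=1.0)

# Toy usage with random tensors (one decoding step for a batch of 2).
B, S, V = 2, 5, 20
dist = pointer_generator_dist(
    gen_logits=torch.randn(B, V),
    attn_weights=F.softmax(torch.randn(B, S), dim=-1),
    src_ids=torch.randint(0, V, (B, S)),
    p_gen=torch.sigmoid(torch.randn(B, 1)),
)
loss = masked_restoration_loss(dist, torch.randint(0, V, (B,)), torch.ones(B))
```

In an actual MAPGN-style pre-training step, target_ids would be the original tokens at masked source positions, so the objective rewards the network both for copying unmasked context and for generating the masked tokens.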
Pages: 7563-7567
Number of pages: 5
Related papers
50 items in total
  • [21] JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation
    Mao, Zhuoyuan
    Cromieres, Fabien
    Dabre, Raj
    Song, Haiyue
    Kurohashi, Sadao
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 3683 - 3691
  • [22] Entity Relations Based Pointer-Generator Network for Abstractive Text Summarization
    Huang, Tiancheng
    Lu, Guangquan
    Li, Zexin
    Song, Jiagang
    Wu, Lijuan
    [J]. ADVANCED DATA MINING AND APPLICATIONS, ADMA 2021, PT II, 2022, 13088 : 219 - 236
  • [23] Improving Pointer-Generator Network with Keywords Information for Chinese Abstractive Summarization
    Jiang, Xiaoping
    Hu, Po
    Hou, Liwei
    Wang, Xia
    [J]. NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT I, 2018, 11108 : 464 - 474
  • [24] A Fuzzy Training Framework for Controllable Sequence-to-Sequence Generation
    Li, Jiajia
    Wang, Ping
    Li, Zuchao
    Liu, Xi
    Utiyama, Masao
    Sumita, Eiichiro
    Zhao, Hai
    Ai, Haojun
    [J]. IEEE ACCESS, 2022, 10 : 92467 - 92480
  • [25] Named Entity Transliteration with Sequence-to-Sequence Neural Network
    Li, Zhongwei
    Chng, Eng Siong
    Li, Haizhou
    [J]. 2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2017, : 374 - 378
  • [26] Automatic Generation of Source Code Comments Model Based on Pointer-generator Network
    Niu, Chang-An
    Ge, Ji-Dong
    Tang, Ze
    Li, Chuan-Yi
    Zhou, Yu
    Luo, Bin
    [J]. Ruan Jian Xue Bao/Journal of Software, 2021, 32 (07): 2142 - 2165
  • [27] BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
    Nguyen Luong Tran
    Duong Minh Le
    Dat Quoc Nguyen
    [J]. INTERSPEECH 2022, 2022, : 1751 - 1755
  • [28] Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation
    Guo, Junliang
    Xu, Linli
    Chen, Enhong
    [J]. 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 376 - 385
  • [29] COUPLED TRAINING OF SEQUENCE-TO-SEQUENCE MODELS FOR ACCENTED SPEECH RECOGNITION
    Unni, Vinit
    Joshi, Nitish
    Jyothi, Preethi
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8254 - 8258
  • [30] A Sequence-to-Sequence Framework Based on Transformer With Masked Language Model for Optical Music Recognition
    Wen, Cuihong
    Zhu, Longjiao
    [J]. IEEE ACCESS, 2022, 10 : 118243 - 118252