Leveraging Adversarial Training to Facilitate Grammatical Error Correction

Cited by: 0
Authors:
Dang, Kai [1]
Xie, Jiaying [1]
Liu, Jie [1]
Affiliations:
[1] Nankai Univ, Coll Artificial Intelligence, Tianjin, Peoples R China
Funding:
National Natural Science Foundation of China
Keywords:
Grammatical error correction; Adversarial training; Consistency constraint
DOI:
10.1007/978-3-030-86362-3_6
Chinese Library Classification:
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes:
081104; 0812; 0835; 1405
Abstract:
The grammatical error correction (GEC) task aims to detect and correct grammatical errors in sentences. Recently, pre-trained language models have provided a strong baseline for GEC, achieving excellent results after fine-tuning on a small amount of annotated data. However, owing to the lack of large-scale erroneous-corrected parallel datasets, these models tend to overfit. Previous researchers have proposed a variety of data augmentation methods to generate additional training data and enlarge the dataset, but these methods either rely on hand-crafted rules to generate grammatical errors and are therefore not fully automated, or produce errors that do not resemble human writing errors. Moreover, the pre-trained model only improves significantly after fine-tuning on task-specific data; otherwise, highly noisy data can impair its performance. To address this issue, we propose a method based on adversarial training to enhance the robustness of the model. This approach constructs adversarial samples and treats them as augmented data. Unlike previous methods that introduce token-level noise, our method introduces embedding-level noise and can thus obtain extra samples that are close to human writing errors. In addition, we employ an adversarial consistency constraint to reduce the gap between the adversarial sample and the original sample. The experimental results demonstrate that our method further boosts the performance of the pre-trained model on the GEC task.
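The embedding-level adversarial noise described in the abstract can be sketched in the style of fast-gradient-method (FGM) perturbation: take the gradient of the loss with respect to the input embedding, normalize it, and step a small distance in that direction to obtain a harder "augmented" sample. The toy example below is an illustrative reconstruction under assumed details (a linear model, squared-error loss, an L2-normalized step of size `epsilon`), not the authors' implementation; all names here are hypothetical.

```python
import numpy as np

def adversarial_perturbation(embedding, grad, epsilon=0.5):
    """FGM-style attack: move the embedding a step of size epsilon
    in the direction of the loss gradient (L2-normalized)."""
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return embedding  # no ascent direction; return unchanged
    return embedding + epsilon * grad / norm

# Toy setup: one token embedding fed to a linear "model" (assumed stand-in
# for the pre-trained encoder).
rng = np.random.default_rng(0)
emb = rng.normal(size=4)   # original embedding
w = rng.normal(size=4)     # model weights
y = 1.0                    # regression target

def loss_and_grad(e):
    # Squared-error loss and its gradient w.r.t. the embedding.
    pred = w @ e
    return (pred - y) ** 2, 2.0 * (pred - y) * w

loss, grad = loss_and_grad(emb)
adv_emb = adversarial_perturbation(emb, grad, epsilon=0.5)
adv_loss, _ = loss_and_grad(adv_emb)

# The adversarial sample is at least as hard as the original (loss does
# not decrease along the gradient direction for this model).
assert adv_loss >= loss
# A consistency constraint would additionally penalize divergence between
# the model's outputs on emb and adv_emb (e.g. KL divergence between the
# two output distributions); here, a squared difference as a placeholder.
consistency = (w @ adv_emb - w @ emb) ** 2
```

In an actual GEC model, `grad` would come from backpropagation through the encoder to the token-embedding layer, and both the adversarial loss and the consistency term would be added to the training objective.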
Pages: 67-78
Number of pages: 12
Related Papers
50 items in total
  • [1] Adversarial Grammatical Error Correction
    Raheja, Vipul
    Alikaniotis, Dimitris
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020,
  • [2] Data Weighted Training Strategies for Grammatical Error Correction
    Lichtarge, Jared
    Alberti, Chris
    Kumar, Shankar
    [J]. TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2020, 8 : 634 - 646
  • [3] Improving Grammatical Error Correction Models with Purpose-Built Adversarial Examples
    Wang, Lihao
    Zheng, Xiaoqing
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 2858 - 2869
  • [4] Efficient Grammatical Error Correction with Hierarchical Error Detections and Correction
    Pan, Fayu
    Cao, Bin
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON WEB SERVICES, ICWS 2021, 2021, : 525 - 530
  • [5] A Chinese Grammatical Error Correction Method Based on Iterative Training and Sequence Tagging
    Kuang, Hailan
    Wu, Kewen
    Ma, Xiaolin
    Liu, Xinhua
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (09):
  • [6] A Comprehensive Survey of Grammatical Error Correction
    Wang, Yu
    Wang, Yuelin
    Dang, Kai
    Liu, Jie
    Liu, Zhuo
    [J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2021, 12 (05)
  • [7] Towards Lithuanian Grammatical Error Correction
    Stankevicius, Lukas
    Lukosevicius, Mantas
    [J]. ARTIFICIAL INTELLIGENCE TRENDS IN SYSTEMS, VOL 2, 2022, 502 : 490 - 503
  • [8] Corpora Generation for Grammatical Error Correction
    Lichtarge, Jared
    Alberti, Chris
    Kumar, Shankar
    Shazeer, Noam
    Parmar, Niki
    Tong, Simon
    [J]. 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 3291 - 3301
  • [9] Spoken Language 'Grammatical Error Correction'
    Lu, Yiting
    Gales, Mark J. F.
    Wang, Yu
    [J]. INTERSPEECH 2020, 2020, : 3840 - 3844
  • [10] Grammatical Error Correction with Denoising Autoencoder
    Pajak, Krzysztof
    Gonczarek, Adam
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (08) : 821 - 826