Specialized Pre-Training of Neural Networks on Synthetic Data for Improving Paraphrase Generation

Times Cited: 0
Authors
Skurzhanskyi, O. H. [1 ]
Marchenko, O. O. [1 ]
Anisimov, A. V. [1 ]
Affiliations
[1] Taras Shevchenko Natl Univ Kyiv, Kiev, Ukraine
Keywords
artificial intelligence; machine learning; neural networks; paraphrase generation; pre-training; fine-tuning
DOI
10.1007/s10559-024-00658-7
CLC Number
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Paraphrase generation is a fundamental problem in natural language processing. Owing to the significant success of transfer learning, the "pre-training -> fine-tuning" approach has become the standard. However, popular general-purpose pre-training methods typically require extensive datasets and substantial computational resources, and the available pre-trained models are constrained to fixed architectures and sizes. The authors propose a simple and efficient pre-training procedure tailored specifically to paraphrase generation, which noticeably improves paraphrase quality and yields substantial gains over general-purpose models. The procedure uses existing public data together with new data generated by large language models. The authors investigate how this pre-training affects neural networks of various architectures and demonstrate its effectiveness across all of them.
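The record gives no implementation details beyond the abstract, but the two-stage recipe it describes can be sketched in code. The following is a minimal illustration, assuming a HuggingFace seq2seq model (t5-small), toy in-memory paraphrase pairs, and arbitrary hyperparameters; none of these choices come from the paper itself.

    # Minimal sketch of the "specialized pre-training -> fine-tuning" pipeline
    # described in the abstract. Model choice (t5-small), the toy data below,
    # and all hyperparameters are illustrative assumptions, not the authors'
    # actual setup.
    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained("t5-small")
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

    # Stage 1 data: synthetic paraphrase pairs, e.g. generated by an LLM.
    synthetic_pairs = [
        ("the cat sat on the mat", "a cat was sitting on the mat"),
        ("he quickly finished the report", "he completed the report in no time"),
    ]
    # Stage 2 data: a (typically much smaller) human-curated paraphrase corpus.
    curated_pairs = [
        ("how do I reset my password", "what is the way to change my password"),
    ]

    def train(pairs, epochs, lr):
        """One training stage: teacher-forced seq2seq loss on (source, target) pairs."""
        optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
        model.train()
        for _ in range(epochs):
            for src, tgt in pairs:
                batch = tokenizer("paraphrase: " + src, return_tensors="pt")
                labels = tokenizer(tgt, return_tensors="pt").input_ids
                loss = model(**batch, labels=labels).loss
                loss.backward()
                optimizer.step()
                optimizer.zero_grad()

    train(synthetic_pairs, epochs=1, lr=1e-4)  # specialized pre-training
    train(curated_pairs, epochs=1, lr=5e-5)    # fine-tuning on real data

The point of the split is that stage 1 consumes only cheap synthetic pairs, so the scarce curated corpus is reserved for the final fine-tuning stage; the same two-stage loop applies regardless of the underlying network architecture.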
Pages: 167-174
Page Count: 8
Related Papers
50 records in total
  • [1] Skurzhanskyi, O. H.; Marchenko, O. O.; Anisimov, A. V. Specialized Pre-Training of Neural Networks on Synthetic Data for Improving Paraphrase Generation. Cybernetics and Systems Analysis, 2024, 60: 167-174.
  • [2] Bui, Tien-Cuong; Le, Van-Duc; To, Hai-Thien; Cha, Sang Kyun. Generative Pre-training for Paraphrase Generation by Representing and Predicting Spans in Exemplars. 2021 IEEE International Conference on Big Data and Smart Computing (BigComp 2021), 2021: 83-90.
  • [3] Chen, Ke-Jia; Zhang, Jiajun; Jiang, Linpu; Wang, Yunyun; Dai, Yuxuan. Pre-training on Dynamic Graph Neural Networks. Neurocomputing, 2022, 500: 679-687.
  • [4] Grundkiewicz, Roman; Junczys-Dowmunt, Marcin; Heafield, Kenneth. Neural Grammatical Error Correction Systems with Unsupervised Pre-training on Synthetic Data. Innovative Use of NLP for Building Educational Applications, 2019: 252-263.
  • [5] He, Zexue; Blackwood, Graeme; Panda, Rameswar; McAuley, Julian; Feris, Rogerio. Synthetic Pre-Training Tasks for Neural Machine Translation. Findings of the Association for Computational Linguistics (ACL 2023), 2023: 8080-8098.
  • [6] Hou, Yupeng; Hu, Binbin; Zhao, Wayne Xin; Zhang, Zhiqiang; Zhou, Jun; Wen, Ji-Rong. Neural Graph Matching for Pre-training Graph Neural Networks. Proceedings of the 2022 SIAM International Conference on Data Mining (SDM), 2022: 172-180.
  • [7] Wiehman, Stiaan; Kroon, Steve; de Villiers, Hendrik. Unsupervised Pre-training for Fully Convolutional Neural Networks. 2016 Pattern Recognition Association of South Africa and Robotics and Mechatronics International Conference (PRASA-RobMech), 2016.
  • [8] Li, Xin; Wei, Hao; Ding, Yu. PHGNN: Pre-Training Heterogeneous Graph Neural Networks. IEEE Access, 2024, 12: 135411-135418.
  • [9] Gardner, John L. A.; Baker, Kathryn T.; Deringer, Volker L. Synthetic Pre-training for Neural-Network Interatomic Potentials. Machine Learning: Science and Technology, 2024, 5 (01).
  • [10] Salida, Pallabi; Vij, Prateek; Baruah, Rashmi Dutta. Unsupervised Pre-training on Improving the Performance of Neural Network in Regression. 2018 International Joint Conference on Neural Networks (IJCNN), 2018.