Specialized Pre-Training of Neural Networks on Synthetic Data for Improving Paraphrase Generation

Times Cited: 0
Authors
Skurzhanskyi, O. H. [1 ]
Marchenko, O. O. [1 ]
Anisimov, A. V. [1 ]
Affiliations
[1] Taras Shevchenko Natl Univ Kyiv, Kiev, Ukraine
Keywords
artificial intelligence; machine learning; neural networks; paraphrase generation; pre-training; fine-tuning
DOI
10.1007/s10559-024-00658-7
CLC Number
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Paraphrase generation is a fundamental problem in natural language processing. Owing to the significant success of transfer learning, the "pre-training -> fine-tuning" approach has become the standard. However, popular general-purpose pre-training methods typically require extensive datasets and substantial computational resources, and the available pre-trained models are constrained to fixed architectures and sizes. The authors propose a simple and efficient pre-training procedure tailored specifically to paraphrase generation, which noticeably improves paraphrase quality and yields substantial gains over general-purpose models. The procedure uses existing public data together with new data generated by large language models. The authors investigate how this pre-training affects neural networks of various architectures and demonstrate its effectiveness across all of them.
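The record gives no implementation details beyond the abstract, but the two-stage recipe it describes can be sketched in code. The following is a minimal illustration, assuming a HuggingFace seq2seq model (t5-small), toy in-memory paraphrase pairs, and arbitrary hyperparameters; none of these choices come from the paper itself.

    # Minimal sketch of the "specialized pre-training -> fine-tuning" pipeline
    # described in the abstract. Model choice (t5-small), the toy data below,
    # and all hyperparameters are illustrative assumptions, not the authors'
    # actual setup.
    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained("t5-small")
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

    # Stage 1 data: synthetic paraphrase pairs, e.g. generated by an LLM.
    synthetic_pairs = [
        ("the cat sat on the mat", "a cat was sitting on the mat"),
        ("he quickly finished the report", "he completed the report in no time"),
    ]
    # Stage 2 data: a (typically much smaller) human-curated paraphrase corpus.
    curated_pairs = [
        ("how do I reset my password", "what is the way to change my password"),
    ]

    def train(pairs, epochs, lr):
        """One training stage: teacher-forced seq2seq loss on (source, target) pairs."""
        optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
        model.train()
        for _ in range(epochs):
            for src, tgt in pairs:
                batch = tokenizer("paraphrase: " + src, return_tensors="pt")
                labels = tokenizer(tgt, return_tensors="pt").input_ids
                loss = model(**batch, labels=labels).loss
                loss.backward()
                optimizer.step()
                optimizer.zero_grad()

    train(synthetic_pairs, epochs=1, lr=1e-4)  # specialized pre-training
    train(curated_pairs, epochs=1, lr=5e-5)    # fine-tuning on real data

The point of the split is that stage 1 consumes only cheap synthetic pairs, so the scarce curated corpus is reserved for the final fine-tuning stage; the same two-stage loop applies regardless of the underlying network architecture.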
Pages: 167-174
Page Count: 8
Related Papers
50 records in total
  • [1] Skurzhanskyi, O. H.; Marchenko, O. O.; Anisimov, A. V. Specialized Pre-Training of Neural Networks on Synthetic Data for Improving Paraphrase Generation. Cybernetics and Systems Analysis, 2024, 60: 167-174.
  • [2] Bui, Tien-Cuong; Le, Van-Duc; To, Hai-Thien; Cha, Sang Kyun. Generative Pre-training for Paraphrase Generation by Representing and Predicting Spans in Exemplars. 2021 IEEE International Conference on Big Data and Smart Computing (BigComp 2021), 2021: 83-90.
  • [3] Chen, Ke-Jia; Zhang, Jiajun; Jiang, Linpu; Wang, Yunyun; Dai, Yuxuan. Pre-training on Dynamic Graph Neural Networks. Neurocomputing, 2022, 500: 679-687.
  • [4] Grundkiewicz, Roman; Junczys-Dowmunt, Marcin; Heafield, Kenneth. Neural Grammatical Error Correction Systems with Unsupervised Pre-training on Synthetic Data. Innovative Use of NLP for Building Educational Applications, 2019: 252-263.
  • [5] He, Zexue; Blackwood, Graeme; Panda, Rameswar; McAuley, Julian; Feris, Rogerio. Synthetic Pre-Training Tasks for Neural Machine Translation. Findings of the Association for Computational Linguistics (ACL 2023), 2023: 8080-8098.
  • [6] Hou, Yupeng; Hu, Binbin; Zhao, Wayne Xin; Zhang, Zhiqiang; Zhou, Jun; Wen, Ji-Rong. Neural Graph Matching for Pre-training Graph Neural Networks. Proceedings of the 2022 SIAM International Conference on Data Mining (SDM), 2022: 172-180.
  • [7] Wiehman, Stiaan; Kroon, Steve; de Villiers, Hendrik. Unsupervised Pre-training for Fully Convolutional Neural Networks. 2016 Pattern Recognition Association of South Africa and Robotics and Mechatronics International Conference (PRASA-RobMech), 2016.
  • [8] Li, Xin; Wei, Hao; Ding, Yu. PHGNN: Pre-Training Heterogeneous Graph Neural Networks. IEEE Access, 2024, 12: 135411-135418.
  • [9] Gardner, John L. A.; Baker, Kathryn T.; Deringer, Volker L. Synthetic Pre-training for Neural-Network Interatomic Potentials. Machine Learning: Science and Technology, 2024, 5 (01).
  • [10] Salida, Pallabi; Vij, Prateek; Baruah, Rashmi Dutta. Unsupervised Pre-training on Improving the Performance of Neural Network in Regression. 2018 International Joint Conference on Neural Networks (IJCNN), 2018.