Deep-Learning-Based Pre-Training and Refined Tuning for Web Summarization Software

Cited by: 0
Authors
Liu, Mingyue [1 ]
Ma, Zhe [2 ]
Li, Jiale [3 ]
Wu, Ying Cheng [4 ]
Wang, Xukang [5 ]
Affiliations
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14850 USA
[2] Univ Southern Calif, Ming Hsieh Dept Elect & Comp Engn, Los Angeles, CA 90007 USA
[3] NYU, Tandon Sch Engn, New York, NY 10012 USA
[4] Univ Washington, Sch Law, Seattle, WA 98195 USA
[5] Sage IT Consulting Grp, Shanghai 200060, Peoples R China
Source
IEEE ACCESS | 2024, Vol. 12
Keywords
Pre-training; deep learning; web information extraction
DOI
10.1109/ACCESS.2024.3423662
CLC number: TP [automation technology; computer technology]
Discipline code: 0812
Abstract
In the digital age, the rapid growth of web information has made it increasingly challenging for individuals and organizations to effectively explore and extract valuable insights from the vast amount of information available. This paper presents a novel approach to automated web text summarization that combines advanced natural language processing techniques with recent breakthroughs in deep learning. We propose a dual-faceted technique that leverages extensive pre-training on a broad out-of-domain dataset, followed by a unique refined tuning process. We introduce a carefully curated dataset that captures the heterogeneous nature of web articles and propose an innovative pre-training and tuning approach that establishes a new state of the art in news summarization. Through extensive experiments and rigorous comparisons against existing models, we demonstrate the superiority of our method, particularly highlighting the crucial role of the refined tuning process in achieving these results.
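The abstract outlines a two-stage recipe: broad out-of-domain pre-training, followed by refined tuning on web-article summarization pairs. Since the paper's own code is not part of this record, the following is a minimal Python sketch of that shape only, assuming a Hugging Face BART checkpoint as a stand-in for the pre-trained backbone; the model name, toy data, and hyperparameters are illustrative assumptions, not the authors' settings, and the paper's "refined tuning" presumably goes beyond this vanilla fine-tuning loop.

# Minimal sketch of the two-stage recipe described in the abstract.
# Stage 1 (broad pre-training) is approximated by loading an already
# pre-trained seq2seq checkpoint; Stage 2 is a small fine-tuning loop
# on (web article, summary) pairs. All names and settings are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "facebook/bart-base"  # assumed stand-in; not the paper's backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# One illustrative (article, summary) pair; the paper uses a curated web dataset.
article = ("Researchers released a dataset of heterogeneous web articles and "
           "showed that tuning a pre-trained model on it improves summaries.")
summary = "A new web-article dataset improves summarization after tuning."

inputs = tokenizer(article, truncation=True, max_length=512, return_tensors="pt")
labels = tokenizer(summary, truncation=True, max_length=64, return_tensors="pt").input_ids

# "Refined tuning" approximated as standard cross-entropy fine-tuning,
# shown here for a few steps on the single pair above.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
model.train()
for _ in range(3):
    loss = model(**inputs, labels=labels).loss  # token-level cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Inference: beam-search decoding of a summary for the same article.
model.eval()
with torch.no_grad():
    ids = model.generate(**inputs, num_beams=4, max_new_tokens=64)
print(tokenizer.decode(ids[0], skip_special_tokens=True))

In practice the second stage would run over the full curated dataset with a held-out split and summarization metrics such as ROUGE; the sketch only fixes the overall pre-train-then-tune structure.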
Pages: 92120-92129
Page count: 10