Language features in extractive summarization: Humans Vs. Machines

Cited by: 4
Authors
Arroyo-Fernandez, Ignacio [1]
Curiel, Arturo [2]
Mendez-Cruz, Carlos-Francisco [3]
Affiliations
[1] Univ Nacl Autonoma Mexico, Ciudad Univ, Mexico City, DF, Mexico
[2] Univ Veracruzana, CONACYT, Fac Estadist & Informat, Ave Xalapa Esq Manuel Avila Camacho S-N, Xalapa 91020, Veracruz, Mexico
[3] Univ Nacl Autonoma Mexico, Ctr Ciencias Genom, Ave Univ S-N, Cuernavaca 62100, Morelos, Mexico
Keywords
Automatic text summarization; Statistical feature analysis; Natural language processing; Artificial intelligence; Relevance criteria
DOI
10.1016/j.knosys.2019.05.014
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a comparative statistical analysis of the language features most commonly used for Automatic Text Summarization (ATS), namely: Parts of Speech (PoS) (unigrams and bigrams), sentiments (by token and sentence), and Rhetorical Structure Theory (RST) relations. The analyses were carried out on both human-made and machine-made summaries, in order to determine whether current ATS systems capture the same kind of information as humans do. Our results show that there are some marked differences between machine-made and human-made summaries, which at times may seem counterintuitive. For instance, named entities were usually frequent in machine-made summaries, but not in human-made ones. Similarly, words perceived to hold a "neutral" sentiment were systematically favored by machines, but not always by humans. (C) 2019 Elsevier B.V. All rights reserved.
Pages: 1-11
Number of pages: 11