Comparison of Full-text Articles and Abstracts for Visual Trend Analytics through Natural Language Processing

Cited by: 4
Authors
Nazemi, Kawa [1]
Klepsch, Maike J. [1]
Burkhardt, Dirk [1]
Kaupp, Lukas [1]
Affiliation
[1] Darmstadt Univ Appl Sci, Human Comp Interact & Visual Analyt, Darmstadt, Germany
Keywords
Visual Analytics; Data Science; Natural Language Processing; Visual Trend Analytics; Analyzing Technological Trends
DOI
10.1109/IV51561.2020.00065
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Scientific publications are an essential resource for detecting emerging trends and innovations at a very early stage, far earlier than patents allow. Visual Analytics systems enable in-depth analysis of such publications by applying machine learning methods, commonly unsupervised ones, to large amounts of data. A central question from the Visual Analytics viewpoint is whether the abstracts of scientific publications provide an analysis capability similar to that of their corresponding full texts. If so, large numbers of text documents could be processed much faster. In this paper, we compare the topic extraction methods Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) on full-text articles and their corresponding abstracts to determine which method and which data are better suited for a Visual Analytics system for Technology and Corporate Foresight. Based on an easily replicable natural language processing approach, we further investigate the impact of lemmatization on LDA and LSI. The comparison is performed both qualitatively and quantitatively to capture human perception in visual systems as well as coherence values. An application scenario with a visual trend analytics system illustrates the outcomes.
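As a rough illustration of the quantitative side of such a comparison, the sketch below scores one topic's top words against an abstract corpus and a full-text corpus, with and without lemmatization. It rests on several assumptions that are not taken from the paper: the record does not say which coherence measure was used, so UMass coherence is assumed here; the lemma table, the two miniature corpora, and the topic's top words are hypothetical placeholders; and the topic itself would in practice come from a fitted LSI or LDA model rather than being written by hand.

```python
import math
from itertools import combinations

# Hypothetical toy lemma table; a real pipeline would use a proper lemmatizer.
LEMMAS = {"models": "model", "topics": "topic", "trends": "trend"}


def preprocess(text, lemmatize=True):
    """Lowercase, split on whitespace, and optionally map tokens to lemmas."""
    tokens = text.lower().split()
    return [LEMMAS.get(t, t) for t in tokens] if lemmatize else tokens


def umass_coherence(top_words, documents):
    """UMass coherence of one topic (assumed measure, not the paper's).

    top_words: words ordered from most to least probable in the topic.
    documents: list of token lists.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for l, m in combinations(range(len(top_words)), 2):
        earlier, later = top_words[l], top_words[m]
        d_earlier = doc_freq(earlier)
        if d_earlier == 0:
            continue  # word absent from this corpus, pair cannot be scored
        score += math.log((doc_freq(later, earlier) + 1) / d_earlier)
    return score


# Hypothetical miniature corpora standing in for abstracts and full texts.
abstracts = ["topic models reveal trends", "topic model for patents"]
full_texts = [
    "topic models reveal trends in patents and topic model output",
    "topic model for patents with trend signals",
]
top_words = ["topic", "model", "trend"]  # assumed top words of one topic

for name, corpus in [("abstracts", abstracts), ("full texts", full_texts)]:
    for lemmatize in (False, True):
        docs = [preprocess(text, lemmatize) for text in corpus]
        label = "lemmatized" if lemmatize else "raw"
        print(name, label, round(umass_coherence(top_words, docs), 3))
```

Scores closer to zero indicate that the top words co-occur more often in the corpus. Running the same topic against both corpora, and against raw and lemmatized tokens, gives the kind of side-by-side numbers that a quantitative comparison of abstracts and full texts would rely on.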
Pages: 360-367
Page count: 8