Identification and classification of research data cited in scholarly papers

Cited by: 0
Authors
Tsunokake M. [1]
Matsubara S. [2]
Affiliations
[1] Graduate School of Informatics, Nagoya University, Furo-cho, Chikusa-ku, Nagoya
[2] Information and Communications, Nagoya University, Furo-cho, Chikusa-ku, Nagoya
Keywords
Distributed representations; Information extraction; Open science; Repository; Research data; URL
DOI
10.1541/ieejeiss.140.1357
Abstract
This paper proposes a method for identifying and classifying the research data cited in scholarly papers, with the aim of automatically generating metadata for data repositories. The study focuses on URL citations in scholarly papers: the goals are to identify URLs that refer to research data and to classify them as tools or data. The method is formulated as multi-class classification (tool/data/others). It derives distributed representations of URLs from their surrounding context and uses them as input features, which has the advantage that the meaning of a URL is determined by the words around it. The meaning of an entire URL is computed from the meanings of its components. To evaluate the method, URL classification experiments were conducted on scholarly papers from the proceedings of an international conference. The results show that the proposed method is effective at identifying and classifying URLs that refer to research data. © 2020 The Institute of Electrical Engineers of Japan.
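The idea in the abstract, composing a URL representation from its components and feeding it to a tool/data/others classifier, can be illustrated with a toy sketch. This is not the authors' implementation: the vocabulary, the training examples, the bag-of-words context vectors, the averaging of component vectors, and the nearest-centroid classifier are all assumptions chosen to keep the example small and dependency-free.

```python
import re
from collections import defaultdict

# Hypothetical toy vocabulary of context words (an assumption for this sketch).
VOCAB = ["install", "toolkit", "download", "corpus", "homepage"]

# Hypothetical labelled examples: (URL, citation context, class).
TRAIN = [
    ("http://example.org/toolkit/parser", "we install the toolkit from", "tool"),
    ("http://example.org/corpus/news", "we download the corpus from", "data"),
    ("http://example.org/people/home", "see the author homepage at", "others"),
]


def url_components(url):
    """Split a URL into lowercase alphanumeric components."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


def context_vector(context):
    """Count how often each vocabulary word occurs in the context."""
    words = re.findall(r"[a-z]+", context.lower())
    return [float(words.count(w)) for w in VOCAB]


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def train_component_vectors(examples):
    """Represent each URL component by the mean of the contexts it occurs in."""
    by_component = defaultdict(list)
    for url, context, _label in examples:
        for comp in set(url_components(url)):
            by_component[comp].append(context_vector(context))
    return {comp: mean(vecs) for comp, vecs in by_component.items()}


def url_vector(url, comp_vecs):
    """Compose a URL vector as the mean of its known component vectors."""
    known = [comp_vecs[c] for c in url_components(url) if c in comp_vecs]
    return mean(known) if known else [0.0] * len(VOCAB)


def train_centroids(examples, comp_vecs):
    """Average the URL vectors of each class to get one centroid per class."""
    by_label = defaultdict(list)
    for url, _context, label in examples:
        by_label[label].append(url_vector(url, comp_vecs))
    return {label: mean(vecs) for label, vecs in by_label.items()}


def classify(url, comp_vecs, centroids):
    """Return the class whose centroid is nearest (squared Euclidean)."""
    vec = url_vector(url, comp_vecs)

    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(vec, centroids[label]))

    return min(centroids, key=sq_dist)


comp_vecs = train_component_vectors(TRAIN)
centroids = train_centroids(TRAIN, comp_vecs)
print(classify("http://example.org/toolkit/tagger", comp_vecs, centroids))
```

Because the URL vector is built from component vectors, an unseen URL such as the one in the last line can still be classified as long as some of its components appeared in training. A real system would presumably replace the count vectors with learned embeddings and the centroid rule with a trained classifier.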
Pages: 1357 - 1364
Page count: 7
Related papers
50 in total
  • [1] Venue Classification of Research Papers in Scholarly Digital Libraries
    Caragea, Cornelia
    Florescu, Corina
    DIGITAL LIBRARIES FOR OPEN KNOWLEDGE, TPDL 2018, 2018, 11057 : 129 - 136
  • [2] How research data is cited in scholarly literature: A case study of HINTS
    Yoon, JungWon
    Chung, EunKyung
    Lee, Jae Yun
    Kim, Jihyun
    LEARNED PUBLISHING, 2019, 32 (03) : 199 - 206
  • [3] Identification of Scholarly Papers and Authors
    Baba, Kensuke
    Mori, Masao
    Ito, Eisuke
    NETWORKED DIGITAL TECHNOLOGIES, 2011, 136 : 195 - 202
  • [4] Using Citation Contexts in Scholarly Papers for Research Data Search
    Tsunokake, Masaya
    Matsubara, Shigeki
    16TH INTERNATIONAL JOINT SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE PROCESSING (ISAI-NLP 2021), 2021,
  • [6] SOURCES OF LITERATURE CITED IN WILDLIFE RESEARCH PAPERS
    HEIN, D
    JOURNAL OF WILDLIFE MANAGEMENT, 1967, 31 (03) : 598 - &
  • [7] The Most Cited Papers in Osteoporosis and Related Research
    Holzer, Lukas A.
    Leithner, Andreas
    Holzer, Gerold
    JOURNAL OF OSTEOPOROSIS, 2015, 2015
  • [8] Identification of the most important external features of highly cited scholarly papers through 3 (i.e., Ridge, Lasso, and Boruta) feature selection data mining methods
    Fahimifar S.
    Mousavi K.
    Mozaffari F.
    Ausloos M.
    Quality & Quantity, 2023, 57 (4) : 3685 - 3712
  • [9] Highly cited papers in rheumatology: identification and conceptual analysis
    Veronica Perez-Cabezas
    Carmen Ruiz-Molinero
    Ines Carmona-Barrientos
    Enrique Herrera-Viedma
    Manuel J. Cobo
    Jose A. Moral-Munoz
    Scientometrics, 2018, 116 : 555 - 568
  • [10] Achieving human and machine accessibility of cited data in scholarly publications
    Starr, Joan
    Castro, Eleni
    Crosas, Merce
    Dumontier, Michel
    Downs, Robert R.
    Duerr, Ruth
    Haak, Laurel L.
    Haendel, Melissa
    Herman, Ivan
    Hodson, Simon
    Hourcle, Joe
    Kratz, John Ernest
    Lin, Jennifer
    Nielsen, Lars Holm
    Nurnberger, Amy
    Proell, Stefan
    Rauber, Andreas
    Sacchi, Simone
    Smith, Arthur
    Taylor, Mike
    Clark, Tim
    PEERJ COMPUTER SCIENCE, 2015,