Whois? Deep Author Name Disambiguation Using Bibliographic Data

Cited by: 7
Authors
Boukhers, Zeyd [1, 2]
Asundi, Nagaraj Bahubali [1]
Affiliations
[1] Univ Koblenz Landau, Inst Web Sci & Technol WeST, Koblenz, Germany
[2] Fraunhofer Inst Appl Informat Technol, St Augustin, Germany
Keywords
Author name disambiguation; Entity linkage; Bibliographic data; Neural networks; Classification;
DOI
10.1007/978-3-031-16802-4_16
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As the number of authors is increasing exponentially over the years, the number of authors sharing the same names is increasing proportionally. This makes it challenging to assign newly published papers to the appropriate authors. Therefore, Author Name Ambiguity (ANA) is considered a critical open problem in digital libraries. This paper proposes an Author Name Disambiguation (AND) approach that links author names to their real-world entities by leveraging their co-authors and domain of research. To this end, we use a collection from the DBLP repository that contains more than 5 million bibliographic records authored by around 2.6 million co-authors. Our approach first groups authors who share the same last names and same first name initials. The author within each group is identified by capturing the relation with his/her co-authors and area of research, which is represented by the titles of the validated publications of the corresponding author. For this purpose, we train a neural network model that learns from the representations of the co-authors and titles. We validated the effectiveness of our approach by conducting extensive experiments on a large dataset.
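The grouping step mentioned in the abstract (authors sharing the same last name and first-name initial) can be illustrated with a minimal sketch. It assumes names are written "First [Middle] Last"; the helper names `block_key` and `group_by_block` and the sample names are hypothetical and are not taken from the paper, whose actual name normalization may differ.

```python
from collections import defaultdict


def block_key(full_name: str) -> str:
    """Return a grouping key: first-name initial plus last name, lowercased.

    Assumption: the name is written 'First [Middle ...] Last'.
    """
    parts = full_name.strip().split()
    if not parts:
        return ""
    if len(parts) == 1:
        # Only one token is available, so use it as the whole key.
        return parts[0].lower()
    return f"{parts[0][0].lower()} {parts[-1].lower()}"


def group_by_block(names):
    """Group author name strings that share the same key."""
    blocks = defaultdict(list)
    for name in names:
        blocks[block_key(name)].append(name)
    return dict(blocks)


# Made-up sample names, for illustration only.
sample = ["Wei Wang", "Wen Wang", "Lei Zhao"]
blocks = group_by_block(sample)
```

Each resulting group is only a candidate set of possibly identical authors; according to the abstract, the identification within a group is done by a neural network trained on co-author and title representations, which this sketch does not attempt.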
Pages: 201-215
Number of pages: 15