Extracting Methodology Components from AI Research Papers: A Data-driven Factored Sequence Labeling Approach

Cited by: 1
Authors
Ghosh, Madhusudan [1]
Ganguly, Debasis [2]
Basuchowdhuri, Partha [1]
Naskar, Sudip Kumar [3]
Affiliations
[1] Indian Assoc Cultivat Sci, Kolkata, India
[2] Univ Glasgow, Glasgow, Scotland
[3] Jadavpur Univ, Kolkata, India
Keywords
Information Extraction; Factored Model; Clustering; Scientific Literature
DOI
10.1145/3583780.3615258
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Extraction of methodology component names from scientific articles is a challenging task due to the diversified contexts around the occurrences of these entities, and the different levels of granularity and containment relationships exhibited by these entities. We hypothesize that standard sequence labeling approaches may not adequately model the dependence of methodology name mentions on their contexts, due to the problems of their large, fast-evolving, and domain-specific vocabulary. As a solution, we propose a factored approach, where the mention-context dependencies are represented in a more fine-grained manner, thus allowing the model parameters to better adjust to the different characteristic patterns inherent within the data. In particular, we experiment with two variants of this factored approach: one that uses the per-entity category information derived from an ontology, and the other that makes use of the topology of the sentence embedding space to infer a category for each entity constituting that sentence. We demonstrate that both these factored variants of SciBERT outperform their non-factored counterpart, a state-of-the-art model for scientific concept extraction.
Pages: 3897-3901
Page count: 5