Information Extraction from Nanotoxicity Related Publications

Cited by: 0
Authors
Xiao, Lemin [1 ]
Tang, Kaizhi [1 ]
Liu, Xiong [1 ]
Yang, Hui [1 ]
Chen, Zheng [1 ]
Xu, Roger [1 ]
Affiliation
[1] Intelligent Automat Inc, Rockville, MD 20855 USA
Keywords
Nanoinformatics; information extraction; named entity recognition; relation extraction; nanotoxicity; data mining
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
High-quality experimental data are important when developing predictive models for studying nanomaterial environmental impact (NEI). Because raw data from experimental laboratories and manufacturing workplaces are usually proprietary and small in scale, extracting information from publications is an attractive alternative for collecting data. We developed an information extraction system that extracts useful information from full-text nanotoxicity-related publications. The system consists of five components: transformation of raw data into a machine-readable format, data preprocessing, ontology-based named entity recognition, rule-based numerical attribute extraction from both tables and unstructured text, and relation extraction among entities and attributes. The system was applied to a dataset of 94 publications and achieved acceptable accuracy. Storing the extracted data in a table according to the relations among them yields a dataset that can be used to predict nanomaterial environmental impact. Such a system is unique in the current nanomaterial community and can help nanomaterial scientists and practitioners quickly locate the information they need without spending a lot of time reading articles.
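As an illustration of the rule-based numerical attribute extraction step named in the abstract, the following is a minimal sketch only. The abstract does not give the system's actual rules, so the attribute keywords, the unit list, the 40-character gap limit, and the function name `extract_numeric_attributes` are all assumptions made for this example.

```python
import re

# Hypothetical rule: an attribute keyword, a short run of non-digit text,
# a number, and a unit. The keywords and units are illustrative assumptions,
# not the rules used by the system described in the abstract.
ATTRIBUTE_PATTERN = re.compile(
    r"(?P<attribute>particle size|diameter|zeta potential|concentration)"
    r"\D{0,40}?"                          # up to 40 non-digit characters
    r"(?P<value>[-+]?\d+(?:\.\d+)?)\s*"   # signed integer or decimal value
    r"(?P<unit>nm|mV|mg/L)",
    re.IGNORECASE,
)


def extract_numeric_attributes(sentence):
    """Return one dict per (attribute, value, unit) match found in the sentence."""
    return [
        {
            "attribute": match.group("attribute").lower(),
            "value": float(match.group("value")),
            "unit": match.group("unit"),
        }
        for match in ATTRIBUTE_PATTERN.finditer(sentence)
    ]


if __name__ == "__main__":
    text = ("The nanoparticles had an average diameter of 21 nm "
            "and a zeta potential of -30.5 mV.")
    for row in extract_numeric_attributes(text):
        print(row)
```

Each returned dict could serve as one row fragment in the kind of relational table the abstract mentions; linking a value to a specific nanomaterial entity would be the job of the separate relation extraction component.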
Pages: 6
Related papers
50 in total
  • [1] GROBID - Information Extraction from Scientific Publications
    Lopez, Patrice
    Romary, Laurent
    ERCIM NEWS, 2015, (100): : 41 - 42
  • [2] HyperPIE: Hyperparameter Information Extraction from Scientific Publications
    Saier, Tarek
    Ohta, Mayumi
    Asakura, Takuto
    Faerber, Michael
    ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT II, 2024, 14609 : 254 - 269
  • [3] Ontology-Driven Information Extraction from Research Publications
    Pertsas, Vayianos
    Constantopoulos, Panos
    DIGITAL LIBRARIES FOR OPEN KNOWLEDGE, TPDL 2018, 2018, 11057 : 241 - 253
  • [4] Automated Extraction of Symptoms related to Rare Diseases from Scientific Publications
    Cousyn, Charles
    Bouchard, Kevin
    Bouchard, Bruno
    Gaboury, Sebastien
    GOODTECHS '18: PROCEEDINGS OF THE 4TH EAI INTERNATIONAL CONFERENCE ON SMART OBJECTS AND TECHNOLOGIES FOR SOCIAL GOOD (GOODTECHS), 2018, : 13 - 18
  • [5] Information extraction: New developments in astronomical information retrieval for electronic publications
    Lesteven, S
    Bonnarel, F
    Dubois, P
    Egret, D
    Fernique, P
    Geneva, F
    Murtagh, F
    Ochsenbein, F
    Wenger, M
    LIBRARY AND INFORMATION SERVICES IN ASTRONOMY III (LISA III), 1998, 153 : 61 - 68
  • [6] Automated Extraction of Information From Texts of Scientific Publications: Insights Into HIV Treatment Strategies
    Biziukova, Nadezhda
    Tarasova, Olga
    Ivanov, Sergey
    Poroikov, Vladimir
    FRONTIERS IN GENETICS, 2020, 11
  • [7] Data Mining Approach for Extraction of Useful Information About Biologically Active Compounds from Publications
    Tarasova, Olga A.
    Biziukova, Nadezhda Yu
    Filimonov, Dmitry A.
    Poroikov, Vladimir V.
    Nicklaus, Marc C.
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2019, 59 (09) : 3635 - 3644
  • [8] Human Related Information Extraction from Chinese Archive Images
    Jin, Xin
    Yin, Hangbing
    Chen, Xiaoyu
    Bi, Huimin
    Xiao, Chaoen
    Liu, Yijian
    ARTIFICIAL INTELLIGENCE AND ROBOTICS, ISAIR 2023, 2024, 1998 : 139 - 146
  • [9] Key Relation Extraction from Biomedical Publications
    Huang, Lan
    Wang, Ye
    Gong, Leiguang
    Kulikowski, Casimir
    Bai, Tian
    MEDINFO 2017: PRECISION HEALTHCARE THROUGH INFORMATICS, 2017, 245 : 873 - 877
  • [10] Knowledge Extraction and Modeling from Scientific Publications
    Ronzano, Francesco
    Saggion, Horacio
    SEMANTICS, ANALYTICS, VISUALIZATION: ENHANCING SCHOLARLY DATA, SAVE-SD 2016, 2016, 9792 : 11 - 25