Information Extraction from Nanotoxicity Related Publications

Authors
Xiao, Lemin [1]
Tang, Kaizhi [1]
Liu, Xiong [1]
Yang, Hui [1]
Chen, Zheng [1]
Xu, Roger [1]
Affiliation
[1] Intelligent Automation, Inc., Rockville, MD 20855, USA
Keywords
Nanoinformatics; information extraction; named entity recognition; relation extraction; nanotoxicity; data mining
DOI
Not available
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
High-quality experimental data are important for developing predictive models of nanomaterial environmental impact (NEI). Because raw data from experimental laboratories and manufacturing workplaces are usually proprietary and small in scale, extracting information from publications is an attractive alternative for collecting data. We developed an information extraction system that extracts useful information from full-text nanotoxicity-related publications. The system consists of five components: transformation of raw data into a machine-readable format, data preprocessing, ontology-based named entity recognition, rule-based numerical attribute extraction from both tables and unstructured text, and relation extraction among entities and attributes. Applied to a dataset of 94 publications, the system achieves acceptable accuracy. Storing the extracted data in a table according to the relations among them yields a dataset that can be used to predict nanomaterial environmental impact. Such a system is unique in the current nanomaterial community and can help nanomaterial scientists and practitioners quickly locate the information they need without spending a lot of time reading articles.
Pages: 6
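The abstract names ontology-based named entity recognition, rule-based numerical attribute extraction, and relation extraction as three of the system's components. The sketch below is only an illustration of how such steps might fit together on a single sentence; it is not the authors' implementation. The tiny `ONTOLOGY` dictionary, the `NUMERIC_RULE` pattern, the unit list, and the nearest-entity pairing heuristic are all assumptions made for this example.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-ontology (assumption): surface form -> entity type.
# A real system would load its terms from a nanomaterial ontology.
ONTOLOGY = {
    "titanium dioxide": "Nanomaterial",
    "tio2": "Nanomaterial",
    "silver nanoparticles": "Nanomaterial",
    "zebrafish": "Organism",
    "daphnia magna": "Organism",
}

# Hypothetical rule (assumption): a number followed by one of a few units.
NUMERIC_RULE = re.compile(r"(\d+(?:\.\d+)?)\s*(nm|mg/L|mV)\b")


@dataclass(frozen=True)
class Mention:
    text: str    # surface text of the mention
    label: str   # entity type, or "Attribute" for numeric values
    start: int   # character offset in the sentence


def find_entities(sentence):
    """Dictionary lookup of ontology terms, case-insensitive."""
    lowered = sentence.lower()
    mentions = []
    for term, label in ONTOLOGY.items():
        for m in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            mentions.append(Mention(sentence[m.start():m.end()], label, m.start()))
    return sorted(mentions, key=lambda x: x.start)


def find_attributes(sentence):
    """Apply the numeric rule to pull out value-plus-unit mentions."""
    return [
        Mention(f"{m.group(1)} {m.group(2)}", "Attribute", m.start())
        for m in NUMERIC_RULE.finditer(sentence)
    ]


def relate(entities, attributes):
    """Naive relation step: pair each attribute with the nearest entity."""
    pairs = []
    for attr in attributes:
        if entities:
            nearest = min(entities, key=lambda e: abs(e.start - attr.start))
            pairs.append((nearest.text, attr.text))
    return pairs


def extract(sentence):
    """Run entity lookup, attribute extraction, and pairing on one sentence."""
    return relate(find_entities(sentence), find_attributes(sentence))


if __name__ == "__main__":
    print(extract("TiO2 particles of 25 nm were tested on zebrafish at 10 mg/L."))
    # prints [('TiO2', '25 nm'), ('zebrafish', '10 mg/L')]
```

Pairing by character distance is the simplest possible stand-in for relation extraction; each resulting pair could become one row of the kind of table the abstract describes, but the paper's actual rules and ontology are not given in this record.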