Coreference analysis in clinical notes: a multi-pass sieve with alternate anaphora resolution modules

Cited by: 19
Authors
Jonnalagadda, Siddhartha Reddy [1]
Li, Dingcheng [1]
Sohn, Sunghwan [1]
Wu, Stephen Tze-Inn [1]
Wagholikar, Kavishwar [1]
Torii, Manabu [2]
Liu, Hongfang [1]
Affiliations
[1] Mayo Clin, Dept Hlth Sci Res, Rochester, MN USA
[2] Univ Delaware, Dept Comp & Informat Sci, Newark, DE 19716 USA
Funding
US National Science Foundation;
Keywords
EXTRACTION;
DOI
10.1136/amiajnl-2011-000766
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Objective: This paper describes the coreference resolution system submitted by Mayo Clinic for the 2011 i2b2/VA/Cincinnati shared task Track 1C. The goal of the task was to construct a system that links the markables corresponding to the same entity.
Materials and methods: The task organizers provided progress notes and discharge summaries annotated with markables of five types: treatment, problem, test, person, and pronoun. We used a multi-pass sieve algorithm that applies deterministic rules in decreasing order of precision while simultaneously gathering information about the entities in the documents. Our system, MedCoref, also uses a state-of-the-art machine learning framework as an alternative to the final, rule-based pronoun resolution sieve.
Results: The best system that uses a multi-pass sieve has an overall score (the average of the B³, MUC, BLANC, and CEAF F scores) of 0.836 on the training set and 0.843 on the test set.
Discussion: A supervised machine learning system, which typically uses a single function to find coreferents, cannot accommodate irregularities in the data, especially when the number of training examples is insufficient. On the other hand, a completely deterministic system can lose recall (sensitivity) when its rules are not exhaustive. The sieve-based framework allows one to combine reliable machine learning components with rules designed by experts.
Conclusion: An effective coreference resolution system can be designed using relatively simple rules, part-of-speech information, and semantic type properties. The source code of the system described is available at https://sourceforge.net/projects/ohnlp/files/MedCoref.
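The multi-pass sieve idea summarized in the abstract can be illustrated with a minimal Python sketch. This is not the MedCoref implementation: the `Mention` class, the `resolve` function, and the three sieves (`exact_match`, `head_match`, `pronoun_to_person`) are hypothetical names and simplified rules chosen only to show how deterministic passes, ordered from most to least precise, progressively merge mentions into entity clusters. It does not model the entity information that the real system gathers between passes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Mention:
    idx: int        # position in document order (assumed equal to list index)
    text: str
    sem_type: str   # treatment, problem, test, person, or pronoun


# A sieve decides whether a mention should be linked to a candidate antecedent.
Sieve = Callable[[Mention, Mention], bool]


def exact_match(m: Mention, ante: Mention) -> bool:
    """Hypothetical high-precision rule: same type, same string (case-insensitive)."""
    if m.sem_type == "pronoun" or m.sem_type != ante.sem_type:
        return False
    return m.text.lower() == ante.text.lower()


def head_match(m: Mention, ante: Mention) -> bool:
    """Hypothetical looser rule: same type and same last token (a crude head word)."""
    if m.sem_type == "pronoun" or m.sem_type != ante.sem_type:
        return False
    return m.text.lower().split()[-1] == ante.text.lower().split()[-1]


def pronoun_to_person(m: Mention, ante: Mention) -> bool:
    """Hypothetical final rule: link a pronoun to a preceding person mention."""
    return m.sem_type == "pronoun" and ante.sem_type == "person"


def resolve(mentions: List[Mention], sieves: List[Sieve]) -> List[List[int]]:
    """Apply each sieve in order; return entity clusters as lists of mention indices."""
    parent = list(range(len(mentions)))  # union-find forest over mention indices

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for sieve in sieves:                   # one pass per sieve, most precise first
        for m in mentions:
            # Scan candidate antecedents from nearest to farthest.
            for ante in reversed(mentions[:m.idx]):
                if sieve(m, ante):
                    ri, rj = find(m.idx), find(ante.idx)
                    if ri != rj:
                        parent[max(ri, rj)] = min(ri, rj)
                    break                  # keep only the nearest accepted antecedent

    groups: Dict[int, List[int]] = {}
    for m in mentions:
        groups.setdefault(find(m.idx), []).append(m.idx)
    return sorted(groups.values())


mentions = [
    Mention(0, "the patient", "person"),
    Mention(1, "chest pain", "problem"),
    Mention(2, "Chest pain", "problem"),
    Mention(3, "severe pain", "problem"),
    Mention(4, "she", "pronoun"),
]

print(resolve(mentions, [exact_match, head_match, pronoun_to_person]))
# prints [[0, 4], [1, 2, 3]]
```

Because each sieve is just a function over a mention and a candidate antecedent, the last entry in the list could be replaced by a wrapper around a trained classifier, which mirrors in spirit the alternative pronoun resolution module mentioned in the abstract. The crude `head_match` rule also links "severe pain" to "chest pain", which shows why looser rules are placed after stricter ones.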
Pages: 867-874
Page count: 8