Chia, a large annotated corpus of clinical trial eligibility criteria

Cited by: 18
Authors
Kury, Fabricio [1]
Butler, Alex [1]
Yuan, Chi [1]
Fu, Li-heng [1]
Sun, Yingcheng [1]
Liu, Hao [1, 2]
Sim, Ida [3]
Carini, Simona [3]
Weng, Chunhua [1]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
[2] New Jersey Inst Technol, Newark, NJ 07102 USA
[3] Univ Calif San Francisco, San Francisco, CA 94143 USA
Keywords
REPRESENTATION; EXTRACTION
DOI
10.1038/s41597-020-00620-0
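Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract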
We present Chia, a novel, large annotated corpus of patient eligibility criteria extracted from 1,000 interventional, Phase IV clinical trials registered in ClinicalTrials.gov. This dataset includes 12,409 annotated eligibility criteria, represented by 41,487 distinctive entities of 15 entity types and 25,017 relationships of 12 relationship types. Each criterion is represented as a directed acyclic graph, which can be easily transformed into Boolean logic to form a database query. Chia can serve as a shared benchmark to develop and test future machine learning, rule-based, or hybrid methods for information extraction from free-text clinical trial eligibility criteria.
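The abstract's remark that a criterion graph can be turned into Boolean logic might be illustrated with the minimal sketch below. It does not use Chia's actual annotation format: the `Node` class, the `to_boolean` function, the operator labels, and the example criterion are all hypothetical and chosen only to show the idea of walking a graph and emitting a query-style expression.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical graph node: a leaf entity or an AND/OR/NOT operator."""
    label: str  # entity text for a leaf, or "AND" / "OR" / "NOT"
    children: list["Node"] = field(default_factory=list)


def to_boolean(node: Node) -> str:
    """Render a criterion graph as a parenthesized Boolean expression."""
    if not node.children:
        return node.label  # leaf: emit the entity text as-is
    parts = [to_boolean(child) for child in node.children]
    if node.label == "NOT":
        if len(parts) != 1:
            raise ValueError("NOT takes exactly one operand")
        return f"NOT ({parts[0]})"
    if node.label not in ("AND", "OR"):
        raise ValueError(f"unknown operator: {node.label}")
    return "(" + f" {node.label} ".join(parts) + ")"


# Invented example criterion (not taken from the corpus).
criterion = Node("AND", [
    Node("condition = 'type 2 diabetes'"),
    Node("OR", [Node("hba1c > 7.0"), Node("fasting_glucose > 126")]),
    Node("NOT", [Node("pregnant")]),
])
expression = to_boolean(criterion)
print(expression)
```

Because each node is rendered from its children only, the same function works when a node is shared by several parents, which is what distinguishes a directed acyclic graph from a tree. The resulting string could then be adapted to the syntax of a specific query language.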
Pages: 11
Journal
Scientific Data, 7