A metadata framework for computational phenotypes

Cited by: 4
Authors
Spotnitz, Matthew [1, 16]
Acharya, Nripendra [1]
Cimino, James J. [2]
Murphy, Shawn [3, 4]
Namjou, Bahram [5]
Crimmins, Nancy [5]
Walunas, Theresa [6]
Liu, Cong [1]
Crosslin, David [7]
Benoit, Barbara [8]
Rosenthal, Elisabeth [9]
Pacheco, Jennifer A. [10]
Ostropolets, Anna [1]
Reyes Nieva, Harry [1]
Patterson, Jason S. [1]
Richter, Lauren R. [1]
Callahan, Tiffany J. [1]
Elhussein, Ahmed [1]
Pang, Chao [1]
Kiryluk, Krzysztof [11]
Nestor, Jordan [11]
Khan, Atlas [11]
Mohan, Sumit [11, 12]
Minty, Evan [13]
Chung, Wendy [14]
Wei, Wei-Qi [15]
Natarajan, Karthik [1]
Weng, Chunhua [1]
Affiliations
[1] Columbia Univ, Irving Med Ctr, Vagelos Coll Phys & Surg, Dept Biomed Informat, New York, NY USA
[2] Univ Alabama Birmingham, Heersink Sch Med, Informat Inst, Birmingham, AL USA
[3] Mass Gen Brigham, Lab Comp Sci, Boston, MA USA
[4] Mass Gen Brigham, Dept Neurol, Boston, MA USA
[5] Cincinnati Childrens Hosp Med Ctr, Dept Pediat, Cincinnati, OH USA
[6] Northwestern Univ, Feinberg Sch Med, Dept Med, Chicago, IL USA
[7] Tulane Univ, Sch Med, Div Biomed Informat & Genom, New Orleans, LA USA
[8] Mass Gen Brigham, Dept Res Informat Sci & Comp, Boston, MA USA
[9] Univ Washington, Div Genet, Seattle, WA USA
[10] Northwestern Univ, Ctr Genet Med, Chicago, IL USA
[11] Columbia Univ, Irving Med Ctr, Vagelos Coll Phys & Surg, Dept Med, Div Nephrol, New York, NY USA
[12] Columbia Univ, Mailman Sch Publ Hlth, Dept Epidemiol, New York, NY USA
[13] Univ Calgary, Dept Med, Calgary, AB, Canada
[14] Columbia Univ, Irving Med Ctr, Vagelos Coll Phys & Surg, Dept Pediat, New York, NY USA
[15] Vanderbilt Univ, Dept Biomed Informat, Nashville, TN USA
[16] Columbia Univ, Irving Med Ctr, Dept Biomed Informat, 630 W 168th St, New York, NY 10032 USA
Keywords
electronic health records; phenotype; metadata; ALGORITHMS; RECORDS;
DOI
10.1093/jamiaopen/ooad032
Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)];
Abstract
Lay Summary: Computational phenotypes are essential to scale precision medicine and observational healthcare research. More comprehensive and explicitly defined phenotype metadata could improve phenotype retrieval, reuse, and sharing. However, few studies have focused directly on phenotype metadata explicitness or validation methods and metrics. We designed a phenotype metadata framework as part of ongoing research with the Electronic Medical Records and Genomics (eMERGE) network phenotyping working group. We identified 39 metadata elements based on group consensus. We distributed a survey to 47 new researchers, who rated the usefulness of each metadata element on a scale of 1-5, and we conducted a thematic analysis of the responses to the free-text survey questions. Two researchers annotated 8 type-2 diabetes mellitus phenotypes with the framework. More than 90% of respondents assigned a rating of 4-5 to metadata framework elements regarding phenotype definition and validation metrics. In our thematic analysis, explicit descriptions, compliance with data standards, and comprehensive validation methods were strengths of the framework. Using a mixed-methods approach, we have developed a comprehensive framework for defining computational clinical phenotypes. Use of this framework may help curate patient data used for both observational and prospective healthcare research.

With the burgeoning development of computational phenotypes, it is increasingly difficult to identify the right phenotype for the right task. This study uses a mixed-methods approach to develop and evaluate a novel metadata framework for retrieving and reusing computational phenotypes. Twenty active phenotyping researchers from 2 large research networks, Electronic Medical Records and Genomics (eMERGE) and Observational Health Data Sciences and Informatics (OHDSI), were recruited to suggest metadata elements. Once consensus was reached on 39 metadata elements, 47 new researchers were surveyed to evaluate the utility of the metadata framework. The survey consisted of 5-point Likert-scale multiple-choice questions and open-ended questions. Two more researchers were asked to use the metadata framework to annotate 8 type-2 diabetes mellitus phenotypes. More than 90% of the survey respondents rated metadata elements regarding phenotype definition and validation methods and metrics positively, with a score of 4 or 5. Both researchers completed annotation of each phenotype within 60 min. Our thematic analysis of the narrative feedback indicates that the metadata framework was effective in capturing rich and explicit descriptions, enabling the search for phenotypes, supporting compliance with data standards, and recording comprehensive validation metrics. Current limitations were the complexity of data collection and the human cost it entails.
Pages: 6