PediCXR: An open, large-scale chest radiograph dataset for interpretation of common thoracic diseases in children

Authors
Hieu H. Pham
Ngoc H. Nguyen
Thanh T. Tran
Tuan N. M. Nguyen
Ha Q. Nguyen
Affiliations
[1] Smart Health Center
[2] VinBigData JSC
[3] College of Engineering & Computer Science
[4] VinUniversity
[5] VinUni-Illinois Smart Health Center
[6] Phu Tho Department of Health
[7] Training and Direction of Healthcare Activities Center
[8] Phu Tho General Hospital
Abstract
Computer-aided diagnosis systems in adult chest radiography (CXR) have recently achieved great success thanks to the availability of large-scale, annotated datasets and the advent of high-performance supervised learning algorithms. However, the development of diagnostic models for detecting and diagnosing pediatric diseases in CXR scans remains limited due to the lack of high-quality physician-annotated datasets. To overcome this challenge, we introduce and release PediCXR, a new pediatric CXR dataset of 9,125 studies retrospectively collected from a major pediatric hospital in Vietnam between 2020 and 2021. Each scan was manually annotated by a pediatric radiologist with more than ten years of experience. The dataset was labeled for the presence of 36 critical findings and 15 diseases. In particular, each abnormal finding was localized with a rectangular bounding box on the image. To the best of our knowledge, this is the first and largest pediatric CXR dataset containing lesion-level annotations and image-level labels for the detection of multiple findings and diseases. For algorithm development, the dataset was divided into a training set of 7,728 studies and a test set of 1,397 studies. To encourage new advances in pediatric CXR interpretation using data-driven approaches, we provide a detailed description of the PediCXR data sample and make the dataset publicly available at https://physionet.org/content/vindr-pcxr/1.0.0/.
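The two annotation levels and the split sizes described in the abstract can be pictured with a minimal Python sketch. The class names, field names, and placeholder values below are hypothetical and chosen only for illustration; they are not the released file schema, which is documented on the PhysioNet page. Only the three study counts come from the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Study counts quoted in the abstract.
TOTAL_STUDIES = 9125
TRAIN_STUDIES = 7728
TEST_STUDIES = 1397


@dataclass
class BoxAnnotation:
    """One lesion-level finding marked by a rectangular bounding box.

    Hypothetical layout for illustration, not the dataset's actual schema.
    """
    finding: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        # Clamp at zero so a malformed box never yields a negative area.
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


@dataclass
class StudyRecord:
    """One study: image-level disease labels plus lesion-level boxes."""
    image_id: str
    diseases: List[str] = field(default_factory=list)
    boxes: List[BoxAnnotation] = field(default_factory=list)


def split_fractions(train: int, test: int) -> Tuple[float, float]:
    """Return the (train, test) shares of the combined set."""
    total = train + test
    return train / total, test / total


# The two splits account for every study in the dataset.
assert TRAIN_STUDIES + TEST_STUDIES == TOTAL_STUDIES

train_frac, test_frac = split_fractions(TRAIN_STUDIES, TEST_STUDIES)

# Placeholder record; the label strings are not real dataset labels.
example = StudyRecord(
    image_id="example_image",
    diseases=["example_disease"],
    boxes=[BoxAnnotation("example_finding", 10.0, 20.0, 110.0, 70.0)],
)
```

Under these counts the split works out to roughly 84.7% training and 15.3% test studies.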