PediCXR: An open, large-scale chest radiograph dataset for interpretation of common thoracic diseases in children

Cited by: 0
Authors
Hieu H. Pham
Ngoc H. Nguyen
Thanh T. Tran
Tuan N. M. Nguyen
Ha Q. Nguyen
Affiliations
[1] Smart Health Center
[2] VinBigData JSC
[3] College of Engineering & Computer Science
[4] VinUniversity
[5] VinUni-Illinois Smart Health Center
[6] Phu Tho Department of Health
[7] Training and Direction of Healthcare Activities Center
[8] Phu Tho General Hospital
DOI: not available
Abstract
Computer-aided diagnosis systems for adult chest radiography (CXR) have recently achieved great success thanks to the availability of large-scale annotated datasets and the advent of high-performance supervised learning algorithms. However, the development of models for detecting and diagnosing pediatric diseases in CXR scans has lagged behind because high-quality, physician-annotated datasets are lacking. To overcome this challenge, we introduce and release PediCXR, a new pediatric CXR dataset of 9,125 studies retrospectively collected from a major pediatric hospital in Vietnam between 2020 and 2021. Each scan was manually annotated by a pediatric radiologist with more than ten years of experience. The dataset was labeled for the presence of 36 critical findings and 15 diseases. In particular, each abnormal finding was localized with a rectangular bounding box on the image. To the best of our knowledge, this is the first and largest pediatric CXR dataset containing lesion-level annotations and image-level labels for the detection of multiple findings and diseases. For algorithm development, the dataset was divided into a training set of 7,728 studies and a test set of 1,397 studies. To encourage new advances in pediatric CXR interpretation using data-driven approaches, we provide a detailed description of the PediCXR data sample and make the dataset publicly available at https://physionet.org/content/vindr-pcxr/1.0.0/.
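As an illustration of the structure the abstract describes (image-level disease labels plus lesion-level bounding boxes, and a fixed train/test split), here is a minimal Python sketch. The class names, field names, and the example finding and disease strings are hypothetical and are not taken from the released files; only the three study counts come from the abstract.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Study counts quoted in the abstract.
N_TOTAL, N_TRAIN, N_TEST = 9125, 7728, 1397


@dataclass
class BoxAnnotation:
    """One lesion-level annotation (hypothetical schema): a finding and its box in pixels."""
    finding: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        # Clamp at zero so a degenerate box never yields a negative area.
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


@dataclass
class Study:
    """One CXR study (hypothetical schema) with image-level labels and lesion-level boxes."""
    study_id: str
    diseases: list[str] = field(default_factory=list)
    boxes: list[BoxAnnotation] = field(default_factory=list)

    def is_abnormal(self) -> bool:
        # A study counts as abnormal if it carries any disease label or any box.
        return bool(self.diseases or self.boxes)


def split_fractions(n_train: int, n_test: int) -> tuple[float, float]:
    """Return the (train, test) share of the combined study count."""
    total = n_train + n_test
    return n_train / total, n_test / total


if __name__ == "__main__":
    # The two subsets quoted in the abstract account for every study.
    assert N_TRAIN + N_TEST == N_TOTAL
    train_frac, test_frac = split_fractions(N_TRAIN, N_TEST)
    print(f"train {train_frac:.1%}, test {test_frac:.1%}")

    # Placeholder label strings, used only to exercise the classes.
    example = Study(
        "hypothetical-0001",
        diseases=["example-disease"],
        boxes=[BoxAnnotation("example-finding", 100, 150, 300, 400)],
    )
    print(example.is_abnormal(), example.boxes[0].area())
```

The split works out to roughly 85% training and 15% test studies. The actual annotation file layout should be checked against the dataset documentation on PhysioNet before adapting this sketch.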
Related papers
50 in total
  • [1] PediCXR: An open, large-scale chest radiograph dataset for interpretation of common thoracic diseases in children
    Pham, Hieu H.
    Nguyen, Ngoc H.
    Tran, Thanh T.
    Nguyen, Tuan N. M.
    Nguyen, Ha Q.
    SCIENTIFIC DATA, 2023, 10 (01)
  • [2] Zenseact Open Dataset: A large-scale and diverse multimodal dataset for autonomous driving
    Alibeigi, Mina
    Ljungbergh, William
    Tonderski, Adam
    Hess, Georg
    Lilja, Adam
    Lindstrom, Carl
    Motorniuk, Daria
    Fu, Junsheng
    Widahl, Jenny
    Petersson, Christoffer
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 20121 - 20131
  • [3] A Large-scale Dataset of (Open Source) License Text Variants
    Zacchiroli, Stefano
    2022 MINING SOFTWARE REPOSITORIES CONFERENCE (MSR 2022), 2022, : 757 - 761
  • [4] LSOIE: A Large-Scale Dataset for Supervised Open Information Extraction
    Solawetz, Jacob
    Larson, Stefan
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 2595 - 2600
  • [5] OLKAVS: AN OPEN LARGE-SCALE KOREAN AUDIO-VISUAL SPEECH DATASET
    Park, Jeongkyun
    Hwang, Jung-Wook
    Choi, Kwanghee
    Lee, Seung-Hyeon
    Ahn, Jun Hwan
    Park, Rae-Hong
    Park, Hyung-Min
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 6385 - 6389
  • [6] Talk Funny! A Large-Scale Humor Response Dataset with Chain-of-Humor Interpretation
    Chen, Yuyan
    Yuan, Yichen
    Liu, Panjun
    Liu, Dayiheng
    Guan, Qinghao
    Guo, Mengfei
    Peng, Haiming
    Liu, Bang
    Li, Zhixu
    Xiao, Yanghua
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 17826 - 17834
  • [7] KOLOMVERSE: Korea Open Large-Scale Image Dataset for Object Detection in the Maritime Universe
    Nanda, Abhilasha
    Cho, Sung Won
    Lee, Hyeopwoo
    Park, Jin Hyoung
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, : 20832 - 20840
  • [8] SODA: A large-scale open site object detection dataset for deep learning in construction
    Duan, Rui
    Deng, Hui
    Tian, Mao
    Deng, Yichuan
    Lin, Jiarui
    AUTOMATION IN CONSTRUCTION, 2022, 142
  • [9] Name-Face Association in Web Videos: A Large-Scale Dataset, Baselines, and Open Issues
    陈智能
    杨宗桦
    张炜
    曹娟
    姜育刚
    Journal of Computer Science & Technology, 2014, 29 (05) : 785 - 798
  • [10] AgCNER, the First Large-Scale Chinese Named Entity Recognition Dataset for Agricultural Diseases and Pests
    Yao, Xiaochuang
    Hao, Xia
    Liu, Ruilin
    Li, Lin
    Guo, Xuchao
    SCIENTIFIC DATA, 2024, 11 (01)