ENRICHing medical imaging training sets enables more efficient machine learning
Cited by: 8
Authors:
Chinn, Erin [1]
Arora, Rohit [2]
Arnaout, Ramy [2,3]
Arnaout, Rima [1,4]
Affiliations:
[1] Univ Calif San Francisco, Bakar Computat Hlth Sci Inst, Dept Med, Dept Radiol,Div Cardiol, San Francisco, CA USA
[2] Beth Israel Deaconess Med Ctr, Dept Pathol, Div Clin Pathol, Boston, MA USA
[3] Beth Israel Deaconess Med Ctr, Dept Med, Div Clin Informat, Boston, MA USA
[4] Univ Calif San Francisco, Bakar Computat Hlth Sci Inst, Dept Radiol, Dept Med,Div Cardiol, 521 Parnassus Ave,Rm 6222, San Francisco, CA 94143 USA
Keywords:
deep learning;
medical imaging;
information theory;
instance selection;
data quality;
data efficiency;
DOI:
10.1093/jamia/ocad055
Chinese Library Classification number:
TP [Automation technology, computer technology];
Discipline classification code:
0812;
Abstract:
Objective: Deep learning (DL) has been applied in proofs of concept across biomedical imaging, including across modalities and medical specialties. Labeled data are critical to training and testing DL models, but human expert labelers are limited. In addition, DL traditionally requires copious training data, which is computationally expensive to process and iterate over. Consequently, it is useful to prioritize using those images that are most likely to improve a model's performance, a practice known as instance selection. The challenge is determining how best to prioritize. It is natural to prefer straightforward, robust, quantitative metrics as the basis for prioritization for instance selection. However, in current practice, such metrics are not tailored to, and almost never used for, image datasets.

Materials and Methods: To address this problem, we introduce ENRICH (Eliminate Noise and Redundancy for Imaging Challenges), a customizable method that prioritizes images based on how much diversity each image adds to the training set.

Results: First, we show that medical datasets are special in that in general each image adds less diversity than in nonmedical datasets. Next, we demonstrate that ENRICH achieves nearly maximal performance on classification and segmentation tasks on several medical image datasets using only a fraction of the available images and without up-front data labeling. ENRICH outperforms random image selection, the negative control. Finally, we show that ENRICH can also be used to identify errors and outliers in imaging datasets.

Conclusions: ENRICH is a simple, computationally efficient method for prioritizing images for expert labeling and use in DL.
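
Illustrative sketch (not from the paper): the abstract says images are prioritized by how much diversity each one adds to the training set, but it does not give the similarity metric or the selection procedure. The Python below is therefore only a generic stand-in for that idea, not the ENRICH algorithm itself. It assumes each image has already been reduced to a feature vector, uses cosine similarity as the (assumed) similarity measure, and applies farthest-point selection; the function name `greedy_diverse_subset` and its parameters are hypothetical.

```python
import numpy as np


def greedy_diverse_subset(features, k, seed=0):
    """Return k row indices chosen by farthest-point (max-min) selection.

    Hypothetical helper for illustration only. Each step adds the item
    whose highest cosine similarity to the already-chosen items is lowest,
    i.e. the item that is least redundant with the current subset.
    """
    x = np.asarray(features, dtype=float)
    n = x.shape[0]
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of items")

    # Normalize rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.where(norms == 0, 1.0, norms)

    # Assumption: the starting item is picked at random (seeded).
    rng = np.random.default_rng(seed)
    first = int(rng.integers(n))
    chosen = [first]

    # max_sim[i] = highest similarity between item i and any chosen item.
    max_sim = x @ x[first]
    max_sim[first] = np.inf  # chosen items are never picked again

    while len(chosen) < k:
        nxt = int(np.argmin(max_sim))  # least redundant remaining item
        chosen.append(nxt)
        max_sim = np.maximum(max_sim, x @ x[nxt])
        max_sim[nxt] = np.inf
    return chosen


if __name__ == "__main__":
    # Three identical vectors plus two distinct ones: a diverse subset of
    # size 3 should keep only one of the duplicates.
    demo = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [-1, 0]])
    print(sorted(greedy_diverse_subset(demo, k=3)))
```

In this toy setting no labels are needed to rank the items, which mirrors the abstract's point about avoiding up-front labeling. Any real use would depend on how images are embedded and compared, which the abstract leaves unspecified.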