Perspectives on making big data analytics work for oncology

Cited by: 27
Author
El Naqa, Issam [1]
Affiliation
[1] Univ Michigan, Dept Radiat Oncol, Ann Arbor, MI 48109 USA
Keywords
Big data; Oncology; Machine learning; Clinical decision support; PREDICT RADIATION PNEUMONITIS; DOSE-VOLUME; BAYESIAN NETWORK; NEURAL-NETWORK; RADIOTHERAPY OUTCOMES; TEXTURAL FEATURES; PROSTATE-CANCER; TUMOR RESPONSE; NECK-CANCER; FDG-PET
DOI
10.1016/j.ymeth.2016.08.010
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
Oncology, with its unique combination of clinical, physical, technological, and biological data, provides an ideal case study for applying big data analytics to improve cancer treatment safety and outcomes. An oncology treatment course such as chemoradiotherapy can generate a large pool of information carrying the 5 Vs hallmarks of big data. These data comprise a heterogeneous mixture of patient demographics, radiation/chemo dosimetry, multimodality imaging features, and biological markers generated over a treatment period that can span a few days to several weeks. Efforts using commercial and in-house tools are underway to facilitate data aggregation, ontology creation, sharing, visualization and varying analytics in a secure environment. However, open questions related to proper data structure representation and effective analytics tools to support oncology decision-making need to be addressed. It is recognized that oncology data constitute a mix of structured (tabulated) and unstructured (electronic documents) data that needs to be processed to facilitate searching and subsequent knowledge discovery from relational or NoSQL databases. In this context, methods based on advanced analytics and image feature extraction for oncology applications will be discussed. On the other hand, the classical p (variables) >> n (samples) inference problem of statistical learning is challenged in the big data realm, and this is particularly true for oncology applications, where p-omics is witnessing exponential growth while the number of cancer incidences has generally plateaued over the past 5 years, leading to a quasi-linear growth in samples per patient. Within the big data paradigm, this kind of phenomenon may yield undesirable effects such as echo chamber anomalies, the Yule-Simpson reversal paradox, or misleading ghost analytics.
In this work, we will present these effects as they pertain to oncology and engage small thinking methodologies to counter them, ranging from incorporating prior knowledge and using information-theoretic techniques to modern ensemble machine learning approaches, or a combination of these. We will particularly discuss the pros and cons of different approaches to improve mining of big data in oncology. (C) 2016 Elsevier Inc. All rights reserved.
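The Yule-Simpson reversal named in the abstract can be made concrete with a minimal sketch. The counts below are hypothetical, chosen only so that the reversal appears; they are not taken from the paper or from any patient cohort, and the arm and stratum names are assumptions for illustration. Arm A has the higher response rate within each stratum, yet arm B looks better once the strata are pooled, because the two arms are distributed unevenly across strata.

```python
from fractions import Fraction

# Hypothetical (responders, patients) counts per arm and stratum.
# Illustrative numbers only, not clinical data.
counts = {
    "A": {"early_stage": (81, 87), "late_stage": (192, 263)},
    "B": {"early_stage": (234, 270), "late_stage": (55, 80)},
}


def stratum_rates(arm):
    """Response rate of one arm within each stratum, as exact fractions."""
    return {s: Fraction(r, n) for s, (r, n) in counts[arm].items()}


def pooled_rate(arm):
    """Response rate of one arm after collapsing all strata together."""
    responders = sum(r for r, _ in counts[arm].values())
    patients = sum(n for _, n in counts[arm].values())
    return Fraction(responders, patients)


# A beats B inside every stratum...
a_wins_each_stratum = all(
    stratum_rates("A")[s] > stratum_rates("B")[s] for s in counts["A"]
)
# ...but B beats A when the stratifying variable is ignored.
b_wins_pooled = pooled_rate("B") > pooled_rate("A")

for arm in counts:
    per_stratum = {s: round(float(v), 3) for s, v in stratum_rates(arm).items()}
    print(arm, per_stratum, "pooled:", round(float(pooled_rate(arm)), 3))
print("reversal present:", a_wins_each_stratum and b_wins_pooled)
```

In this sketch the reversal arises because arm A is dominated by the harder late-stage stratum, so a pooled comparison mixes treatment effect with case mix. Stratifying on the confounder, one way of bringing in prior knowledge, recovers the within-stratum ordering.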
Pages: 32-44
Number of pages: 13