Towards algorithmic analytics for large-scale datasets

Cited by: 55
Authors
Bzdok, Danilo [1, 2, 3]
Nichols, Thomas E. [4, 5]
Smith, Stephen M. [4]
Affiliations
[1] Rhein Westfal TH Aachen, Dept Psychiat Psychotherapy & Psychosomat, Aachen, Germany
[2] JARA, Translat Brain Med, Aachen, Germany
[3] CEA Saclay, Neurospin, INRIA, Parietal Team, Gif Sur Yvette, France
[4] Univ Oxford, Wellcome Trust Ctr Integrat Neuroimaging WIN FMRI, Oxford, England
[5] Univ Oxford, Big Data Inst, Oxford, England
Keywords
BAYESIAN-INFERENCE; PERMUTATION TESTS; BRAIN; CONNECTIVITY; MODELS; PARCELLATION; PITFALLS; PRIMER;
DOI
10.1038/s42256-019-0069-5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Classical statistical analysis in many empirical sciences has lagged behind modern trends in analytics for large-scale datasets. The authors discuss the influence of more variables, larger sample sizes, open data sources for analysis and assessment, and 'black box' prediction methods on the empirical sciences, and provide examples from imaging neuroscience. The traditional goal of quantitative analytics is to find simple, transparent models that generate explainable insights. In recent years, large-scale data acquisition enabled, for instance, by brain scanning and genomic profiling with microarray-type techniques, has prompted a wave of statistical inventions and innovative applications. Here we review some of the main trends in learning from 'big data' and provide examples from imaging neuroscience. Some main messages we find are that modern analysis approaches (1) tame complex data with parameter regularization and dimensionality-reduction strategies, (2) are increasingly backed up by empirical model validations rather than justified by mathematical proofs, (3) will compare against and build on open data and consortium repositories, as well as (4) often embrace more elaborate, less interpretable models to maximize prediction accuracy.
Pages: 296 - 306
Number of pages: 11
Related papers
50 in total
  • [11] Special section on large-scale analytics
    Lehner, Wolfgang
    Franklin, Michael J.
    [J]. VLDB JOURNAL, 2012, 21 (05) : 587 - 588
  • [12] Special section on large-scale analytics
    Wolfgang Lehner
    Michael J. Franklin
    [J]. The VLDB Journal, 2012, 21 : 587 - 588
  • [13] NORA: Towards Large-Scale Vehicular Analytics for Driving Environment Monitoring/Assessment
    Grimm, Donald K.
    Bai, Fan
    Chen, Jinzhu
    Yu, Bo
    Saraydar, Cem
    Govindan, Ramesh
    [J]. IEEE OPEN JOURNAL OF VEHICULAR TECHNOLOGY, 2023, 4 : 618 - 632
  • [14] Algorithmic Television in the Age of Large-scale Customization
    Shapiro, Stephen
    [J]. TELEVISION & NEW MEDIA, 2020, 21 (06) : 658 - 663
  • [15] Algorithmic Transparency of Large-Scale *AIDA Programs
    Watanobe, Yutaka
    Mirenkov, Nikolay
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2020, 30 (09) : 1263 - 1288
  • [16] RANSAC-SVM for Large-Scale Datasets
    Nishida, Kenji
    Kurita, Takio
    [J]. 19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008, : 3767 - 3770
  • [17] MedDialog: Large-scale Medical Dialogue Datasets
    Zeng, Guangtao
    Yang, Wenmian
    Ju, Zeqian
    Yang, Yue
    Wang, Sicheng
    Zhang, Ruisi
    Zhou, Meng
    Zeng, Jiaqi
    Dong, Xiangyu
    Zhang, Ruoyu
    Fang, Hongchao
    Zhu, Penghui
    Chen, Shu
    Xie, Pengtao
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 9241 - 9250
  • [18] Map Matching Algorithm for Large-scale Datasets
    Fiedler, David
    Cap, Michal
    Nykl, Jan
    Zilecky, Pavol
    [J]. ICAART: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 3, 2022, : 500 - 508
  • [19] Momentum Online LDA for Large-scale Datasets
    Ouyang, Jihong
    Lu, You
    Li, Ximing
    [J]. 21ST EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2014), 2014, 263 : 1075 - 1076
  • [20] Large-Scale Datasets in Special Education Research
    Griffin, Megan M.
    Steinbrecher, Trisha D.
    [J]. USING SECONDARY DATASETS TO UNDERSTAND PERSONS WITH DEVELOPMENTAL DISABILITIES AND THEIR FAMILIES, 2013, 45 : 155 - 183