Information Splitting for Big Data Analytics

Cited by: 9
Authors
Zhu, Shengxin [1 ]
Gu, Tongxiang [1 ]
Xu, Xiaowen [1 ]
Mo, Zeyao [1 ]
Affiliation
[1] Inst Appl Phys & Computat Math, Lab Computat Phys, POB 8009, Beijing 100088, Peoples R China
Keywords
Observed information matrix; Fisher information matrix; Fisher scoring algorithm; linear mixed model; breeding model; geno-wide-association; variance parameter estimation; GENOME-WIDE ASSOCIATION; LINEAR MIXED MODELS; ALGORITHM;
DOI
10.1109/CyberC.2016.64
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Many statistical models require estimation of unknown (co)variance parameters. The estimates are usually obtained by maximizing a log-likelihood that involves log-determinant terms. In principle, the Newton method requires the observed information (the negative Hessian matrix, or second derivative, of the log-likelihood) to obtain an accurate maximum likelihood estimator. Replacing the observed information with the Fisher information, its expected value, gives the Fisher scoring algorithm, which is simpler than the Newton method. With advances in high-throughput technologies in the biological sciences, recommendation systems, and social networks, the sizes of data sets, and of the corresponding statistical models, have increased by several orders of magnitude. Neither the observed information nor the Fisher information is easy to obtain for these big data sets. This paper introduces an information splitting technique to simplify the computation. After splitting the mean of the observed information and the Fisher information, a simpler approximate Hessian matrix for the log-likelihood can be obtained. This approximate Hessian matrix can significantly reduce computation and makes the linear mixed model applicable to big data sets. The splitting and the simpler formulas depend heavily on matrix algebra transforms and are applicable to large-scale breeding models and genome-wide association analysis.
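The relation between the three information matrices named in the abstract can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes a restricted (REML) log-likelihood for a linear mixed model whose covariance V = sum_i theta_i * V_i is linear in the variance parameters, in which case the mean of the observed and Fisher information reduces to a quadratic form with no trace term. The function names and the toy data are hypothetical.

```python
import numpy as np


def projection(V, X):
    """P = V^-1 - V^-1 X (X' V^-1 X)^-1 X' V^-1 (assumes V symmetric, X full column rank)."""
    Vinv = np.linalg.inv(V)
    VinvX = Vinv @ X
    return Vinv - VinvX @ np.linalg.solve(X.T @ VinvX, VinvX.T)


def information_matrices(y, X, dV, theta):
    """Return (observed, fisher, average) information for V = sum_i theta[i] * dV[i].

    Assumption: REML log-likelihood with V linear in theta, so that
      fisher[i, j]   = 0.5 * tr(P V_i P V_j)
      observed[i, j] = y' P V_i P V_j P y - 0.5 * tr(P V_i P V_j)
      average[i, j]  = 0.5 * y' P V_i P V_j P y
    """
    V = sum(t * Vi for t, Vi in zip(theta, dV))
    P = projection(V, X)
    Py = P @ y
    k = len(dV)
    fisher = np.empty((k, k))
    quad = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            # Trace term: the expensive part for large data sets.
            fisher[i, j] = 0.5 * np.trace(P @ dV[i] @ P @ dV[j])
            # Quadratic form: only matrix-vector style quantities in y.
            quad[i, j] = Py @ dV[i] @ P @ dV[j] @ Py
    observed = quad - fisher
    average = 0.5 * quad
    return observed, fisher, average


# Toy example with made-up data: one random effect plus residual variance.
rng = np.random.default_rng(0)
n, q = 30, 5
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
Z = rng.standard_normal((n, q))
y = rng.standard_normal(n)
dV = [Z @ Z.T, np.eye(n)]  # derivatives of V with respect to each variance parameter

observed, fisher, average = information_matrices(y, X, dV, theta=[0.5, 1.0])
print("max |(observed + fisher)/2 - average| =",
      np.abs(0.5 * (observed + fisher) - average).max())
```

Under these assumptions the trace terms cancel in the mean, which is why the averaged matrix is cheaper to form than either the observed or the Fisher information. The sketch forms dense inverses for clarity only; it makes no attempt at the large-scale matrix transforms the abstract refers to.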
Pages: 294-302
Number of pages: 9