On Optimal Data Compression in Multiterminal Statistical Inference

Cited by: 11
Authors
Amari, Shun-ichi [1]
Affiliations
[1] RIKEN Brain Sci Inst, Wako, Saitama 3510198, Japan
Keywords
Data compression; Fisher information; linear-threshold encoding; multiterminal source; multiterminal statistical inference; INFORMATION;
DOI
10.1109/TIT.2011.2162270
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The multiterminal theory of statistical inference deals with the problem of estimating or testing the correlation of letters generated from two (or many) correlated information sources under the restriction of a certain transmission rate for each source. A typical example is two binary sources with joint probability p(x, y), where the correlation of x and y is to be tested or estimated. Given n i.i.d. observations x^n = x_1 ... x_n and y^n = y_1 ... y_n, only k = rn (0 < r < 1) bits of each can be transmitted to a common destination. What is the optimal data compression for statistical inference? A simple idea is to send the first k letters of x^n and y^n. A simpler problem is the helper case, where the optimal data compression of x^n is sought under the condition that all of y^n is transmitted. It is a long-standing problem to determine whether there is a better data compression scheme than this simple scheme of sending the first k letters. The present paper searches for the optimal data compression within the framework of linear-threshold encoding and shows that there is a better data compression scheme, depending on the value of the correlation. To this end, we evaluate the Fisher information in the class of linear-threshold compression schemes. It is also proved that the simple scheme is optimal when x and y are independent or their correlation is not too large.
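As a rough illustration of the baseline that the abstract calls the simple scheme, the sketch below computes the Fisher information about the correlation carried by the first k = rn letter pairs. The parametrization is an assumption made only for this example and is not taken from the paper: x and y take values in {-1, +1} with uniform marginals and p(x, y) = (1 + rho*x*y)/4. The function names are hypothetical. It does not implement linear-threshold encoding.

```python
from itertools import product


def joint_pmf(rho):
    """Assumed model: x, y in {-1, +1}, p(x, y) = (1 + rho*x*y) / 4."""
    return {(x, y): (1 + rho * x * y) / 4
            for x, y in product((-1, 1), repeat=2)}


def fisher_info_per_letter(rho):
    """Fisher information about rho in one (x, y) pair.

    For this model d p(x, y) / d rho = x*y / 4, so the sum of
    (dp/drho)^2 / p over the four outcomes equals 1 / (1 - rho^2).
    """
    pmf = joint_pmf(rho)
    return sum((x * y / 4) ** 2 / p for (x, y), p in pmf.items())


def fisher_info_first_k(rho, n, r):
    """Fisher information when only the first k = floor(r*n) pairs are sent.

    The pairs are i.i.d., so the information adds up across letters.
    """
    k = int(r * n)
    return k * fisher_info_per_letter(rho)


if __name__ == "__main__":
    for rho in (0.0, 0.3, 0.6):
        print(rho, fisher_info_first_k(rho, n=1000, r=0.5))
```

Under these assumptions the baseline grows linearly in k, which is the quantity any alternative compression scheme would have to exceed at the same rate r.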
Pages: 5577 - 5587
Page count: 11