On Optimal Data Compression in Multiterminal Statistical Inference

Cited by: 11
Authors
Amari, Shun-ichi [1]
Affiliation
[1] RIKEN Brain Sci Inst, Wako, Saitama 3510198, Japan
Keywords
Data compression; Fisher information; linear-threshold encoding; multiterminal source; multiterminal statistical inference; INFORMATION;
DOI
10.1109/TIT.2011.2162270
Chinese Library Classification
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
The multiterminal theory of statistical inference deals with the problem of estimating or testing the correlation of letters generated from two (or more) correlated information sources under a restriction on the transmission rate of each source. A typical example is two binary sources with joint probability p(x, y), where the correlation of x and y is to be tested or estimated. Given n i.i.d. observations x^n = x_1 ... x_n and y^n = y_1 ... y_n, only k = rn (0 < r < 1) bits of each can be transmitted to a common destination. What is the optimal data compression for statistical inference? A simple idea is to send the first k letters of x^n and y^n. A simpler problem is the helper case, where the optimal data compression of x^n is sought under the condition that all of y^n is transmitted. It is a long-standing problem to determine whether there is a better data compression scheme than this simple scheme of sending the first k letters. The present paper searches for the optimal data compression within the framework of linear-threshold encoding and shows that a better data compression scheme exists, depending on the value of the correlation. To this end, we evaluate the Fisher information in the class of linear-threshold compression schemes. It is also proved that the simple scheme is optimal when x and y are independent or their correlation is not too large.
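The simple scheme named in the abstract (each terminal sends only its first k = rn letters) can be illustrated with a minimal sketch. The model here is an assumption made for illustration and is not taken from the paper: symmetric binary sources with letters in {-1, +1}, uniform marginals, and p(x, y) = (1 + rho*x*y)/4, so that E[xy] = rho. Under this assumed model, the k retained pairs carry Fisher information k / (1 - rho^2) about rho. The function names are hypothetical, and the linear-threshold encoders studied in the paper are not implemented.

```python
import random


def sample_sources(n, rho, rng):
    """Draw n iid pairs from the ASSUMED model p(x, y) = (1 + rho*x*y)/4,
    with x, y in {-1, +1}. Returns the two sequences separately, as each
    terminal observes only its own sequence."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.choice((-1, 1))
        # Under the assumed model, y equals x with probability (1 + rho)/2.
        y = x if rng.random() < (1 + rho) / 2 else -x
        xs.append(x)
        ys.append(y)
    return xs, ys


def first_k_estimate(xs, ys, r):
    """Simple scheme: each terminal sends its first k = floor(r*n) letters.
    The destination estimates rho by the sample mean of x_i * y_i."""
    k = int(r * len(xs))
    sent_x, sent_y = xs[:k], ys[:k]  # what each terminal transmits
    rho_hat = sum(x * y for x, y in zip(sent_x, sent_y)) / k
    return rho_hat, k


def fisher_information_first_k(k, rho):
    """Fisher information about rho in k retained pairs under the assumed
    model: each product x*y is +1 with probability (1 + rho)/2, which
    gives 1 / (1 - rho^2) per pair."""
    return k / (1 - rho ** 2)


if __name__ == "__main__":
    rng = random.Random(0)
    xs, ys = sample_sources(10_000, 0.3, rng)
    rho_hat, k = first_k_estimate(xs, ys, r=0.5)
    print(f"k = {k}, estimate of rho = {rho_hat:.3f}")
    print(f"Fisher information (first-k scheme) = {fisher_information_first_k(k, 0.3):.1f}")
```

In this sketch the information grows linearly in k, which is the baseline any alternative compression scheme would have to beat at the same rate r.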
Pages: 5577-5587
Page count: 11
Related papers
50 in total
  • [41] Statistical Inference, Learning and Models in Big Data
    Franke, Beate
    Plante, Jean-Francois
    Roscher, Ribana
    Lee, En-Shiun Annie
    Smyth, Cathal
    Hatefi, Armin
    Chen, Fuqi
    Gil, Einat
    Schwing, Alexander
    Selvitella, Alessandro
    Hoffman, Michael M.
    Grosse, Roger
    Hendricks, Dieter
    Reid, Nancy
    INTERNATIONAL STATISTICAL REVIEW, 2016, 84 (03) : 371 - 389
  • [42] Linear statistical inference for random fuzzy data
    Nather, W
    STATISTICS, 1997, 29 (03) : 221 - 240
  • [43] Probability distributions and statistical inference for axial data
    Barry C. Arnold
    Ashis SenGupta
    Environmental and Ecological Statistics, 2006, 13 : 271 - 285
  • [44] Statistical inference on series of atmospheric chemistry data
    Mohapl, J
    ENVIRONMENTAL AND ECOLOGICAL STATISTICS, 2000, 7 (04) : 357 - 384
  • [45] Statistical inference for serial dilution assay data
    Lee, MLT
    Whitmore, GA
    BIOMETRICS, 1999, 55 (04) : 1215 - 1220
  • [46] Statistical Inference for Grouped Field Failure Data
    Chen, Piao
    Ye, Zhi-Sheng
    THEORY AND PRACTICE OF QUALITY AND RELIABILITY ENGINEERING IN ASIA INDUSTRY, 2017, : 233 - 247
  • [47] Statistical inference in dynamic panel data models
    Lai, Tze Leung
    Small, Dylan S.
    Liu, Jia
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2008, 138 (09) : 2763 - 2776
  • [48] Statistical inference on series of atmospheric chemistry data*
    Jaroslav Mohapl
    Environmental and Ecological Statistics, 2000, 7 : 357 - 384
  • [49] Bayesian statistical inference based on rounded data
    Zhao, Ningning
    Bai, Zhidong
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2020, 49 (01) : 135 - 146
  • [50] Probability distributions and statistical inference for axial data
    Arnold, Barry C.
    SenGupta, Ashis
    ENVIRONMENTAL AND ECOLOGICAL STATISTICS, 2006, 13 (03) : 271 - 285