Preservation of Statistically Significant Patterns in Multiresolution 0-1 Data

Cited by: 0
Authors
Adhikari, Prem Raj [1]
Hollmen, Jaakko [1]
Affiliations
[1] Aalto Univ, Sch Sci & Technol, Dept Informat & Comp Sci, FI-00076 Espoo, Finland
Source
Keywords
Multiresolution data; statistical significance; frequent itemset; mixture modelling
DOI
Not available
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Measurements in biology are made with high-throughput and high-resolution techniques, often resulting in data in multiple resolutions. Currently available standard algorithms can only handle data in one resolution. Generative models such as mixture models are often used to model such data. However, the significance of the patterns generated by generative models has so far received inadequate attention. This paper analyses the statistical significance of the patterns preserved when sampling between different resolutions and when sampling from a generative model. Furthermore, we study the effect of noise on the likelihood with respect to changing resolution and sample size. A finite mixture of multivariate Bernoulli distributions is used to model amplification patterns in cancer in multiple resolutions. Statistically significant itemsets are identified, using randomization, in the original data and in data sampled from the generative models, and their relationships are studied. The results show that statistically significant itemsets are effectively preserved by mixture models. The preservation is more accurate in the coarse resolution than in the finer resolution. Furthermore, the effect of noise is greater on data in higher resolution and with a smaller sample size than on data in lower resolution and with a larger sample size.
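The pipeline the abstract describes (sample 0-1 data from a finite mixture of multivariate Bernoulli distributions, then assess an itemset by randomization) can be illustrated with a minimal sketch. The abstract does not state which randomization scheme the paper uses, so the independent column shuffling below is an assumption, as are all function names, the mixture parameters, and the sample sizes; this is not the authors' implementation.

```python
import random

def sample_bernoulli_mixture(weights, thetas, n, rng):
    """Draw n binary vectors from a finite mixture of multivariate Bernoullis."""
    rows = []
    for _ in range(n):
        # Pick a component according to the mixing weights.
        j = rng.choices(range(len(weights)), weights=weights)[0]
        # Within a component, each dimension is an independent Bernoulli draw.
        rows.append([1 if rng.random() < p else 0 for p in thetas[j]])
    return rows

def support(rows, itemset):
    """Number of rows in which every item (column index) of the itemset is 1."""
    return sum(all(row[i] for i in itemset) for row in rows)

def empirical_p_value(rows, itemset, n_rand, rng):
    """Share of column-shuffled datasets whose support reaches the observed one."""
    observed = support(rows, itemset)
    cols = [list(c) for c in zip(*rows)]
    hits = 0
    for _ in range(n_rand):
        # Assumed null model: shuffling each column independently keeps item
        # frequencies but breaks co-occurrence between items.
        shuffled = [rng.sample(c, len(c)) for c in cols]
        if support(list(zip(*shuffled)), itemset) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_rand + 1)

# Hypothetical two-component mixture over four items.
rng = random.Random(0)
weights = [0.5, 0.5]
thetas = [[0.9, 0.9, 0.1, 0.1], [0.1, 0.1, 0.9, 0.9]]
data = sample_bernoulli_mixture(weights, thetas, 200, rng)
p = empirical_p_value(data, (0, 1), 200, rng)
```

Comparing the itemsets that pass such a test in the original data with those that pass in data sampled from the fitted mixture is one way to read the "preservation" the abstract refers to.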
Pages: 86-97
Number of pages: 12
Related papers
50 in total
  • [1] Mixture Models from Multiresolution 0-1 Data
    Adhikari, Prem Raj
    Hollmen, Jaakko
    DISCOVERY SCIENCE, 2013, 8140 : 1 - 16
  • [2] 0-1 laws by preservation
    Lacoste, T
    THEORETICAL COMPUTER SCIENCE, 1997, 184 (1-2) : 237 - 245
  • [3] Permutation Structure in 0-1 Data
    Mannila, Heikki
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT I, 2011, 6911 : 7 - 7
  • [4] Mining Statistically Significant Sequential Patterns
    Low-Kam, Cecile
    Raissi, Chedy
    Kaytoue, Mehdi
    Pei, Jian
    2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2013, : 488 - 497
  • [5] Statistically Significant Discriminative Patterns Searching
    Pham, Hoang Son
    Virlet, Gwendal
    Lavenier, Dominique
    Termier, Alexandre
    BIG DATA ANALYTICS AND KNOWLEDGE DISCOVERY, DAWAK 2019, 2019, 11708 : 105 - 115
  • [6] Geometric and combinatorial tiles in 0-1 data
    Gionis, A
    Mannila, H
    Seppänen, JK
    KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2004, PROCEEDINGS, 2004, 3202 : 173 - 184
  • [7] The 0-1 knapsack problem with fuzzy data
    Adam Kasperski
    Michał Kulej
    Fuzzy Optimization and Decision Making, 2007, 6 : 163 - 172
  • [8] The 0-1 knapsack problem with fuzzy data
    Kasperski, Adam
    Kulej, Michal
    FUZZY OPTIMIZATION AND DECISION MAKING, 2007, 6 (02) : 163 - 172
  • [9] On the 0-1 matrices whose squares are 0-1 matrices
    Wu, Honglin
    LINEAR ALGEBRA AND ITS APPLICATIONS, 2010, 432 (11) : 2909 - 2924
  • [10] Finding checkerboard patterns via fractional 0-1 programming
    Trapp, Andrew
    Prokopyev, Oleg A.
    Busygin, Stanislav
    JOURNAL OF COMBINATORIAL OPTIMIZATION, 2010, 20 (01) : 1 - 26