Non-redundant data clustering

Cited by: 27
Authors
Gondek, D [1]
Hofmann, T [1]
Affiliation
[1] Brown Univ, Dept Comp Sci, Providence, RI 02912 USA
Keywords
DOI
10.1109/ICDM.2004.10104
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data clustering is a popular approach for automatically finding classes, concepts, or groups of patterns. In practice this discovery process should avoid redundancies with existing knowledge about class structures or groupings, and reveal novel, previously unknown aspects of the data. In order to deal with this problem, we present an extension of the information bottleneck framework, called coordinated conditional information bottleneck, which takes negative relevance information into account by maximizing a conditional mutual information score subject to constraints. Algorithmically, one can apply an alternating optimization scheme that can be used in conjunction with different types of numeric and non-numeric attributes. We present experimental results for applications in text mining and computer vision.
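The conditional mutual information score mentioned in the abstract can be illustrated with a small sketch. The abstract does not fix notation, so the symbols here are assumptions: C stands for the cluster assignment, Y for the relevant features, and Z for the already-known grouping. The function below only evaluates I(C;Y|Z) for a given joint distribution; it is not the authors' alternating optimization scheme and does not model their constraints.

```python
import math
from collections import defaultdict

def conditional_mutual_information(joint):
    """Return I(C;Y|Z) in nats.

    `joint` maps (c, y, z) tuples to probabilities that sum to 1.
    The names c, y, z are illustrative assumptions, not the paper's notation.
    """
    p_z = defaultdict(float)
    p_cz = defaultdict(float)
    p_yz = defaultdict(float)
    # Accumulate the marginals p(z), p(c,z), and p(y,z).
    for (c, y, z), p in joint.items():
        p_z[z] += p
        p_cz[(c, z)] += p
        p_yz[(y, z)] += p
    total = 0.0
    # I(C;Y|Z) = sum p(c,y,z) * log[ p(c,y,z) p(z) / (p(c,z) p(y,z)) ]
    for (c, y, z), p in joint.items():
        if p > 0.0:
            total += p * math.log(p * p_z[z] / (p_cz[(c, z)] * p_yz[(y, z)]))
    return total

# Toy data: Y is uniform on {0, 1, 2, 3}; the known grouping is Z = Y // 2.
# A clustering that copies Z is redundant; one that uses Y % 2 is novel.
redundant = {(y // 2, y, y // 2): 0.25 for y in range(4)}
novel = {(y % 2, y, y // 2): 0.25 for y in range(4)}

score_redundant = conditional_mutual_information(redundant)
score_novel = conditional_mutual_information(novel)
```

In this toy case the clustering that merely repeats the known grouping scores zero, while the clustering that captures the remaining structure scores log 2 nats, which is the kind of preference a conditional score is meant to express.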
Pages: 75-82
Page count: 8
Related papers
50 in total
  • [21] Non-redundant and redundant post coding in OFDM systems
    Shah, S. F. A.
    Tewfik, A. H.
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 4407 - 4410
  • [22] Hierarchical control in redundant and non-redundant postural tasks
    James, Eric G.
    Newell, Karl M.
    [J]. HUMAN MOVEMENT SCIENCE, 2011, 30 (06) : 1167 - 1184
  • [23] Redundant versus non-redundant apertures for exoplanet detection
    Aime, C
    Soummer, R
    [J]. ASTRONOMY WITH HIGH CONTRAST IMAGING: FROM PLANETARY SYSTEMS TO ACTIVE GALACTIC NUCLEI, 2003, 8 : 353 - 359
  • [24] Broom: application for non-redundant storage of high throughput sequencing data
    Albayrak, Levent
    Khanipov, Kamil
    Golovko, George
    Fofanov, Yuriy
    [J]. BIOINFORMATICS, 2019, 35 (01) : 143 - 145
  • [25] MEASUREMENT OF HYDROPHOBICITY DISTRIBUTION IN PROTEINS - NON-REDUNDANT PROTEIN DATA BANK
    Kinga, Salapa
    Kalinowska, Barbara
    Jadczyk, Tomasz
    Roterman, Irena
    [J]. BIO-ALGORITHMS AND MED-SYSTEMS, 2012, 8 (03) : 327 - 337
  • [26] Cloud data modelling employing a unified, non-redundant triangular mesh
    Sun, W
    Bradley, C
    Zhang, YF
    Loh, HT
    [J]. COMPUTER-AIDED DESIGN, 2001, 33 (02) : 183 - 193
  • [27] Micromorphic continua: non-redundant formulations
    Romano, Giovanni
    Barretta, Raffaele
    Diaco, Marina
    [J]. CONTINUUM MECHANICS AND THERMODYNAMICS, 2016, 28 (06) : 1659 - 1670
  • [28] Nonnegative non-redundant tensor decomposition
    Olexiy Kyrgyzov
    Deniz Erdogmus
    [J]. Frontiers of Mathematics in China, 2013, 8 : 41 - 61
  • [29] Discovering statistically non-redundant subgroups
    Li, Jiuyong
    Liu, Jixue
    Toivonen, Hannu
    Satou, Kenji
    Sun, Youqiang
    Sun, Bingyu
    [J]. KNOWLEDGE-BASED SYSTEMS, 2014, 67 : 315 - 327
  • [30] A non-redundant data set of nanobody-antigen crystal structures
    Zavrtanik, Uros
    Hadzi, San
    [J]. DATA IN BRIEF, 2019, 24