A probabilistic theory of clustering

Cited by: 44
Authors
Dougherty, ER [1]
Brun, M
Affiliations
[1] TAMU, Dept Elect Engn, College Stn, TX 77840 USA
[2] Univ Texas, MD Anderson Canc Ctr, Dept Pathol, Houston, TX USA
Keywords
clustering; gene expression; microarray; statistical learning; inference; random sets; point processes
DOI
10.1016/j.patcog.2003.10.003
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Data clustering is typically considered a subjective process, which makes it problematic. For instance, how does one make statistical inferences based on clustering? The matter is different with pattern classification, for which two fundamental characteristics can be stated: (1) the error of a classifier can be estimated using "test data," and (2) a classifier can be learned using "training data." This paper presents a probabilistic theory of clustering, including both learning (training) and error estimation (testing). The theory is based on operators on random labeled point processes. It includes an error criterion in the context of random point sets and representation of the Bayes (optimal) cluster operator for a given random labeled point process. Training is illustrated using a nearest-neighbor approach, and trained cluster operators are compared to several classical clustering algorithms. (C) 2003 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
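The error criterion mentioned in the abstract can be made concrete with a small sketch. This is an illustration under stated assumptions, not the paper's definition: it assumes the error of a cluster operator on one labeled point set is the fraction of mislabeled points, minimized over relabelings of the clusters, and that the operator's error is estimated by averaging over labeled point sets drawn from the process (the "test data" analogue). The names `label_mismatch_error`, `estimate_operator_error`, `sample_process`, and `cluster_operator` are hypothetical.

```python
from itertools import permutations


def label_mismatch_error(true_labels, cluster_labels, num_labels):
    """Fraction of mislabeled points, minimized over cluster relabelings.

    Assumption: cluster indices carry no meaning of their own, so every
    permutation of the label set is tried and the best match is kept.
    """
    n = len(true_labels)
    best = n
    for perm in permutations(range(num_labels)):
        # perm[c] is the true label that cluster index c is mapped to
        mismatches = sum(
            1 for t, c in zip(true_labels, cluster_labels) if t != perm[c]
        )
        best = min(best, mismatches)
    return best / n


def estimate_operator_error(sample_process, cluster_operator, num_labels,
                            num_trials=100):
    """Average the mismatch error over labeled point sets from the process.

    sample_process() is assumed to return (points, true_labels);
    cluster_operator(points) is assumed to return one label per point.
    """
    total = 0.0
    for _ in range(num_trials):
        points, labels = sample_process()
        total += label_mismatch_error(labels, cluster_operator(points),
                                      num_labels)
    return total / num_trials
```

The exhaustive search over permutations is only practical for a small number of clusters; for more labels an assignment solver would be the usual substitute.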
Pages: 917-925
Page count: 9