Clustering workflow requirements using compression dissimilarity measure

Citations: 0
Authors
Wei, Li [1]
Handley, John
Martin, Nathaniel
Sun, Tong
Keogh, Eamonn
Affiliations
[1] Univ Calif Riverside, Dept Comp Sci, Riverside, CA 92521 USA
[2] Xerox Corp, Stamford, CT 06902 USA
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Xerox offers a bewildering array of printers and software configurations to satisfy the needs of production print shops. A configuration tool in the hands of sales analysts elicits requirements from customers and recommends a list of product configurations. This tool generates special question and answer case logs that provide useful historical data. Given the unusual semi-structured question and answer format, this data is not amenable to any standard document clustering method. We discovered that a hierarchical agglomerative approach using a compression-based dissimilarity measure (CDM) provided readily interpretable clusters. We compare this method empirically to two reasonable alternatives, latent semantic analysis and probabilistic latent semantic analysis, and conclude that CDM offers an accurate and easily implemented approach to validate and augment our configuration tool.
Pages: 50-54
Page count: 5
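
The approach named in the abstract, hierarchical agglomerative clustering over a compression-based dissimilarity measure, can be illustrated with a minimal sketch. The record does not give the formula, the compressor, or the linkage rule, so everything below is an assumption: CDM is taken in the form commonly given in the literature, CDM(x, y) = C(xy) / (C(x) + C(y)), where C is the compressed size; `zlib` stands in for the compressor; average linkage is used for merging; and the function names and sample logs are hypothetical.

```python
import zlib
from itertools import combinations


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (compressor is an assumption)."""
    return len(zlib.compress(data, 9))


def cdm(x: bytes, y: bytes) -> float:
    """Assumed CDM form: C(xy) / (C(x) + C(y)).

    Values near 0.5 suggest the two inputs share most of their structure;
    values near 1 suggest they share little.
    """
    return compressed_size(x + y) / (compressed_size(x) + compressed_size(y))


def agglomerate(docs: list[bytes], k: int) -> list[list[int]]:
    """Naive average-linkage agglomerative clustering down to k clusters.

    Returns clusters as sorted lists of document indices.
    """
    # Precompute pairwise dissimilarities, keyed by (smaller index, larger index).
    dist = {
        (i, j): cdm(docs[i], docs[j])
        for i, j in combinations(range(len(docs)), 2)
    }

    def linkage(a: list[int], b: list[int]) -> float:
        # Mean dissimilarity over all cross-cluster pairs.
        total = sum(dist[min(i, j), max(i, j)] for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [[i] for i in range(len(docs))]
    while len(clusters) > k:
        # Merge the two clusters with the smallest average dissimilarity.
        p, q = min(
            combinations(range(len(clusters)), 2),
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        merged = clusters[p] + clusters[q]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (p, q)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical question-and-answer case logs, two per "kind" of customer.
logs = [
    b"Q: duplex printing? A: yes\n" * 30 + b"Q: color? A: no\n",
    b"Q: duplex printing? A: yes\n" * 30 + b"Q: color? A: yes\n",
    b"Q: monthly volume? A: 500000 pages\n" * 30 + b"Q: finishing? A: stapling\n",
    b"Q: monthly volume? A: 500000 pages\n" * 30 + b"Q: finishing? A: binding\n",
]
clusters = agglomerate(logs, k=2)
```

Because the measure works on raw bytes, it needs no tokenization or feature extraction, which is one plausible reason it suits semi-structured logs. The quadratic number of compressions in `agglomerate` is acceptable for a sketch but would need caching or sampling on a large collection.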