Multivariate Weibull mixtures with proportional hazard restrictions for dwell-time-based session clustering with incomplete data

Cited by: 7
Authors
Mair, Patrick [1]
Hudec, Marcus [2]
Affiliations
[1] Vienna Univ Econ & Business Adm, Dept Math & Stat, A-1090 Vienna, Austria
[2] Univ Vienna, A-1010 Vienna, Austria
Keywords
EM algorithm; Incomplete data; Proportional hazards; Web usage mining; Weibull mixture models; Model; Behavior; Distributions
DOI
10.1111/j.1467-9876.2009.00665.x
Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Starting from classical Weibull mixture models, we propose a framework for clustering survival data with various more parsimonious models by imposing restrictions on the distributional parameters. We show that these restrictions on the Weibull mixtures correspond to different proportional hazard restrictions across mixture components and Web page areas. A parametric cluster approach based on the EM algorithm is carried out on a multivariate data set. Our model set-up encompasses incomplete-data structures as well as censored observations. We apply the methodology to retail data stemming from a global e-commerce company. Sessions are clustered with respect to the dwell times that a user spends on certain page areas. The cluster solution that is found allows for a detailed examination of the navigation behaviour in terms of the hazard and survivor functions within each component.
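The link between parameter restrictions and proportional hazards can be illustrated with a minimal sketch. It assumes the common shape/scale parameterization of the Weibull distribution, which the abstract does not specify, and it treats a single page area with two hypothetical mixture components rather than the multivariate model of the paper. All function names and numeric values are illustrative assumptions. Under this parameterization, forcing two components to share one shape parameter makes their hazard ratio constant in time. The sketch also shows how a right-censored dwell time might enter an EM E-step through the survivor function instead of the density.

```python
import math


def weibull_hazard(t, shape, scale):
    """Hazard h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_survivor(t, shape, scale):
    """Survivor function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def contribution(t, shape, scale, censored):
    """Likelihood contribution of one dwell time.

    A right-censored time contributes S(t); an observed time
    contributes the density f(t) = h(t) * S(t).
    """
    s = weibull_survivor(t, shape, scale)
    return s if censored else weibull_hazard(t, shape, scale) * s


def responsibilities(t, censored, weights, shapes, scales):
    """E-step: posterior component probabilities for one dwell time."""
    joint = [
        w * contribution(t, k, lam, censored)
        for w, k, lam in zip(weights, shapes, scales)
    ]
    total = sum(joint)
    return [j / total for j in joint]


# Two hypothetical components restricted to a common shape parameter.
SHAPE = 1.5
SCALES = (10.0, 40.0)

# With a shared shape, h1(t) / h2(t) = (scale2 / scale1) ** shape for all t.
hazard_ratios = [
    weibull_hazard(t, SHAPE, SCALES[0]) / weibull_hazard(t, SHAPE, SCALES[1])
    for t in (1.0, 5.0, 60.0)
]

# A long censored dwell time is attributed mostly to the slower component.
long_censored = responsibilities(
    100.0, True, weights=[0.5, 0.5], shapes=[SHAPE, SHAPE], scales=SCALES
)
```

In this toy setting the hazard ratio equals (40 / 10) ** 1.5 at every time point, which is the proportional hazards property. If the shapes were allowed to differ, the ratio would vary with t and the restriction would no longer hold.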
Pages: 619-639
Number of pages: 21