Symmetrization and overfitting in probabilistic latent semantic analysis

Author
Leksin V.A. [1]
Affiliation
[1] Moscow Institute of Physics and Technology, Dolgoprudnyi, Moscow oblast 141700
Funding
Russian Foundation for Basic Research
Keywords
Collaborative filtering; Customer environment analysis; Latent profiles; Overfitting; Probabilistic latent semantic analysis; Symmetric models
DOI
10.1134/S1054661809040014
Abstract
An algorithm is proposed for revealing users' latent interests from an observable log of user behavior, e.g., site visits. The algorithm combines ideas from customer environment analysis and probabilistic latent semantic analysis. A quality criterion based on the classification of previously labeled sites is introduced to optimize the algorithm's parameters and to compare algorithms. Experiments show that quality attains an optimum with respect to the algorithm's essential parameters; however, attempting to optimize them too precisely can lead to overfitting. © 2009 Pleiades Publishing, Ltd.
Pages: 565-574
Number of pages: 9