Probabilistic topic models for sequence data

Cited by: 21
Authors
Barbieri, Nicola [1 ]
Manco, Giuseppe [2 ]
Ritacco, Ettore [2 ]
Carnuccio, Marco [3 ]
Bevacqua, Antonio [3 ]
Affiliations
[1] Yahoo Res, Barcelona, Spain
[2] Italian Natl Res Council, Inst High Performance Comp & Networks ICAR, I-87036 Arcavacata Di Rende, CS, Italy
[3] Univ Calabria, Dept Elect Informat & Syst, I-87036 Arcavacata Di Rende, CS, Italy
Keywords
Recommender systems; Collaborative filtering; Probabilistic topic models; Performance;
DOI
10.1007/s10994-013-5391-2
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Probabilistic topic models are widely used in different contexts to uncover the hidden structure in large text corpora. One of the main (and perhaps strongest) assumptions of these models is that the generative process follows a bag-of-words assumption, i.e., each token is independent of the previous one. We extend the popular Latent Dirichlet Allocation model by exploiting three different conditional Markovian assumptions: (i) the token generation depends on the current topic and on the previous token; (ii) the topic associated with each observation depends on the topic associated with the previous one; (iii) the token generation depends on the current and previous topic. For each of these modeling assumptions we present a Gibbs sampling procedure for parameter estimation. Experimental evaluation over real-world data shows the performance advantages, in terms of recall and precision, of the sequence-modeling approaches.
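A minimal notational sketch of the three factorizations described in the abstract, assuming the usual LDA notation ($w_n$ for the $n$-th token, $z_n$ for its topic assignment, $\theta_d$ for the document's topic mixture); the symbols are ours and are not taken from the paper itself. Standard LDA draws each token via $p(z_n \mid \theta_d)\, p(w_n \mid z_n)$; the three Markovian variants instead condition one factor on the preceding position:
(i) $p(w_n \mid z_n, w_{n-1})$ : the token depends on the current topic and the previous token;
(ii) $p(z_n \mid z_{n-1})$ : the topic depends on the topic of the previous token;
(iii) $p(w_n \mid z_n, z_{n-1})$ : the token depends on both the current and the previous topic.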
Pages: 5-29
Number of pages: 25
Related papers
50 records in total
  • [1] Probabilistic topic models for sequence data
    Nicola Barbieri
    Giuseppe Manco
    Ettore Ritacco
    Marco Carnuccio
    Antonio Bevacqua
    [J]. Machine Learning, 2013, 93 : 5 - 29
  • [2] Probabilistic Topic Models for Text Data Retrieval and Analysis
    Zhai, ChengXiang
    [J]. SIGIR'17: PROCEEDINGS OF THE 40TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2017, : 1399 - 1401
  • [3] Probabilistic Topic Models
    Blei, David
    Carin, Lawrence
    Dunson, David
    [J]. IEEE SIGNAL PROCESSING MAGAZINE, 2010, 27 (06) : 55 - 65
  • [4] Probabilistic Topic Models
    Blei, David M.
    [J]. COMMUNICATIONS OF THE ACM, 2012, 55 (04) : 77 - 84
  • [5] A Tutorial on Probabilistic Topic Models for Text Data Retrieval and Analysis
    Zhai, ChengXiang
    Geigle, Chase
    [J]. ACM/SIGIR PROCEEDINGS 2018, 2018, : 1395 - 1397
  • [6] ToPS: A Framework to Manipulate Probabilistic Models of Sequence Data
    Kashiwabara, Andre Yoshiaki
    Bonadio, Igor
    Onuchic, Vitor
    Amado, Felipe
    Mathias, Rafael
    Durham, Alan Mitchell
    [J]. PLOS COMPUTATIONAL BIOLOGY, 2013, 9 (10)
  • [7] Probabilistic models for topic detection and tracking
    Walls, F
    Jin, H
    Sista, S
    Schwartz, R
    [J]. ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 521 - 524
  • [8] Probabilistic models for topic detection and tracking
    GTE/BBN Technologies, Cambridge, MA, United States
    [J]. ICASSP IEEE Int Conf Acoust Speech Signal Process Proc, (521-524):
  • [9] Incorporating Probabilistic Knowledge into Topic Models
    Yao, Liang
    Zhang, Yin
    Wei, Baogang
    Qian, Hongze
    Wang, Yibing
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PART II, 2015, 9078 : 586 - 597
  • [10] Recent Advances and Applications of Probabilistic Topic Models
    Wood, Ian
    [J]. BAYESIAN INFERENCE AND MAXIMUM ENTROPY METHODS IN SCIENCE AND ENGINEERING, MAXENT 2013, 2014, 1636 : 124 - 130