A Computationally Efficient Algorithm for Learning Topical Collocation Models

Authors
Zhao, Zhendong [1]
Du, Lan [1]
Borschinger, Benjamin [1, 2]
Pate, John K. [1]
Ciaramita, Massimiliano [2]
Steedman, Mark [3]
Johnson, Mark [1]
Affiliations
[1] Macquarie Univ, Dept Comp, N Ryde, NSW, Australia
[2] Google, Zurich, Switzerland
[3] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland
Funding
Australian Research Council
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Most existing topic models make the bag-of-words assumption that words are generated independently, and so ignore potentially useful information about word order. Previous attempts to use collocations (short sequences of adjacent words) in topic models have either relied on a pipeline approach, restricted attention to bigrams, or resulted in models whose inference does not scale to large corpora. This paper studies how to simultaneously learn both collocations and their topic assignments. We present an efficient reformulation of the Adaptor Grammar-based topical collocation model (AG-colloc) (Johnson, 2010), and develop a point-wise sampling algorithm for posterior inference in this new formulation. We further improve the efficiency of the sampling algorithm by exploiting sparsity and parallelising inference. Experimental results derived in text classification, information retrieval and human evaluation tasks across a range of datasets show that this reformulation scales to hundreds of thousands of documents while maintaining the good performance of the AG-colloc model.
Pages: 1460-1469
Number of pages: 10