Momentum Online LDA for Large-scale Datasets

Cited by: 2
Authors
Ouyang, Jihong [1]
Lu, You [1]
Li, Ximing [1]
Affiliation
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun, Peoples R China
Keywords
TERM;
DOI
10.3233/978-1-61499-419-0-1075
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modeling large-scale document collections is a significant direction in machine learning research. Online LDA uses stochastic gradient optimization to speed up convergence; however, the high noise of the stochastic gradients slows convergence and degrades performance. In this paper, we employ a momentum term to smooth out the noise of the stochastic gradients and propose an extension of Online LDA, namely Momentum Online LDA (MOLDA). We collect a large-scale corpus of 2M documents to evaluate our model. Experimental results indicate that MOLDA achieves faster convergence and better performance than the state of the art.
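The abstract does not state MOLDA's update rule, so the following is only a minimal sketch of the general idea of momentum smoothing, not the authors' algorithm. It assumes the standard Online LDA form, where a parameter vector moves toward a noisy mini-batch estimate with a decaying step size rho_t = (tau0 + t)^(-kappa), and adds a classical momentum buffer on top. The names `momentum_step`, `learning_rate`, `run`, the coefficient `mu`, and the toy target vector are all hypothetical and chosen for illustration.

```python
import random


def learning_rate(t, tau0=1.0, kappa=0.7):
    """Decaying step size rho_t = (tau0 + t) ** -kappa (Online LDA style)."""
    return (tau0 + t) ** (-kappa)


def momentum_step(lam, lam_hat, velocity, rho, mu=0.5):
    """One momentum-smoothed update of a parameter vector.

    lam      : current estimate (list of floats)
    lam_hat  : noisy estimate computed from the current mini-batch
    velocity : momentum buffer, same length as lam
    rho      : step size for this iteration
    mu       : momentum coefficient (assumed value, not from the paper)

    With mu = 0 this reduces to the plain stochastic update
    lam <- (1 - rho) * lam + rho * lam_hat.
    """
    new_velocity = [mu * v + rho * (h - l)
                    for v, h, l in zip(velocity, lam_hat, lam)]
    new_lam = [l + v for l, v in zip(lam, new_velocity)]
    return new_lam, new_velocity


def run(steps=2000, mu=0.5, seed=0):
    """Toy experiment: track a fixed target through noisy mini-batch estimates."""
    rng = random.Random(seed)
    target = [5.0, 1.0, 0.5]          # stand-in for the "true" parameters
    lam = [1.0, 1.0, 1.0]
    velocity = [0.0, 0.0, 0.0]
    for t in range(steps):
        # Each mini-batch yields the target plus Gaussian noise.
        lam_hat = [x + rng.gauss(0.0, 0.2) for x in target]
        lam, velocity = momentum_step(lam, lam_hat, velocity,
                                      learning_rate(t), mu)
    return lam


if __name__ == "__main__":
    print(run())
```

The velocity buffer is an exponentially weighted sum of past noisy steps, so a single noisy mini-batch has less influence on the direction of any one update; how MOLDA itself weights and applies the momentum term is described in the paper, not here.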
Pages: 1075-1076
Page count: 2