Sequence Modeling with Hierarchical Deep Generative Models with Dual Memory

Cited: 0
Authors
Zheng, Yanan [1 ]
Wen, Lijie [1 ]
Wang, Jianmin [1 ]
Yan, Jun [2 ]
Ji, Lei [2 ]
Affiliations
[1] Tsinghua Univ, Beijing 100084, Peoples R China
[2] Microsoft Res Asia, Dan Ling St, Beijing 100080, Peoples R China
Keywords
Sequence Modeling; Hierarchical Deep Generative Models; Dual Memory Mechanism; Inference and Learning
DOI
10.1145/3132847.3132952
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Deep Generative Models (DGMs) can extract high-level representations from massive unlabeled data and are explainable from a probabilistic perspective. These characteristics make them attractive for sequence modeling tasks. However, modeling sequences with DGMs remains a major challenge. Unlike real-valued data, which can be fed directly into models, sequence data consist of discrete elements and must first be transformed into suitable representations. This leads to two challenges. First, high-level features are sensitive to small variations of the inputs as well as to the way the data are represented. Second, the models are more likely to lose long-term information across multiple transformations. In this paper, we propose a Hierarchical Deep Generative Model with Dual Memory to address these two challenges, and we provide a method to perform inference and learning on the model efficiently. The proposed model extends basic DGMs with an improved, hierarchically organized multi-layer architecture. In addition, it incorporates memories along two directions, denoted broad memory and deep memory. The model is trained end-to-end by optimizing a variational lower bound on the data log-likelihood using an improved stochastic variational method. We perform experiments on several tasks with various datasets and obtain excellent results. On language modeling, our method significantly outperforms the state of the art in terms of generative performance. Extended experiments on document modeling and sentiment analysis demonstrate the effectiveness of the dual memory mechanism and of the learned latent representations. Random text generation gives an intuitive view of the advantages of our model.
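The abstract's training objective — a variational lower bound (ELBO) on the data log-likelihood, estimated stochastically — can be illustrated with a minimal sketch. This is not the paper's hierarchical dual-memory bound; it is a generic single-layer ELBO with a diagonal-Gaussian posterior and a standard-normal prior, where `encode` and `decode_log_prob` are hypothetical placeholders for whatever inference and generative networks a concrete model supplies.

```python
import math
import random

def kl_diag_gaussian(mu, log_var):
    """Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian
    with mean `mu` and log-variance `log_var` (lists of floats)."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def elbo(x, encode, decode_log_prob, n_samples=10, rng=random):
    """Monte Carlo estimate of the variational lower bound:
        ELBO(x) = E_{q(z|x)}[log p(x|z)] - KL(q(z|x) || p(z))

    `encode(x)` returns (mu, log_var) of the approximate posterior;
    `decode_log_prob(x, z)` returns log p(x|z) under the decoder.
    """
    mu, log_var = encode(x)
    recon = 0.0
    for _ in range(n_samples):
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, 1),
        # which keeps the estimator differentiable in autodiff frameworks.
        z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
             for m, lv in zip(mu, log_var)]
        recon += decode_log_prob(x, z)
    recon /= n_samples
    return recon - kl_diag_gaussian(mu, log_var)
```

In a stochastic variational method, a gradient-based optimizer ascends this estimate over minibatches; a hierarchical model such as the one described here would sum one KL term per latent layer instead of the single term above.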
Pages: 1369-1378 (10 pages)