Sequence Modeling with Hierarchical Deep Generative Models with Dual Memory

Cited by: 0
Authors
Zheng, Yanan [1 ]
Wen, Lijie [1 ]
Wang, Jianmin [1 ]
Yan, Jun [2 ]
Ji, Lei [2 ]
Affiliations
[1] Tsinghua Univ, Beijing 100084, Peoples R China
[2] Microsoft Res Asia, Dan Ling St, Beijing 100080, Peoples R China
Keywords
Sequence Modeling; Hierarchical Deep Generative Models; Dual Memory Mechanism; Inference and Learning
DOI
10.1145/3132847.3132952
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Deep Generative Models (DGMs) can extract high-level representations from massive unlabeled data and are explainable from a probabilistic perspective. These characteristics make them attractive for sequence modeling tasks. However, modeling sequences with DGMs remains a major challenge. Unlike real-valued data, which can be fed into models directly, sequence data consist of discrete elements and must first be transformed into suitable representations. This raises two challenges. First, high-level features are sensitive to small variations of the inputs as well as to the way the data are represented. Second, the models are prone to losing long-term information over multiple transformations. In this paper, we propose a Hierarchical Deep Generative Model with Dual Memory to address these two challenges, and we provide a method for performing inference and learning on the model efficiently. The proposed model extends basic DGMs with an improved, hierarchically organized multi-layer architecture, and it incorporates memories along two directions, denoted broad memory and deep memory. The model is trained end-to-end by optimizing a variational lower bound on the data log-likelihood using an improved stochastic variational method. We evaluate the model on several tasks with various datasets and obtain excellent results. On language modeling, our method significantly outperforms state-of-the-art results in terms of generative performance. Extended experiments on document modeling and sentiment analysis demonstrate the effectiveness of the dual memory mechanism and the latent representations. Random text generation gives an intuitive view of the model's advantages.
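The abstract states that the model is trained end-to-end by maximizing a variational lower bound (ELBO) on the data log-likelihood. As a point of reference, the sketch below computes the standard ELBO for a simple one-layer model with a diagonal-Gaussian posterior and a Bernoulli decoder; it is a generic illustration only, not the paper's hierarchical dual-memory objective, and all names (`elbo`, `x_recon_logits`) are hypothetical.

```python
import numpy as np

def elbo(x, mu, logvar, x_recon_logits):
    """Evidence lower bound for a Gaussian-latent, Bernoulli-output model.

    ELBO(x) = E_q[log p(x|z)] - KL(q(z|x) || p(z)),
    with q(z|x) = N(mu, diag(exp(logvar))) and p(z) = N(0, I).
    """
    # Reconstruction term: Bernoulli log-likelihood of x under decoder logits.
    probs = 1.0 / (1.0 + np.exp(-x_recon_logits))
    recon = np.sum(x * np.log(probs + 1e-9)
                   + (1.0 - x) * np.log(1.0 - probs + 1e-9))
    # KL term: closed form for two diagonal Gaussians.
    kl = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))
    return recon - kl
```

A confident, correct reconstruction with a posterior matching the prior (so the KL term is zero) yields an ELBO close to zero, while a wrong reconstruction or a posterior far from the prior pushes the bound down; training maximizes this quantity (here by stochastic variational methods, as the abstract describes for the full model).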
Pages: 1369-1378
Page count: 10
Related Papers
50 results
  • [1] An Architecture for Deep, Hierarchical Generative Models
    Bachman, Philip
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [2] Protein sequence design with deep generative models
    Wu, Zachary
    Johnston, Kadina E.
    Arnold, Frances H.
    Yang, Kevin K.
    [J]. CURRENT OPINION IN CHEMICAL BIOLOGY, 2021, 65 : 18 - 27
  • [3] Learning Hierarchical Features from Deep Generative Models
    Zhao, Shengjia
    Song, Jiaming
    Ermon, Stefano
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [4] Affibody Sequence Design Using Deep Learning Generative Models
    Wang, Zirui
    Mardikoraem, Mehrsa
    Woldring, Daniel
    [J]. PROTEIN SCIENCE, 2023, 32
  • [5] Neurosymbolic Deep Generative Models for Sequence Data with Relational Constraints
    Young, Halley
    Du, Maxwell
    Bastani, Osbert
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [6] Online Anomalous Trajectory Detection with Deep Generative Sequence Modeling
    Liu, Yiding
    Zhao, Kaiqi
    Cong, Gao
    Bao, Zhifeng
    [J]. 2020 IEEE 36TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2020), 2020, : 949 - 960
  • [7] Explainable Anatomical Shape Analysis Through Deep Hierarchical Generative Models
    Biffi, Carlo
    Cerrolaza, Juan J.
    Tarroni, Giacomo
    Bai, Wenjia
    de Marvao, Antonio
    Oktay, Ozan
    Ledig, Christian
    Le Folgoc, Loic
    Kamnitsas, Konstantinos
    Doumou, Georgia
    Duan, Jinming
    Prasad, Sanjay K.
    Cook, Stuart A.
    O'Regan, Declan P.
    Rueckert, Daniel
    [J]. IEEE TRANSACTIONS ON MEDICAL IMAGING, 2020, 39 (06) : 2088 - 2099
  • [8] Hierarchical Deep Generative Models for Multi-Rate Multivariate Time Series
    Che, Zhengping
    Purushotham, Sanjay
    Li, Guangyu
    Jiang, Bo
    Liu, Yan
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [9] Generative models for protein sequence modeling: recent advances and future directions
    Mardikoraem, Mehrsa
    Wang, Zirui
    Pascual, Nathaniel
    Woldring, Daniel
    [J]. BRIEFINGS IN BIOINFORMATICS, 2023, 24 (06)
  • [10] GENERATIVE MODELING OF TEMPORAL SIGNAL FEATURES USING HIERARCHICAL PROBABILISTIC GRAPHICAL MODELS
    Gang, Ren
    Bocko, Gregory
    Lundberg, Justin
    Headlam, Dave
    Bocko, Mark F.
    [J]. 2011 IEEE DIGITAL SIGNAL PROCESSING WORKSHOP AND IEEE SIGNAL PROCESSING EDUCATION WORKSHOP (DSP/SPE), 2011, : 307 - 312