A Transformer-Based Hierarchical Variational AutoEncoder Combined Hidden Markov Model for Long Text Generation

Cited by: 8
Authors
Zhao, Kun [1 ]
Ding, Hongwei [1 ]
Ye, Kai [1 ]
Cui, Xiaohui [1 ]
Affiliations
[1] Wuhan Univ, Sch Cyber Sci & Engn, Key Lab Aerosp Informat Secur & Trusted Comp, Minist Educ, Wuhan 430072, Peoples R China
Keywords
Variational AutoEncoder; text generation; Hidden Markov Model; Transformer; latent variables;
DOI
10.3390/e23101277
Chinese Library Classification
O4 [Physics];
Subject Classification Code
0702;
Abstract
The Variational AutoEncoder (VAE) has made significant progress in text generation, but most work has focused on short texts (typically a single sentence). Long texts consist of multiple sentences, and there are particular relationships between those sentences, especially between the latent variables that control their generation. These relationships help in generating coherent, logically connected long texts, yet very few studies have examined them. We propose HT-HVAE, a method that combines a Transformer-based Hierarchical Variational AutoEncoder with a Hidden Markov Model (HMM) to learn multiple hierarchical latent variables and the relationships among them, thereby improving long text generation. We use a hierarchical Transformer encoder to encode long texts and obtain better hierarchical information, while HT-HVAE's generation network uses an HMM to learn the relationships between latent variables. We also propose a method for calculating perplexity under this multiple hierarchical latent-variable structure. Experimental results show that our model is more effective on datasets with strong logical structure, alleviates the notorious posterior-collapse problem, and generates more coherent and logically connected long text.
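To make the described architecture concrete, the sketch below illustrates in PyTorch how an HMM-style prior over sentence-level latent variables can be wired up: a document-level latent z_doc conditions a Markov chain z_1 -> z_2 -> ... over per-sentence latents, each of which would then condition a Transformer decoder that emits one sentence. This is a minimal illustration under assumed Gaussian latents and a learned transition network; all names (LatentChainPrior, doc_dim, sent_dim) are hypothetical and are not taken from the authors' code.

```python
import torch
import torch.nn as nn

class LatentChainPrior(nn.Module):
    """Illustrative sketch (not the authors' implementation) of an
    HMM-style prior over sentence-level latents: a document-level latent
    z_doc conditions a chain z_1 -> z_2 -> ... -> z_T, where each
    transition p(z_t | z_{t-1}, z_doc) is a learned diagonal Gaussian.
    """

    def __init__(self, doc_dim=64, sent_dim=32, hidden=128):
        super().__init__()
        # Transition network: previous sentence latent + document latent
        # -> mean and log-variance of the next sentence latent.
        self.trans = nn.Sequential(
            nn.Linear(sent_dim + doc_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 2 * sent_dim),
        )
        self.sent_dim = sent_dim

    def step(self, z_prev, z_doc):
        """One Markov transition: sample z_t ~ p(z_t | z_{t-1}, z_doc)."""
        stats = self.trans(torch.cat([z_prev, z_doc], dim=-1))
        mu, logvar = stats.chunk(2, dim=-1)
        # Reparameterized sample, as in a standard VAE.
        z_t = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return z_t, mu, logvar

    def sample_chain(self, z_doc, num_sentences):
        """Unroll the chain to obtain one latent per sentence."""
        batch = z_doc.size(0)
        z_prev = torch.zeros(batch, self.sent_dim)  # fixed initial state
        zs = []
        for _ in range(num_sentences):
            z_prev, _, _ = self.step(z_prev, z_doc)
            zs.append(z_prev)
        return torch.stack(zs, dim=1)  # (batch, num_sentences, sent_dim)

# Usage: draw sentence-level latents for a 5-sentence document; each z_t
# would then condition a sentence decoder.
prior = LatentChainPrior()
z_doc = torch.randn(2, 64)
zs = prior.sample_chain(z_doc, num_sentences=5)
print(zs.shape)  # torch.Size([2, 5, 32])
```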
Pages: 18
Related Papers
50 items in total
  • [41] TBNF: A Transformer-based Noise Filtering Method for Chinese Long-form Text Matching
    Gan, Ling
    Hu, Liuhui
    Tan, Xiaodong
    Du, Xinrui
    APPLIED INTELLIGENCE, 2023, 53 (19) : 22313 - 22327
  • [43] Text information extraction based on the second-order hidden Markov model
    [Authors not listed] College of Computer and Communication, Hunan University, Changsha 410082, China
    TIEN TZU HSUEH PAO, 2007, 11: 2226 - 2231
  • [44] A fuzzy synset-based hidden Markov model for automatic text segmentation
    Ha-Thuc, Viet
    Nguyen-Van, Quang-Anh
    Cao, Tru Hoang
    Lawry, Jonathan
SOFT METHODS FOR INTEGRATED UNCERTAINTY MODELLING, 2006: 365+
  • [45] Transformer-Based Seq2Seq Model for Chord Progression Generation
    Li, Shuyu
    Sung, Yunsick
    MATHEMATICS, 2023, 11 (05)
  • [46] DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation
    Hong, Seongmin
    Moon, Seungjae
    Kim, Junsoo
    Lee, Sungjae
    Kim, Minsub
    Lee, Dongsoo
    Kim, Joo-Young
    2022 55TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2022, : 616 - 630
  • [47] Simplification of Arabic text: A hybrid approach integrating machine translation and transformer-based lexical model
    Al-Thanyyan, Suha S.
    Azmi, Aqil M.
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2023, 35 (08)
  • [48] Position-context additive transformer-based model for classifying text data on social media
    Abd-Elaziz, M. M.
    El-Rashidy, Nora
    Abou Elfetouh, Ahmed
    El-Bakry, Hazem M.
SCIENTIFIC REPORTS, 2025, 15 (01)
  • [49] AraCovTexFinder: Leveraging the transformer-based language model for Arabic COVID-19 text identification
    Hossain, Md. Rajib
    Hoque, Mohammed Moshiul
    Siddique, Nazmul
    Dewan, Ali Akber
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133
  • [50] Emo-TTS: Parallel Transformer-based Text-to-Speech Model with Emotional Awareness
    Osman, Mohamed
    5TH INTERNATIONAL CONFERENCE ON COMPUTING AND INFORMATICS (ICCI 2022), 2022, : 169 - 174