A Spatial-Temporal Graph Model for Pronunciation Feature Prediction of Chinese Poetry

Cited by: 11
Authors
Wang, Qing [1, 2]
Liu, Weiping [1, 2]
Wang, Xiumei [1, 2]
Chen, Xinghong [1, 2]
Chen, Guannan [1, 2]
Wu, Qingxiang [1, 2]
Affiliations
[1] Fujian Normal Univ, Minist Educ, Key Lab Optoelect Sci & Technol Med, Fuzhou 350007, Peoples R China
[2] Fujian Normal Univ, Fujian Prov Key Lab Photon Technol, Fuzhou 350007, Peoples R China
Keywords
Mel frequency cepstral coefficient; Predictive models; Rhythm; Feature extraction; Analytical models; Data models; Computational modeling; AGRU; Chinese poetry; encoder-decoder; graph modeling; Mel frequency cepstral coefficient (MFCC); pronunciation features; spatial-temporal graph model (STGM-MHA); TO-SPEECH SYNTHESIS; DEREVERBERATION; RECOGNITION; TRANSLATION; FILTER
DOI
10.1109/TNNLS.2022.3165554
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the development of artificial intelligence, speech recognition and prediction have become important research domains with wide applications, such as intelligent control, education, individual identification, and emotion analysis. Chinese poetry reading contains rich features of continuous pronunciation, such as mood, emotion, rhythm schemes, lyric reading, and artistic expression. Predicting the pronunciation characteristics of a Chinese poetry reading is therefore significant for demonstrating high-level machine intelligence and has the potential to support a high-level intelligent system for teaching children to read Tang poetry. Mel frequency cepstral coefficients (MFCCs) are currently used to represent important speech features. Because poetry reading is complex and highly nonlinear, however, accurate pronunciation feature prediction faces a tough challenge: how to model complex spatial correlations and temporal dynamics, such as rhyme schemes. Many current methods ignore the spatial and temporal characteristics of the MFCC representation. In addition, these methods are limited in long-term prediction performance. To solve these problems, we propose a novel spatial-temporal graph model (STGM-MHA) based on multihead attention for pronunciation feature prediction of Chinese poetry. The STGM-MHA uses an encoder-decoder structure. The encoder compresses the data into a hidden space representation, while the decoder reconstructs the hidden space representation as output. In the model, a novel gated recurrent unit (GRU) module (AGRU) based on multihead attention is proposed to extract the spatial and temporal features of MFCC data effectively. A comparison of the proposed model with state-of-the-art methods on six datasets reveals the clear advantage of the proposed model.
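The abstract gives no implementation details, so the following is only a minimal NumPy sketch of the general idea it describes: an encoder-decoder in which each recurrent step mixes information across MFCC coefficients with multihead attention before a GRU update. It is not the authors' AGRU or STGM-MHA. Treating each MFCC coefficient as a graph node, the function names (`multihead_self_attention`, `agru_step`, `forecast`), the parameter layout, and the linear readout `w_out` are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def multihead_self_attention(x, w_q, w_k, w_v, n_heads):
    """Scaled dot-product self-attention over the rows (nodes) of x."""
    n, d = x.shape
    d_head = d // n_heads
    # Project, then split the feature axis into heads: (heads, n, d_head).
    split = lambda m: m.reshape(n, n_heads, d_head).transpose(1, 0, 2)
    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, n, n)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    out = weights @ v                                     # (heads, n, d_head)
    return out.transpose(1, 0, 2).reshape(n, d)           # merge heads

def agru_step(x_t, h, p, n_heads):
    """Hypothetical attention + GRU step (assumption, not the paper's AGRU)."""
    # "Spatial" mixing: every node attends to every other node's hidden state.
    h = multihead_self_attention(h, p["w_q"], p["w_k"], p["w_v"], n_heads)
    # "Temporal" update: a standard GRU cell applied to each node.
    xh = np.concatenate([x_t, h], axis=1)
    z = sigmoid(xh @ p["w_z"])                            # update gate
    r = sigmoid(xh @ p["w_r"])                            # reset gate
    h_new = np.tanh(np.concatenate([x_t, r * h], axis=1) @ p["w_h"])
    return (1.0 - z) * h + z * h_new

def init_params(d, rng, scale=0.1):
    """Random, untrained weights; d is the hidden size per node."""
    shapes = {"w_q": (d, d), "w_k": (d, d), "w_v": (d, d),
              "w_z": (1 + d, d), "w_r": (1 + d, d), "w_h": (1 + d, d),
              "w_out": (d, 1)}
    return {name: rng.normal(0.0, scale, shape) for name, shape in shapes.items()}

def forecast(mfcc, horizon, p, n_heads=2):
    """mfcc: (frames, n_mfcc). Returns predicted frames of shape (horizon, n_mfcc)."""
    n_nodes = mfcc.shape[1]
    h = np.zeros((n_nodes, p["w_out"].shape[0]))
    for frame in mfcc:                    # encoder: compress the history into h
        h = agru_step(frame[:, None], h, p, n_heads)
    x_t, preds = mfcc[-1][:, None], []
    for _ in range(horizon):              # decoder: autoregressive rollout
        h = agru_step(x_t, h, p, n_heads)
        x_t = h @ p["w_out"]              # linear readout, one value per node
        preds.append(x_t[:, 0])
    return np.stack(preds)

rng = np.random.default_rng(0)
params = init_params(d=8, rng=rng)
fake_mfcc = rng.normal(size=(20, 13))     # synthetic stand-in for real MFCC frames
prediction = forecast(fake_mfcc, horizon=5, p=params)
print(prediction.shape)
```

Because the weights are untrained and the input is synthetic, the output only demonstrates the data flow and array shapes; a real system would learn the parameters from MFCC sequences extracted from recorded poetry readings.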
Pages: 10294-10308
Page count: 15