A Spatial-Temporal Graph Model for Pronunciation Feature Prediction of Chinese Poetry

Cited by: 13
Authors
Wang, Qing [1 ,2 ]
Liu, Weiping [1 ,2 ]
Wang, Xiumei [1 ,2 ]
Chen, Xinghong [1 ,2 ]
Chen, Guannan [1 ,2 ]
Wu, Qingxiang [1 ,2 ]
Affiliations
[1] Fujian Normal Univ, Minist Educ, Key Lab Optoelect Sci & Technol Med, Fuzhou 350007, Peoples R China
[2] Fujian Normal Univ, Fujian Prov Key Lab Photon Technol, Fuzhou 350007, Peoples R China
Keywords
Mel frequency cepstral coefficient; Predictive models; Rhythm; Feature extraction; Analytical models; Data models; Computational modeling; AGRU; Chinese poetry; encoder-decoder; graph modeling; Mel frequency cepstral coefficient (MFCC); pronunciation features; spatial-temporal graph model (STGM-MHA); TO-SPEECH SYNTHESIS; DEREVERBERATION; RECOGNITION; TRANSLATION; FILTER;
DOI
10.1109/TNNLS.2022.3165554
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the development of artificial intelligence, speech recognition and prediction have become important research domains with wide applications, such as intelligent control, education, individual identification, and emotion analysis. Chinese poetry reading contains rich features of continuous pronunciation, such as mood, emotion, rhythm schemes, lyric reading, and artistic expression. Therefore, predicting the pronunciation characteristics of a Chinese poetry reading is significant for the presentation of high-level machine intelligence and has the potential to create a high-level intelligent system for teaching children to read Tang poetry. The Mel frequency cepstral coefficient (MFCC) is currently used to represent important speech features. Due to the complexity and high degree of nonlinearity in poetry reading, however, accurate pronunciation feature prediction faces a tough challenge: how to model complex spatial correlations and time dynamics, such as rhyme schemes. Many current methods ignore the spatial and temporal characteristics in the MFCC representation. In addition, these methods are limited in their long-term prediction performance. To solve these problems, we propose a novel spatial-temporal graph model (STGM-MHA) based on multihead attention for pronunciation feature prediction of Chinese poetry. The STGM-MHA is designed using an encoder-decoder structure. The encoder compresses the data into a hidden space representation, while the decoder reconstructs the hidden space representation as output. In the model, a novel gated recurrent unit (GRU) module (AGRU) based on multihead attention is proposed to extract the spatial and temporal features of MFCC data effectively. A comparison of the proposed model with state-of-the-art methods on six datasets reveals the clear advantage of the proposed model.
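The abstract names the idea of a GRU whose input is first mixed by multihead attention, but gives no equations. The following is only a minimal NumPy sketch of that general idea, not the authors' AGRU or STGM-MHA: it assumes each MFCC coefficient is treated as a graph node with a small feature vector, applies standard multihead self-attention across nodes (the "spatial" part), and feeds the result to a standard GRU update over frames (the "temporal" part). All names (`multihead_attention`, `attention_gru_step`, `encode`, `init_params`), the shapes, and the random toy data are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def multihead_attention(x, wq, wk, wv, wo, num_heads):
    """Self-attention across the N nodes of one frame; x has shape (N, D)."""
    n, d = x.shape
    dh = d // num_heads  # assumes D is divisible by num_heads
    # Project, then split into heads: (heads, N, dh)
    q = (x @ wq).reshape(n, num_heads, dh).transpose(1, 0, 2)
    k = (x @ wk).reshape(n, num_heads, dh).transpose(1, 0, 2)
    v = (x @ wv).reshape(n, num_heads, dh).transpose(1, 0, 2)
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))  # (heads, N, N)
    merged = (weights @ v).transpose(1, 0, 2).reshape(n, d)    # concatenate heads
    return merged @ wo

def attention_gru_step(x_t, h_prev, p, num_heads=2):
    """One standard GRU update whose input is the attention-mixed frame."""
    a_t = multihead_attention(x_t, p["wq"], p["wk"], p["wv"], p["wo"], num_heads)
    xh = np.concatenate([a_t, h_prev], axis=-1)
    z = sigmoid(xh @ p["wz"])  # update gate
    r = sigmoid(xh @ p["wr"])  # reset gate
    cand = np.tanh(np.concatenate([a_t, r * h_prev], axis=-1) @ p["wc"])
    return (1.0 - z) * h_prev + z * cand

def init_params(d, rng):
    """Random toy weights (hypothetical; a real model would learn these)."""
    p = {name: rng.normal(0.0, 0.1, (d, d)) for name in ("wq", "wk", "wv", "wo")}
    p.update({name: rng.normal(0.0, 0.1, (2 * d, d)) for name in ("wz", "wr", "wc")})
    return p

def encode(frames, p, num_heads=2):
    """Run the recurrent step over frames of shape (T, N, D); return the final state."""
    _, n, d = frames.shape
    h = np.zeros((n, d))
    for x_t in frames:
        h = attention_gru_step(x_t, h, p, num_heads)
    return h

rng = np.random.default_rng(0)
frames = rng.normal(size=(20, 13, 4))  # toy data: 20 frames, 13 nodes, 4 features each
h = encode(frames, init_params(4, rng))
print(h.shape)  # (13, 4)
```

In an encoder-decoder arrangement like the one the abstract describes, a final state such as `h` would play the role of the hidden space representation that a decoder expands into predicted frames; the decoder is omitted here.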
Pages: 10294-10308
Page count: 15