Automatic Pitch Accent Detection Using Long Short-Term Memory Neural Networks

Cited by: 2
Authors
Wu, Yizhi [1]
Li, Sha [1]
Li, Hongyan [1]
Affiliations
[1] Donghua Univ, Coll Informat Sci & Technol, 2999 Renmin Rd North, Shanghai, Peoples R China
Keywords
Pitch accent detection; LSTM; lexical and syntactic features; acoustic features;
DOI
10.1145/3364908.3365291
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Prosody detection is gaining increasing popularity in prosody research because of its significance for Text to Sound, computer-aided pronunciation training (CAPT), and related applications. Pitch accent is an important part of prosody, and many recognition models, both static and dynamic, have been investigated for labeling it automatically. Recently, artificial neural networks, especially recurrent neural networks (RNNs), have been applied to pitch accent detection. However, traditional recurrent neural networks are unable to learn and remember over long sequences because of back-propagated error decay. To solve this problem, this paper investigates the use of Long Short-Term Memory (LSTM) neural networks for automatic pitch accent detection. The paper encodes lexical and syntactic features as binary variables and uses syllable-based acoustic features, including syllable duration, syllable energy, and features related to the fundamental frequency. The experimental results show that LSTM-RNNs achieve a pitch accent detection accuracy of 89.0%, compared with about 83.2% for classical detection methods.
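The feature encoding and recurrence described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tag set `POS_TAGS`, the syllable dictionary fields (`pos`, `stressed`, `duration`, `energy`, `f0_mean`, `f0_range`), the hidden size, and the random untrained weights are all assumptions made for illustration, and the acoustic values are assumed to be pre-normalized. The sketch shows how binary lexical and syntactic variables can be concatenated with syllable-level acoustic values and passed through one LSTM step per syllable.

```python
import math
import random

# Hypothetical part-of-speech tag set; the record does not list the actual features.
POS_TAGS = ["NOUN", "VERB", "ADJ", "ADV", "FUNC"]
GATES = ("input", "forget", "output", "cell")


def one_hot(value, vocabulary):
    """Encode a categorical value as a list of binary variables."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


def syllable_features(syl):
    """Concatenate binary lexical/syntactic features with acoustic features."""
    binary = one_hot(syl["pos"], POS_TAGS) + [1.0 if syl["stressed"] else 0.0]
    # Acoustic values are assumed to be normalized beforehand.
    acoustic = [syl["duration"], syl["energy"], syl["f0_mean"], syl["f0_range"]]
    return binary + acoustic


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(n_in, n_hidden, seed=0):
    """Random, untrained weights: one row per hidden unit, last entry is the bias."""
    rng = random.Random(seed)
    width = n_in + n_hidden + 1
    return {
        name: [[rng.uniform(-0.1, 0.1) for _ in range(width)] for _ in range(n_hidden)]
        for name in GATES
    }


def lstm_step(x, h, c, weights):
    """One standard LSTM time step over input x, hidden state h, cell state c."""
    z = x + h + [1.0]  # input, previous hidden state, constant for the bias

    def gate(name, activation):
        return [activation(sum(w * v for w, v in zip(row, z))) for row in weights[name]]

    i = gate("input", sigmoid)
    f = gate("forget", sigmoid)
    o = gate("output", sigmoid)
    g = gate("cell", math.tanh)
    # The additive cell update is what lets information persist over long sequences.
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def encode_utterance(syllables, n_hidden=4):
    """Run the LSTM over a syllable sequence and return one hidden state per syllable."""
    weights = init_weights(len(syllable_features(syllables[0])), n_hidden)
    h, c = [0.0] * n_hidden, [0.0] * n_hidden
    states = []
    for syl in syllables:
        h, c = lstm_step(syllable_features(syl), h, c, weights)
        states.append(h)
    return states


# Made-up example utterance with two syllables.
utterance = [
    {"pos": "NOUN", "stressed": True, "duration": 0.8, "energy": 0.6,
     "f0_mean": 0.4, "f0_range": 0.7},
    {"pos": "FUNC", "stressed": False, "duration": 0.2, "energy": 0.3,
     "f0_mean": -0.1, "f0_range": 0.1},
]
hidden_states = encode_utterance(utterance)
```

In a full detector, each per-syllable hidden state would feed a trained output layer that predicts accented versus unaccented; that layer and the training procedure are omitted here.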
Pages: 41-45
Number of pages: 5