Speech-to-Gesture Generation: A Challenge in Deep Learning Approach with Bi-Directional LSTM

Cited by: 24
Authors
Takeuchi, Kenta [1]
Hasegawa, Dai [2]
Shirakawa, Shinichi [3]
Kaneko, Naoshi [2]
Sakuta, Hiroshi [2]
Sumi, Kazuhiko [2]
Affiliations
[1] Aoyama Gakuin Univ, Grad Sch Sci & Engn, Sagamihara, Kanagawa, Japan
[2] Aoyama Gakuin Univ, Coll Sci & Engn, Sagamihara, Kanagawa, Japan
[3] Yokohama Natl Univ, Fac Environm & Informat Sci, Yokohama, Kanagawa, Japan
Keywords
Deep Learning; Gesture Generation; Bi-Directional LSTM; Speech Features
DOI
10.1145/3125739.3132594
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this research, we take a first step toward generating gesture motion data directly from speech features. Such a method can make creating gesture animations for Embodied Conversational Agents much easier. We implemented a Bi-Directional LSTM model that takes phonemic features extracted from speech audio as input and outputs time-series data of bone joint rotations. We assessed the validity of the predicted gesture motion in two ways: by the final loss value of the network, and by an impression evaluation comparing the predicted gestures with the original motion data that accompanied the input audio and with mismatched motion data that accompanied different audio. The results showed that the LSTM model predicted more accurately than a simple RNN model. In the impression evaluation, however, the predicted gestures were rated lower than both the original and the mismatched gestures, although some individual predicted gestures were rated on par with the mismatched ones.
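
The mapping described in the abstract (a sequence of per-frame speech features in, a sequence of joint rotations out, with context from both past and future frames) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the weights are random and untrained, and the feature size, hidden size, rotation count, and all names (`LSTMCell`, `bilstm_regress`, `N_FEAT`, and so on) are hypothetical choices made only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """One LSTM direction: input, forget, and output gates plus a candidate."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight matrix per gate, applied to the concatenation [x_t, h_prev].
        self.W = {g: rand_matrix(n_hidden, n_in + n_hidden, rng) for g in "ifoc"}
        self.b = {g: [0.0] * n_hidden for g in "ifoc"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of the input frame and previous state
        pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
               for g in "ifoc"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["c"]]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, cand)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

    def run(self, xs):
        """Return the hidden state at every time step of the sequence xs."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        out = []
        for x in xs:
            h, c = self.step(x, h, c)
            out.append(h)
        return out


def bilstm_regress(frames, fwd, bwd, W_out):
    """Map speech-feature frames to one rotation vector per frame."""
    h_f = fwd.run(frames)              # left-to-right pass
    h_b = bwd.run(frames[::-1])[::-1]  # right-to-left pass, re-aligned in time
    # A linear read-out over both directions gives the per-frame rotations.
    return [matvec(W_out, a + b) for a, b in zip(h_f, h_b)]


# Hypothetical sizes: 26 features per frame, 8 hidden units, 6 rotation values.
N_FEAT, N_HIDDEN, N_ROT, T = 26, 8, 6, 5
rng = random.Random(0)
frames = [[rng.gauss(0, 1) for _ in range(N_FEAT)] for _ in range(T)]
fwd = LSTMCell(N_FEAT, N_HIDDEN, rng)
bwd = LSTMCell(N_FEAT, N_HIDDEN, rng)
W_out = rand_matrix(N_ROT, 2 * N_HIDDEN, rng)
rotations = bilstm_regress(frames, fwd, bwd, W_out)
print(len(rotations), len(rotations[0]))  # one rotation vector per input frame
```

Because the backward pass reads the sequence from the end, the rotation predicted for the first frame depends on later frames as well, which is the property that distinguishes a bi-directional model from a simple forward RNN.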
Pages: 365-369
Page count: 5