Prediction of Various Backchannel Utterances Based on Multimodal Information

Cited by: 0
Authors
Onishi, Toshiki [1 ]
Azuma, Naoki [1 ]
Kinoshita, Shunichi [1 ]
Ishii, Ryo [2 ]
Fukayama, Atsushi [2 ]
Nakamura, Takao [2 ]
Miyata, Akihiro [1 ]
Affiliations
[1] Nihon Univ, Tokyo, Japan
[2] NTT Corp, Yokohama, Kanagawa, Japan
Source
PROCEEDINGS OF THE 23RD ACM INTERNATIONAL CONFERENCE ON INTELLIGENT VIRTUAL AGENTS, IVA 2023 | 2023
Keywords
multimodal interaction; communication; backchannel; TURN-TAKING; JAPANESE; FEATURES; ENGLISH;
DOI
10.1145/3570945.3607298
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
A listener's backchannels are an important part of dialogue: appropriate backchannels help a dialogue proceed smoothly. Backchannels are therefore considered important not only in human-human dialogue but also in dialogue between humans and agents. Although research on dialogue agents that converse naturally and affably has progressed, it remains unclear whether the various types of listener backchannels can be predicted from the speaker's multimodal information. In this paper, we attempt to predict a listener's backchannel types from the speaker's multimodal information in dialogue. First, we construct a dialogue corpus consisting of multimodal information on a speaker's utterances and a listener's backchannels. Second, we build machine learning models that predict a listener's backchannel types from the speaker's multimodal information. The results suggest that our models can predict a listener's various backchannel types on the basis of the speaker's multimodal information.
Pages: 4
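The abstract outlines a two-step approach: build a corpus pairing a speaker's multimodal features with the listener's backchannel labels, then train models that map the former to the latter. Below is a minimal, hypothetical sketch of that prediction step in Python; the feature dimensions, the backchannel label set, the early-fusion scheme, and the random-forest classifier are all illustrative assumptions, not the authors' actual corpus or model.

```python
# Minimal sketch of multimodal backchannel-type prediction.
# Assumptions (not from the paper): speaker multimodal information is
# summarized per utterance as fixed-length acoustic and visual feature
# vectors, and the listener's response is labeled with one of a few
# backchannel types. The classifier and feature sizes are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)

# Synthetic stand-in for a dialogue corpus: 500 speaker utterances,
# each with 20 acoustic and 10 visual features (hypothetical dimensions).
acoustic = rng.normal(size=(500, 20))
visual = rng.normal(size=(500, 10))
X = np.hstack([acoustic, visual])  # early fusion of the two modalities

# Hypothetical labels: 0 = no backchannel, 1 = verbal, 2 = nod, 3 = both.
y = rng.integers(0, 4, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```

In practice, the synthetic arrays would be replaced by per-utterance features extracted from the dialogue corpus described in the paper, and the label set by whatever backchannel types the corpus annotates.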