Prediction of Who Will Be the Next Speaker and When Using Gaze Behavior in Multiparty Meetings

Cited by: 40
Authors
Ishii, Ryo [1, 2]
Otsuka, Kazuhiro [1, 2]
Kumano, Shiro [1, 2]
Yamato, Junji [1, 2]
Affiliations
[1] NTT Corp, Tokyo, Japan
[2] 3-1 Morinosato Wakamiya, Atsugi, Kanagawa 2430198, Japan
Keywords
Turn-changing; multiparty meetings; gaze behavior; next speaker prediction; speech timing prediction
DOI
10.1145/2757284
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In multiparty meetings, participants need to predict the end of the speaker's utterance and who will start speaking next, as well as consider a strategy for good timing to speak next. Gaze behavior plays an important role in smooth turn-changing. This article proposes a prediction model that features three processing steps to predict (I) whether turn-changing or turn-keeping will occur, (II) who will be the next speaker in turn-changing, and (III) the timing of the start of the next speaker's utterance. For the feature values of the model, we focused on gaze transition patterns and the timing structure of eye contact between a speaker and a listener near the end of the speaker's utterance. Gaze transition patterns provide information about the order in which gaze behavior changes. The timing structure of eye contact is defined as who looks at whom and who looks away first, the speaker or listener, when eye contact between the speaker and a listener occurs. We collected corpus data of multiparty meetings, using the data to demonstrate relationships between gaze transition patterns and timing structure and situations (I), (II), and (III). The results of our analyses indicate that the gaze transition pattern of the speaker and listener and the timing structure of eye contact have a strong association with turn-changing, the next speaker in turn-changing, and the start time of the next utterance. On the basis of the results, we constructed prediction models using the gaze transition patterns and timing structure. The gaze transition patterns were found to be useful in predicting turn-changing, the next speaker in turn-changing, and the start time of the next utterance. Contrary to expectations, we did not find that the timing structure is useful for predicting the next speaker and the start time. This study opens up new possibilities for predicting the next speaker and the timing of the next utterance using gaze transition patterns in multiparty meetings.
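As a rough illustration of the idea in the abstract, and not the authors' implementation, the sketch below shows one way a gaze transition pattern could be encoded and used for step (I), turn-changing versus turn-keeping. The label scheme (`L1`, `L2` for listeners, `X` for a non-person target), the helper `gaze_transition_pattern`, the class `PatternFrequencyModel`, and the toy data are all assumptions made for this example. A simple frequency lookup stands in for whatever classifier the article actually uses.

```python
from collections import Counter, defaultdict
from itertools import groupby


def gaze_transition_pattern(targets):
    """Collapse consecutive repeats so only the order of gaze changes remains.

    `targets` is a per-frame list of gaze-target labels (hypothetical scheme).
    """
    return "-".join(label for label, _ in groupby(targets))


class PatternFrequencyModel:
    """Hypothetical baseline: predict the outcome most often seen with a pattern."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # pattern -> outcome counts
        self.overall = Counter()            # outcome counts over all samples

    def fit(self, patterns, outcomes):
        for pattern, outcome in zip(patterns, outcomes):
            self.counts[pattern][outcome] += 1
            self.overall[outcome] += 1
        return self

    def predict(self, pattern):
        # Fall back to the overall outcome counts for patterns never seen in training.
        table = self.counts.get(pattern) or self.overall
        return table.most_common(1)[0][0]


# Toy, made-up data: the speaker's gaze targets near the end of an utterance.
speaker_gaze = ["L1", "L1", "X", "X", "L2", "L2"]
pattern = gaze_transition_pattern(speaker_gaze)

# Made-up training samples pairing a pattern with a turn outcome.
model = PatternFrequencyModel().fit(
    ["L1-X-L2", "L1-X-L2", "X", "X-L1"],
    ["change", "change", "keep", "keep"],
)
print(pattern, model.predict(pattern))
```

The same lookup structure could in principle be reused for step (II) by using next-speaker labels as outcomes. Step (III) predicts a start time, so it would need a regression model instead, which this sketch does not cover.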
Pages: 31
Related papers
18 in total
  • [1] Using Respiration to Predict Who Will Speak Next and When in Multiparty Meetings
    Ishii, Ryo
    Otsuka, Kazuhiro
    Kumano, Shiro
    Yamato, Junji
    ACM TRANSACTIONS ON INTERACTIVE INTELLIGENT SYSTEMS, 2016, 6 (02)
  • [2] Who Speaks Next? Turn Change and Next Speaker Prediction in Multimodal Multiparty Interaction
    Malik, Usman
    Saunier, Julien
    Funakoshi, Kotaro
    Pauchet, Alexandre
    2020 IEEE 32ND INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2020, : 349 - 354
  • [3] Predicting Who Will Be the Next Speaker and When in Multi-party Meetings
    Ishii, Ryo
    Otsuka, Kazuhiro
    Kumano, Shiro
    Yamato, Junji
    NTT Technical Review, 2015, 13 (07):
  • [4] Multimodal Fusion using Respiration and Gaze for Predicting Next Speaker in Multi-Party Meetings
    Ishii, Ryo
    Kumano, Shiro
    Otsuka, Kazuhiro
    ICMI'15: PROCEEDINGS OF THE 2015 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2015, : 99 - 106
  • [5] Prediction of Who Will Be Next Speaker and When Using Mouth-Opening Pattern in Multi-Party Conversation
    Ishii, Ryo
    Otsuka, Kazuhiro
    Kumano, Shiro
    Higashinaka, Ryuichiro
    Tomita, Junji
    MULTIMODAL TECHNOLOGIES AND INTERACTION, 2019, 3 (04)
  • [6] Predicting Next Speaker and Timing from Gaze Transition Patterns in Multi-Party Meetings
    Ishii, Ryo
    Otsuka, Kazuhiro
    Kumano, Shiro
    Matsuda, Masafumi
    Yamato, Junji
    ICMI'13: PROCEEDINGS OF THE 2013 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2013, : 79 - 86
  • [7] Speakership, recipiency and the interactional space: Cases of "Next-speaker self-selects" in multiparty university student meetings
    Chen, Qi
    Brandt, Adam
    JOURNAL OF PRAGMATICS, 2021, 180 : 54 - 71
  • [8] ANALYSIS AND MODELING OF NEXT SPEAKING START TIMING BASED ON GAZE BEHAVIOR IN MULTI-PARTY MEETINGS
    Ishii, Ryo
    Otsuka, Kazuhiro
    Kumano, Shiro
    Yamato, Junji
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [9] Gaze behavior when approaching an intersection: Dwell time distribution and comparison with a quantitative prediction
    Lemonnier, Sophie
    Bremond, Roland
    Baccino, Thierry
    TRANSPORTATION RESEARCH PART F-TRAFFIC PSYCHOLOGY AND BEHAVIOUR, 2015, 35 : 60 - 74
  • [10] Prediction of Next-Utterance Timing using Head Movement in Multi-Party Meetings
    Ishii, Ryo
    Kumano, Shiro
    Otsuka, Kazuhiro
    PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON HUMAN AGENT INTERACTION (HAI'17), 2017, : 181 - 187