Prediction of Who Will Be the Next Speaker and When Using Gaze Behavior in Multiparty Meetings

Cited by: 40

Authors:

Ishii, Ryo [1,2]

Otsuka, Kazuhiro [1,2]

Kumano, Shiro [1,2]

Yamato, Junji [1,2]

Affiliations:

[1] NTT Corp, Tokyo, Japan

[2] 3-1 Morinosato Wakamiya, Atsugi, Kanagawa 2430198, Japan

Source:

ACM TRANSACTIONS ON INTERACTIVE INTELLIGENT SYSTEMS | 2016 / Volume 6 / Issue 01

Keywords:

Turn-changing; multiparty meetings; gaze behavior; next speaker prediction; speech timing prediction;

DOI:

10.1145/2757284

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In multiparty meetings, participants need to predict the end of the speaker's utterance and who will start speaking next, as well as consider a strategy for good timing to speak next. Gaze behavior plays an important role in smooth turn-changing. This article proposes a prediction model that features three processing steps to predict (I) whether turn-changing or turn-keeping will occur, (II) who will be the next speaker in turn-changing, and (III) the timing of the start of the next speaker's utterance. For the feature values of the model, we focused on gaze transition patterns and the timing structure of eye contact between a speaker and a listener near the end of the speaker's utterance. Gaze transition patterns provide information about the order in which gaze behavior changes. The timing structure of eye contact is defined as who looks at whom and who looks away first, the speaker or listener, when eye contact between the speaker and a listener occurs. We collected corpus data of multiparty meetings, using the data to demonstrate relationships between gaze transition patterns and timing structure and situations (I), (II), and (III). The results of our analyses indicate that the gaze transition pattern of the speaker and listener and the timing structure of eye contact have a strong association with turn-changing, the next speaker in turn-changing, and the start time of the next utterance. On the basis of the results, we constructed prediction models using the gaze transition patterns and timing structure. The gaze transition patterns were found to be useful in predicting turn-changing, the next speaker in turn-changing, and the start time of the next utterance. Contrary to expectations, we did not find that the timing structure is useful for predicting the next speaker and the start time. This study opens up new possibilities for predicting the next speaker and the timing of the next utterance using gaze transition patterns in multiparty meetings.

Pages: 31

18 entries in total

[1] Using Respiration to Predict Who Will Speak Next and When in Multiparty Meetings
Ishii, Ryo
Otsuka, Kazuhiro
Kumano, Shiro
Yamato, Junji
ACM TRANSACTIONS ON INTERACTIVE INTELLIGENT SYSTEMS, 2016, 6 (02)
[2] Who Speaks Next? Turn Change and Next Speaker Prediction in Multimodal Multiparty Interaction
Malik, Usman
Saunier, Julien
Funakoshi, Kotaro
Pauchet, Alexandre
2020 IEEE 32ND INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2020 : 349 - 354
[3] Predicting Who Will Be the Next Speaker and When in Multi-party Meetings
Ishii, Ryo
Otsuka, Kazuhiro
Kumano, Shiro
Yamato, Junji
NTT Technical Review, 2015, 13 (07)
[4] Multimodal Fusion using Respiration and Gaze for Predicting Next Speaker in Multi-Party Meetings
Ishii, Ryo
Kumano, Shiro
Otsuka, Kazuhiro
ICMI'15: PROCEEDINGS OF THE 2015 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2015 : 99 - 106
[5] Prediction of Who Will Be Next Speaker and When Using Mouth-Opening Pattern in Multi-Party Conversation
Ishii, Ryo
Otsuka, Kazuhiro
Kumano, Shiro
Higashinaka, Ryuichiro
Tomita, Junji
MULTIMODAL TECHNOLOGIES AND INTERACTION, 2019, 3 (04)
[6] Predicting Next Speaker and Timing from Gaze Transition Patterns in Multi-Party Meetings
Ishii, Ryo
Otsuka, Kazuhiro
Kumano, Shiro
Matsuda, Masafumi
Yamato, Junji
ICMI'13: PROCEEDINGS OF THE 2013 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2013 : 79 - 86
[7] Speakership, recipiency and the interactional space: Cases of "Next-speaker self-selects" in multiparty university student meetings
Chen, Qi
Brandt, Adam
JOURNAL OF PRAGMATICS, 2021, 180 : 54 - 71
[8] ANALYSIS AND MODELING OF NEXT SPEAKING START TIMING BASED ON GAZE BEHAVIOR IN MULTI-PARTY MEETINGS
Ishii, Ryo
Otsuka, Kazuhiro
Kumano, Shiro
Yamato, Junji
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
[9] Gaze behavior when approaching an intersection: Dwell time distribution and comparison with a quantitative prediction
Lemonnier, Sophie
Bremond, Roland
Baccino, Thierry
TRANSPORTATION RESEARCH PART F-TRAFFIC PSYCHOLOGY AND BEHAVIOUR, 2015, 35 : 60 - 74
[10] Prediction of Next-Utterance Timing using Head Movement in Multi-Party Meetings
Ishii, Ryo
Kumano, Shiro
Otsuka, Kazuhiro
PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON HUMAN AGENT INTERACTION (HAI'17), 2017 : 181 - 187