Deciphering the Silent Participant: On the Use of Audio-Visual Cues for the Classification of Listener Categories in Group Discussions

Cited by: 11
Authors
Oertel, Catharine [1]
Mora, Kenneth A. Funes [2]
Gustafson, Joakim [1]
Odobez, Jean-Marc [2]
Affiliations
[1] KTH Royal Inst Technol, Linstedtsvagen 44, Stockholm, Sweden
[2] Ecole Polytech Fed Lausanne, Idiap Res Inst, CH-1015 Lausanne, Switzerland
Keywords
listener categories; non-verbal cues; eye-gaze
DOI
10.1145/2818346.2820759
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Estimating a silent participant's degree of engagement and his role within a group discussion can be challenging, as there are no speech-related cues available at the given time. Having this information available, however, can provide important insights into the dynamics of the group as a whole. In this paper, we study the classification of listeners into several categories (attentive listener, side participant and bystander). We devised a thin-sliced perception test in which subjects were asked to assess listener roles and engagement levels in 15-second video clips taken from a corpus of group interviews. Results show that humans are usually able to assess silent participant roles. Using these annotations and a set of multimodal low-level features, such as past speaking activity, backchannels (both visual and verbal), and gaze patterns, we identified the features that are able to distinguish between the different listener categories. Moreover, the results show that many of the audiovisual effects observed on listeners in dyadic interactions also hold for multi-party interactions. A preliminary classifier achieves an accuracy of 64%.
Pages: 107-114
Page count: 8