Unsupervised segmentation of meeting configurations and activities using speech activity detection

Authors
Brdiczka, Oliver [1]
Vaufreydaz, Dominique [1]
Maisonnasse, Jerome [1]
Reignier, Patrick [1]
Affiliations
[1] INRIA Rhone Alpes, 655 Av Europe, F-38330 Montbonnot St Martin, France
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the problem of segmenting small group meetings in order to detect different group configurations and activities in an intelligent environment. Our approach takes speech activity detection of the individuals attending a meeting as input. The goal is to separate distinct distributions of speech activity observations corresponding to distinct group configurations and activities. We propose an unsupervised method based on the calculation of the Jeffrey divergence between histograms of speech activity observations. These histograms are generated from adjacent windows of variable size slid from the beginning to the end of a meeting recording. The peaks of the resulting Jeffrey divergence curves are detected using successive robust mean estimation. After a merging and filtering process, the retained peaks are used to select the best model, i.e., the best allocation of speech activity distributions for a given meeting recording. These distinct distributions can be interpreted as distinct segments of group configuration and activity. To evaluate the method, we recorded six small group meetings and measured the correspondence between detected segments and labeled group configurations and activities. The results are promising, particularly because the method is completely unsupervised.
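The core step described in the abstract, comparing histograms from two adjacent sliding windows with the Jeffrey divergence, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the integer encoding of speech activity observations, the function names, the fixed window size, and the particular symmetric form of the divergence (each histogram is compared against the mean of the two). A plain maximum stands in for the robust-mean peak detection, and the merging, filtering, and model selection stages are not shown.

```python
import math
from collections import Counter


def histogram(window, symbols):
    """Normalized histogram of the observations in one window."""
    counts = Counter(window)
    total = len(window)
    return [counts[s] / total for s in symbols]


def jeffrey_divergence(p, q):
    """Symmetric divergence of p and q, each measured against their mean.

    Assumed form: sum_i p_i*log(p_i/m_i) + q_i*log(q_i/m_i), m_i = (p_i+q_i)/2.
    Empty bins contribute nothing, so no division by zero occurs.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        m = (pi + qi) / 2.0
        if pi > 0:
            total += pi * math.log(pi / m)
        if qi > 0:
            total += qi * math.log(qi / m)
    return total


def divergence_curve(observations, window_size):
    """Slide two adjacent windows over the recording and score each boundary."""
    symbols = sorted(set(observations))
    curve = []
    for t in range(window_size, len(observations) - window_size + 1):
        left = histogram(observations[t - window_size:t], symbols)
        right = histogram(observations[t:t + window_size], symbols)
        curve.append((t, jeffrey_divergence(left, right)))
    return curve


# Hypothetical toy recording: each integer encodes which participants speak.
# The speech pattern changes abruptly at index 20.
observations = [0] * 20 + [3] * 20
curve = divergence_curve(observations, window_size=5)

# Stand-in for peak detection: take the single highest point of the curve.
peak_t, peak_value = max(curve, key=lambda item: item[1])
```

On this toy input the curve is zero wherever both windows see the same pattern and rises as the window boundary approaches the change point. Repeating the computation for several window sizes would give the multiple curves that the abstract mentions.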
Pages: 195 / +
Page count: 2