Conversation scene analysis with dynamic Bayesian network based on visual head tracking

Cited by: 19
Authors
Otsuka, Kazuhiro
Yamato, Junji
Takemae, Yoshinao [2]
Murase, Hiroshi [1]
Affiliations
[1] Nagoya Univ, Nagoya, Aichi, Japan
[2] NTT Cyber Solut Labs, Nagoya, Aichi, Japan
DOI
10.1109/ICME.2006.262677
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This paper proposes a novel probabilistic method for conversation scene analysis that infers conversation structure from video sequences of face-to-face communication. Conversation structure represents the type of conversation, such as monologue or dialogue, and can indicate who is talking to, or listening to, whom. The study assumes that participants' gaze directions provide cues for discerning the conversation structure and that they can be identified from head directions. To measure head directions, the method employs a visual head tracker based on Sparse-Template Condensation. The conversation model is built on a dynamic Bayesian network and is used to estimate the conversation structure and gaze directions from observed head directions and utterances. Visual tracking is conventionally thought to be less reliable than contact sensors, but experiments confirm that the proposed method performs almost comparably to a conventional sensor-based method in estimating gaze directions and conversation structure.
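The inference idea in the abstract, estimating a hidden conversation structure over time from observed utterances and head directions, can be illustrated with a toy forward filter. This is a minimal sketch only, not the paper's model: it collapses the dynamic Bayesian network into a simple hidden Markov model, and the regime names, the probability tables, the third participant C, and the function `forward_filter` are all hypothetical assumptions made for illustration.

```python
# Toy forward filter over hidden conversation regimes.
# Everything here (regimes, probabilities, names) is an illustrative
# assumption; none of it is taken from the paper.

REGIMES = ["monologue_A", "monologue_B", "dialogue_AB"]

# Assumed P(regime_t | regime_{t-1}): regimes tend to persist.
TRANSITION = {
    prev: {cur: (0.90 if cur == prev else 0.05) for cur in REGIMES}
    for prev in REGIMES
}

# Assumed P(observed speaker | regime).
UTTERANCE_LIK = {
    "monologue_A": {"A": 0.80, "B": 0.10, "none": 0.10},
    "monologue_B": {"A": 0.10, "B": 0.80, "none": 0.10},
    "dialogue_AB": {"A": 0.45, "B": 0.45, "none": 0.10},
}

# Assumed P(person that listener C's head faces | regime).
HEAD_LIK = {
    "monologue_A": {"A": 0.8, "B": 0.2},
    "monologue_B": {"A": 0.2, "B": 0.8},
    "dialogue_AB": {"A": 0.5, "B": 0.5},
}


def forward_filter(observations, prior=None):
    """Return the filtered belief over regimes after each observation.

    Each observation is a (speaker, head_target) pair, where speaker is
    "A", "B", or "none" and head_target is "A" or "B".
    """
    belief = prior or {r: 1.0 / len(REGIMES) for r in REGIMES}
    history = []
    for speaker, head_target in observations:
        # Predict: push the previous belief through the transition model.
        predicted = {
            cur: sum(belief[prev] * TRANSITION[prev][cur] for prev in REGIMES)
            for cur in REGIMES
        }
        # Update: weight by the likelihood of both observations.
        unnormalized = {
            r: predicted[r] * UTTERANCE_LIK[r][speaker] * HEAD_LIK[r][head_target]
            for r in REGIMES
        }
        total = sum(unnormalized.values())
        belief = {r: v / total for r, v in unnormalized.items()}
        history.append(belief)
    return history


if __name__ == "__main__":
    # A speaks for five frames while C's head faces A.
    beliefs = forward_filter([("A", "A")] * 5)
    print(max(beliefs[-1], key=beliefs[-1].get))
```

The two observation tables are multiplied because the sketch assumes utterances and head directions are conditionally independent given the regime; a fuller model along the lines the abstract describes would also treat gaze direction as a hidden variable rather than reading it directly from head direction.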
Pages: 949 / +
Page count: 2