Learning cross-modal appearance models with application to tracking

Cited by: 0
Authors
Fisher, JW [1 ]
Darrell, T [1 ]
Affiliation
[1] MIT, Artificial Intelligence Lab, Cambridge, MA 02139 USA
Keywords
DOI
None available
Chinese Library Classification
TP18 (Artificial intelligence theory);
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Objects of interest are rarely silent or invisible. Analysis of multimodal signal generation from a single object represents a rich and challenging area for smart sensor arrays. We consider the problem of simultaneously learning an audio and visual appearance model of a moving subject. We present a method which successfully learns such a model without the benefit of hand initialization, using only the associated audio signal to "decide" which object to model and track. We are particularly interested in modeling joint audio and video variation, such as that produced by a speaking face. We present an algorithm and experimental results for a human speaker moving in a scene.
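The abstract's central idea — letting audio-video co-variation "decide" which image region belongs to the sounding object — can be illustrated with a minimal sketch. This is not the authors' algorithm; it substitutes a simple per-pixel correlation between the audio energy envelope and pixel intensity change, run on synthetic data, as a stand-in for the kind of audio-visual association the paper describes. All names and data below are illustrative.

```python
import numpy as np

def audio_video_association(frames, audio_energy):
    """Score each pixel by how strongly its temporal intensity
    variation tracks the audio energy envelope (a correlation
    proxy for audio-visual association; illustrative only)."""
    # Absolute temporal derivatives: video (T-1, H, W), audio (T-1,)
    dv = np.abs(np.diff(frames.astype(float), axis=0))
    da = np.abs(np.diff(audio_energy.astype(float)))
    dv_c = dv - dv.mean(axis=0)          # center each pixel's series
    da_c = da - da.mean()                # center the audio series
    num = np.tensordot(da_c, dv_c, axes=(0, 0))
    den = np.sqrt((dv_c ** 2).sum(axis=0) * (da_c ** 2).sum()) + 1e-12
    return num / den                     # (H, W) correlation map

# Toy scene: one "speaking" pixel whose intensity follows the audio.
rng = np.random.default_rng(0)
T, H, W = 50, 8, 8
audio = rng.random(T)
frames = rng.random((T, H, W)) * 0.01    # low-amplitude background noise
frames[:, 3, 4] += audio                 # inject audio-correlated motion
corr = audio_video_association(frames, audio)
print(np.unravel_index(np.argmax(corr), corr.shape))  # → (3, 4)
```

The peak of the correlation map localizes the audio-associated region without any hand initialization, which is the spirit of the approach; the actual method would operate on richer audio and video features rather than raw pixel intensities.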
Pages: 13 - 16
Page count: 4
Related papers (50 total)
  • [1] Learning cross-modal interaction for RGB-T tracking
    Xu, Chunyan
    Cui, Zhen
    Wang, Chaoqun
    Zhou, Chuanwei
    Yang, Jian
    SCIENCE CHINA-INFORMATION SCIENCES, 2023, 66 (01)
  • [2] Cross-modal application of a neuromorphic olfactory learning algorithm
    Helde, Michael
    Dimitrov, Alexander
    JOURNAL OF COMPUTATIONAL NEUROSCIENCE, 2024, 52 : S40 - S40
  • [3] HCMSL: Hybrid Cross-modal Similarity Learning for Cross-modal Retrieval
    Zhang, Chengyuan
    Song, Jiayu
    Zhu, Xiaofeng
    Zhu, Lei
    Zhang, Shichao
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17 (01)
  • [4] Cross-modal visual and vibrotactile tracking
    van Erp, JBF
    Verschoor, MH
    APPLIED ERGONOMICS, 2004, 35 (02) : 105 - 112
  • [5] Infant cross-modal learning
    Chow, Hiu Mei
    Tsui, Angeline Sin-Mei
    Ma, Yuen Ki
    Yat, Mei Ying
    Tseng, Chia-huei
    I-PERCEPTION, 2014, 5 (04) : 463 - 463
  • [6] Cross-Modal Conceptualization in Bottleneck Models
    Alukaev, Danis
    Kiselev, Semen
    Pershin, Ilya
    Ibragimov, Bulat
    Ivanov, Vladimir
    Kornaev, Alexey
    Titov, Ivan
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023 : 5241 - 5253
  • [7] CoCM: Conditional Cross-Modal Learning for Vision-Language Models
    Yang, Juncheng
    Xie, Shuai
    Li, Shuxia
    Cai, Zengyu
    Li, Yijia
    Zhu, Weiping
    ELECTRONICS, 2025, 14 (01)