Learning cross-modal appearance models with application to tracking

Cited by: 0
Authors
Fisher, JW [1 ]
Darrell, T [1 ]
Affiliations
[1] MIT, Artificial Intelligence Lab, Cambridge, MA 02139 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Objects of interest are rarely silent or invisible. Analysis of multimodal signal generation from a single object represents a rich and challenging area for smart sensor arrays. We consider the problem of simultaneously learning an audio and visual appearance model of a moving subject. We present a method which successfully learns such a model without the benefit of hand initialization, using only the associated audio signal to "decide" which object to model and track. We are particularly interested in modeling joint audio and video variation, such as that produced by a speaking face. We present an algorithm and experimental results for a human speaker moving in a scene.
Pages: 13 - 16
Page count: 4
Related papers
50 in total
  • [41] Cross-Modal Metric Learning for AUC Optimization
    Huo, Jing
    Gao, Yang
    Shi, Yinghuan
    Yin, Hujun
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (10) : 4844 - 4856
  • [42] Cross-modal generative models for multi-modal plastic sorting
    Neo, Edward R. K.
    Low, Jonathan S. C.
    Goodship, Vannessa
    Coles, Stuart R.
    Debattista, Kurt
    JOURNAL OF CLEANER PRODUCTION, 2023, 415
  • [43] Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models
    Lin, Zhiqiu
    Yu, Samuel
    Kuang, Zhiyi
    Pathak, Deepak
    Ramanan, Deva
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023 : 19325 - 19337
  • [44] Cross-modal contrastive learning with multi-hierarchical tracklet clustering for multi object tracking
    Hong, Ru
    Yang, Jiming
    Cai, Zeyu
    Da, Feipeng
    PATTERN RECOGNITION LETTERS, 2025, 191 : 1 - 7
  • [45] Cross-modal object tracking algorithm based on pedestrian attribute
    Zhou Q.
    Zhang W.
    Zhao L.
    Tian N.
    Wang R.
    Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics, 2020, 46 (09) : 1635 - 1642
  • [46] Semi-supervised cross-modal learning for cross modal retrieval and image annotation
    Fuhao Zou
    Xingqiang Bai
    Chaoyang Luan
    Kai Li
    Yunfei Wang
    Hefei Ling
    World Wide Web, 2019, 22 : 825 - 841
  • [47] Efficient thermal infrared tracking with cross-modal compress distillation
    Li, Hangfei
    Zha, Yufei
    Li, Huanyu
    Zhang, Peng
    Huang, Wei
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 123
  • [48] Semi-supervised cross-modal learning for cross modal retrieval and image annotation
    Zou, Fuhao
    Bai, Xingqiang
    Luan, Chaoyang
    Li, Kai
    Wang, Yunfei
    Ling, Hefei
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2019, 22 (02) : 825 - 841
  • [49] Pretrained models for cross-modal retrieval: experiments and improvements
    Zhou, Kun
    Hassan, Fadratul Hafinaz
    Gan, Keng Hoon
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (05) : 4915 - 4923
  • [50] Category Alignment Adversarial Learning for Cross-Modal Retrieval
    He, Shiyuan
    Wang, Weiyang
    Wang, Zheng
    Xu, Xing
    Yang, Yang
    Wang, Xiaoming
    Shen, Heng Tao
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (05) : 4527 - 4538