Learning cross-modal appearance models with application to tracking

Cited by: 0

Authors:

Fisher, JW [1]

Darrell, T [1]

Affiliations:

[1] MIT, Artificial Intelligence Lab, Cambridge, MA 02139 USA

Source:

2003 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL II, PROCEEDINGS | 2003

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Objects of interest are rarely silent or invisible. Analysis of multimodal signal generation from a single object represents a rich and challenging area for smart sensor arrays. We consider the problem of simultaneously learning an audio and visual appearance model of a moving subject. We present a method which successfully learns such a model without benefit of hand initialization, using only the associated audio signal to "decide" which object to model and track. We are interested in particular in modeling joint audio and video variation, such as that produced by a speaking face. We present an algorithm and experimental results for a human speaker moving in a scene.

Pages: 13-16

Page count: 4

50 records in total

[11] Cross-Modal Concept Learning and Inference for Vision-Language Models
Zhang, Yi
Zhang, Ce
Tang, Yushun
He, Zhihai
NEUROCOMPUTING, 2024, 583
[12] Cross-Modal Learning with Adversarial Samples
Li, Chao
Deng, Cheng
Gao, Shangqian
Xie, De
Liu, Wei
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
[13] Auditory and cross-modal implicit learning
Green, CD
Groff, P
INTERNATIONAL JOURNAL OF PSYCHOLOGY, 1996, 31 (3-4) : 15442 - 15442
[14] Continual learning in cross-modal retrieval
Wang, Kai
Herranz, Luis
van de Weijer, Joost
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 3623 - 3633
[15] Cross-Modal Discrete Representation Learning
Liu, Alexander H.
Jin, SouYoung
Lai, Cheng-I Jeff
Rouditchenko, Andrew
Oliva, Aude
Glass, James
PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 3013 - 3035
[16] Learning DALTS for cross-modal retrieval
Yu, Zheng
Wang, Wenmin
CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2019, 4 (01) : 9 - 16
[17] Sequential Learning for Cross-modal Retrieval
Song, Ge
Tan, Xiaoyang
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 4531 - 4539
[18] Cross-modal Target Retrieval for Tracking by Natural Language
Li, Yihao
Yu, Jun
Cai, Zhongpeng
Pan, Yuwen
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 4927 - 4936
[19] Unsupervised Cross-Modal Distillation for Thermal Infrared Tracking
Sun, Jingxian
Zhang, Lichao
Zha, Yufei
Gonzalez-Garcia, Abel
Zhang, Peng
Huang, Wei
Zhang, Yanning
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 2262 - 2270
[20] Prototype-based cross-modal object tracking
Liu, Lei
Li, Chenglong
Wang, Futian
Shen, Longfeng
Tang, Jin
INFORMATION FUSION, 2025, 118