Boosting Cross-media Retrieval via Visual-Auditory Feature Analysis and Relevance Feedback

Cited by: 8
Authors
Zhang, Hong [1 ,2 ,3 ]
Yuan, Junsong [3 ]
Gao, Xingyu [4 ,5 ]
Chen, Zhenyu [4 ]
Affiliations
[1] Wuhan Univ Sci & Technol, Coll Comp Sci & Technol, Wuhan, Hubei, Peoples R China
[2] Hubei Prov Key Lab Intelligent Informat Proc & Re, Wuhan, Hubei, Peoples R China
[3] Nanyang Technol Univ, Sch EEE, Singapore, Singapore
[4] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
[5] Nanyang Technol Univ, Sch Comp Engn, Singapore, Singapore
Funding
National Natural Science Foundation of China
Keywords
Cross-media retrieval; feature analysis; relevance feedback;
DOI
10.1145/2647868.2654975
Chinese Library Classification
TP301 [Theory and Methods]
Discipline Code
081202
Abstract
Different types of multimedia data express high-level semantics from different aspects. How to learn comprehensive high-level semantics from heterogeneous data and enable efficient cross-media retrieval has become an emerging research problem. Abundant correlations exist among heterogeneous low-level media content, yet exploiting them to query cross-media data effectively remains challenging. In this paper, we propose a new cross-media retrieval method based on short-term and long-term relevance feedback. Our method focuses on two typical types of media data, i.e., image and audio. First, we build a multimodal representation via statistical canonical correlation between image and audio feature matrices and define a cross-media distance metric for similarity measurement; then we propose an optimization strategy based on relevance feedback that fuses short-term learning results and long-term accumulated knowledge into the objective function. Experiments on an image-audio dataset demonstrate the superiority of our method over several existing algorithms.
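To make the retrieval pipeline concrete, the following is a minimal, hypothetical sketch of the canonical-correlation and distance-metric stage described in the abstract (not the authors' implementation): paired image and audio feature matrices are projected into a correlated subspace with scikit-learn's CCA, and an image query then ranks audio clips by Euclidean distance in that subspace. The feature dimensions, random data, and distance choice are illustrative assumptions, and the short-term/long-term relevance-feedback optimization that refines this ranking is omitted.

```python
# Illustrative sketch only (not the paper's code): learn a shared subspace from
# paired image/audio features via canonical correlation analysis, then rank
# audio clips for an image query by Euclidean distance in that subspace.
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
img_feats = rng.normal(size=(200, 64))   # assumed visual descriptors, one row per image
aud_feats = rng.normal(size=(200, 32))   # assumed auditory descriptors, paired row-wise

# Statistical canonical correlation between the image and audio feature matrices.
cca = CCA(n_components=10)
cca.fit(img_feats, aud_feats)
_, aud_proj = cca.transform(img_feats, aud_feats)   # audio side of the shared subspace

def retrieve_audio(query_img, top_k=5):
    """Cross-media distance: project an image query into the shared subspace
    and return the indices of the closest audio clips."""
    q = cca.transform(query_img.reshape(1, -1))       # image-side projection only
    dists = cdist(q, aud_proj, metric="euclidean")[0]
    return np.argsort(dists)[:top_k]

print(retrieve_audio(img_feats[0]))                   # indices of the 5 nearest clips
```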
Pages: 953-956 (4 pages)
Related Papers (40 total)
  • Understanding visual-auditory correlation from heterogeneous features for cross-media retrieval
    Zhang, Hong
    Wang, Yan-yun
    Pan, Hong
    Wu, Fei
    [J]. JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE A, 2008, 9 (02): 241-249
  • Nonnegative cross-media recoding of visual-auditory content for social media analysis
    Zhang, Hong
    Xu, Xin
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74 (02): 577-593
  • Bridging the gap between visual and auditory feature spaces for cross-media retrieval
    Zhang, Hong
    Wu, Fei
    [J]. ADVANCES IN MULTIMEDIA MODELING, PT 1, 2007, 4351: 596-605
  • Cross-media Relevance Computation for Multimedia Retrieval
    Dong, Jianfeng
    [J]. PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017: 831-835
  • Image Retrieval by Cross-Media Relevance Fusion
    Dong, Jianfeng
    Li, Xirong
    Liao, Shuai
    Xu, Jieping
    Xu, Duanqing
    Du, Xiaoyong
    [J]. MM'15: PROCEEDINGS OF THE 2015 ACM MULTIMEDIA CONFERENCE, 2015: 173-176
  • Boosting cross-media retrieval by learning with positive and negative examples
    Zhuang, Yueting
    Yang, Yi
    [J]. ADVANCES IN MULTIMEDIA MODELING, PT 2, 2007, 4352: 165+
  • CSRNCVA: A Model of Cross-media Semantic Retrieval Based on Neural Computing of Visual and Auditory Sensations
    Liu, Y.
    Cai, K.
    Liu, C.
    Zheng, F.
    [J]. NEURAL NETWORK WORLD, 2018, 28 (04): 305-323