Understanding visual-auditory correlation from heterogeneous features for cross-media retrieval

Cited by: 2
Authors
Zhang, Hong [1, 2]
Wang, Yan-yun [3 ]
Pan, Hong [4 ]
Wu, Fei [2 ]
Affiliations
[1] Wuhan Univ Sci & Technol, Coll Comp Sci & Technol, Wuhan 430081, Peoples R China
[2] Zhejiang Univ, Sch Comp Sci & Technol, Hangzhou 310027, Peoples R China
[3] Hangzhou Normal Univ, Sch Elementary Educ, Hangzhou 310036, Peoples R China
[4] Hangzhou Normal Univ, Sch Informat Engn, Hangzhou 310036, Peoples R China
Source
Journal of Zhejiang University SCIENCE A, 2008, 9: 241-249
Funding
National Natural Science Foundation of China
Keywords
heterogeneity; cross-media retrieval; subspace optimization; dynamic correlation update
DOI
10.1631/jzus.A071191
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
Cross-media retrieval seeks to remove the barriers between different modalities. Enabling it requires correlation measures between heterogeneous low-level features and a way to judge semantic similarity. This paper presents a novel approach to learning the cross-media correlation between visual and auditory features for image-audio retrieval. A semi-supervised correlation preserving mapping (SSCPM) method is described to construct an isomorphic SSCPM subspace in which the canonical correlations between the original visual and auditory features are preserved. A subspace optimization algorithm is proposed to improve the quality of local image clusters and audio clusters interactively. A relevance feedback strategy is developed to update the knowledge of cross-media correlation by learning from user behavior, so that retrieval performance improves progressively. Experimental results show that the approach is effective.
Pages: 241-249
Page count: 9
Related papers
50 in total
  • [2] Understanding visual-auditory correlation from heterogeneous features for cross-media retrieval
    Hong Zhang
    Yan-yun Wang
    Hong Pan
    Fei Wu
    [J]. Journal of Zhejiang University SCIENCE A, 2008, 9 : 241 - 249
  • [3] Boosting Cross-media Retrieval via Visual-Auditory Feature Analysis and Relevance Feedback
    Zhang, Hong
    Yuan, Junsong
    Gao, Xingyu
    Chen, Zhenyu
    [J]. PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14), 2014, : 953 - 956
  • [4] Nonnegative cross-media recoding of visual-auditory content for social media analysis
    Hong Zhang
    Xin Xu
    [J]. Multimedia Tools and Applications, 2015, 74 : 577 - 593
  • [5] Nonnegative cross-media recoding of visual-auditory content for social media analysis
    Zhang, Hong
    Xu, Xin
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74 (02) : 577 - 593
  • [6] Mining semantic correlation of heterogeneous multimedia data for cross-media retrieval
    Zhuang, Yue-Ting
    Yang, Yi
    Wu, Fei
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2008, 10 (02) : 221 - 229
  • [7] Bridging the gap between visual and auditory feature spaces for cross-media retrieval
    Hong Zhang
    Fei Wu
    [J]. ADVANCES IN MULTIMEDIA MODELING, PT 1, 2007, 4351 : 596 - 605
  • [8] Structural Fusion of Heterogeneous Visual-Auditory Features for Multimedia Analysis
    Zhang, Hong
    Nie, Jiamei
    Chen, Li
    [J]. 2013 10TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2013, : 821 - 825
  • [9] An Approach for Mining Heterogeneous Data for Cross-Media Retrieval
    Pavan, K. Madhu
    Ananthanarayana, V. S.
    [J]. 2013 FOURTH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATIONS AND NETWORKING TECHNOLOGIES (ICCCNT), 2013,
  • [10] CROSS-MODALITY CORRELATION PROPAGATION FOR CROSS-MEDIA RETRIEVAL
    Zhai, Xiaohua
    Peng, Yuxin
    Xiao, Jianguo
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 2337 - 2340