Content-oriented multimedia document understanding through cross-media correlation

Authors
Tong Lu
Yukang Jin
Feng Su
Palaiahnakote Shivakumara
Chew Lim Tan
Affiliations
[1] Nanjing University, National Key Laboratory for Novel Software Technology, Department of Computer Science and Technology
[2] University of Malaya, Faculty of Computer Science and Information Technology
[3] National University of Singapore, School of Computing
Source
Multimedia Tools and Applications, 2015, 74 (18)
Keywords
Multimedia documents; Multimodal; MAD; MCN; Correlation propagation
DOI
Not available
Abstract
This paper presents a novel method for multimedia document content analysis that models correlations among multimodal data. We hypothesize that exploiting the correlations between different modalities from the same data source yields better multimedia content understanding than analyzing any single modality alone. We divide the task into two stages: multimedia data fusion and multimodal correlation propagation. In the first stage, we extract quantized multimodal features, reorganize the training multimedia data into Modality semAntic Documents (MADs), and then use multivariate Gaussian distributions to characterize the continuous quantities through latent topic modeling. Model parameters are asymmetrically learned to initialize multimodal correlations in the latent topic space. In the second stage, we construct a Multimodal Correlation Network (MCN) from the initialized multimodal correlations and propose a new mechanism that propagates inter-modality correlations and intra-modality similarities in the MCN, exploiting the complementary information across modalities to facilitate multimedia content analysis. Experimental results for image-audio data retrieval on a 10-category dataset and for content-oriented web page recommendation on the USTODAY dataset show the effectiveness of our method.
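The abstract does not state the MCN update rule, so the following is only a minimal sketch of the general idea of propagating inter-modality correlations along intra-modality similarities. It assumes a generic damped propagation rule, C <- alpha * P_a C P_b^T + (1 - alpha) * C0, which is not claimed to be the authors' mechanism. The names `propagate`, `alpha`, `iters`, and the toy matrices are all hypothetical.

```python
# Illustrative sketch only: a generic damped propagation over two modalities.
# This is NOT the paper's MCN algorithm; the update rule is an assumption.

def row_normalize(m):
    """Scale each row to sum to 1 (rows that sum to 0 stay all-zero)."""
    out = []
    for row in m:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def propagate(c0, s_a, s_b, alpha=0.5, iters=20):
    """Spread initial cross-modal correlations c0 (n x m) using
    intra-modality similarities s_a (n x n) and s_b (m x m)."""
    p_a = row_normalize(s_a)
    p_b_t = transpose(row_normalize(s_b))
    c = [row[:] for row in c0]
    for _ in range(iters):
        spread = matmul(matmul(p_a, c), p_b_t)
        # Blend propagated scores with the initial correlations.
        c = [[alpha * spread[i][j] + (1 - alpha) * c0[i][j]
              for j in range(len(c0[0]))] for i in range(len(c0))]
    return c

# Toy data: 3 images, 2 audio clips. Images 0 and 1 are similar;
# only images 0 and 2 start with a known audio correlation.
s_img = [[1.0, 0.9, 0.0], [0.9, 1.0, 0.0], [0.0, 0.0, 1.0]]
s_aud = [[1.0, 0.0], [0.0, 1.0]]
c_init = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
c_final = propagate(c_init, s_img, s_aud)
```

In this toy setup, image 1 starts with no audio correlation but acquires a positive score for audio clip 0 because it is similar to image 0, which is the kind of complementary cross-modal evidence the abstract describes.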
Pages: 8105 - 8135
Number of pages: 30
Related papers
  • [1] Content-oriented multimedia document understanding through cross-media correlation
    Lu, Tong
    Jin, Yukang
    Su, Feng
    Shivakumara, Palaiahnakote
    Tan, Chew Lim
    MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74 (18) : 8105 - 8135
  • [2] Understanding multimedia document semantics for cross-media retrieval
    Wu, F
    Yang, Y
    Zhuang, YT
    Pan, YH
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2005, PT 1, 2005, 3767 : 993 - 1004
  • [3] Harmonizing hierarchical manifolds for multimedia document semantics understanding and cross-media retrieval
    Yang, Yi
    Zhuang, Yue-Ting
    Wu, Fei
    Pan, Yun-He
    IEEE TRANSACTIONS ON MULTIMEDIA, 2008, 10 (03) : 437 - 446
  • [4] Towards content-oriented patent document processing
    Wanner, Leo
    Baeza-Yates, Ricardo
    Brugmann, Soren
    Codina, Joan
    Diallo, Barrou
    Escorsa, Enric
    Giereth, Mark
    Kompatsiaris, Yiannis
    Papadopoulos, Symeon
    Pianta, Emanuele
    Piella, Gemma
    Puhlmann, Ingo
    Rao, Gautam
    Rotard, Martin
    Schoester, Pia
    Serafini, Luciano
    Zervaki, Vasiliki
    WORLD PATENT INFORMATION, 2008, 30 (01) : 21 - 33
  • [5] Cross-media retrieval method based on content correlation
    Zhang, Hong
    Wu, Fei
    Zhuang, Yue-Ting
    Chen, Jian-Xun
    Jisuanji Xuebao/Chinese Journal of Computers, 2008, 31 (05) : 820 - 826
  • [6] Mining semantic correlation of heterogeneous multimedia data for cross-media retrieval
    Zhuang, Yue-Ting
    Yang, Yi
    Wu, Fei
    IEEE TRANSACTIONS ON MULTIMEDIA, 2008, 10 (02) : 221 - 229
  • [7] Cross-Media Document Linking and Navigation
    Tayeh, Ahmed A. O.
    Ebrahimi, Payam
    Signer, Beat
    PROCEEDINGS OF THE ACM SYMPOSIUM ON DOCUMENT ENGINEERING (DOCENG 2018), 2018
  • [8] Cross-media Relevance Computation for Multimedia Retrieval
    Dong, Jianfeng
    PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017 : 831 - 835
  • [9] Multimedia - Know the variables in a cross-media transfer
    Phillips, JB
    Hindawi, MA
    Phillips, A
    Bailey, RV
    POLLUTION ENGINEERING, 1999, 31 (12) : 37 - 38
  • [10] A cross-media adaptation strategy for multimedia presentations
    Boll, S
    Klas, W
    Wandel, J
    ACM MULTIMEDIA 99, PROCEEDINGS, 1999 : 37 - 46