Based on Spatial and Temporal Implicit Semantic Relational Inference for Cross-Modal Retrieval

Cited by: 0
Authors
Jin M. [1]
Hu W. [1]
Zhu L. [2]
Wang X. [3]
Hong R. [1]
Affiliations
[1] School of Computer and Information, Hefei University of Technology, Hefei
[2] School of Electronic and Information Engineering, Tongji University, Shanghai
[3] School of Data Science, University of Science and Technology of China, Hefei
Keywords
Computational modeling; cross-modal retrieval; Data models; Feature extraction; semantic alignment; semantic mining; Semantics; Task analysis; temporal space inference; Training; Visualization
DOI
10.1109/TCSVT.2024.3411298
Abstract
To meet users’ demands for video retrieval, text-video cross-modal retrieval technology continues to evolve. Methods based on pre-trained models and transfer learning are widely used to design cross-modal retrieval models and have significantly improved video retrieval accuracy. However, these methods pay limited attention to the relationships between video frames, which prevents the model from fully capturing the hidden semantic relationships within video features. To further infer the implicit semantic relationships among video frames, we propose a cross-modal retrieval model based on graph convolutional networks (GCN) and visual semantic inference (GVSI). The GCN establishes relationships between video frame features, which facilitates mining hidden semantic information across frames. To let text semantic features help the model infer temporal and implicit semantic information between video frames, we introduce a semantic mining and temporal space (SM&TS) inference module. Additionally, we design semantic alignment modules (SA_M) to align the explicit and implicit object features present in both video and text. Finally, we analyze and validate the effectiveness of the model on the MSR-VTT, MSVD, and LSMDC datasets.
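As a rough illustration of the graph-convolution idea the abstract mentions, the sketch below treats each video frame feature as a graph node and runs one propagation step so that every frame absorbs information from similar frames. It is not the GVSI implementation, whose details this record does not give. The cosine-similarity adjacency, the row normalization, the single layer, and the names `frame_adjacency` and `gcn_layer` are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def frame_adjacency(frames):
    """Row-normalized affinity graph over frames (assumed: clipped cosine similarity)."""
    n = len(frames)
    adj = [[max(cosine(frames[i], frames[j]), 0.0) for j in range(n)]
           for i in range(n)]
    for i in range(n):
        adj[i][i] = 1.0  # self-loop so each frame keeps its own feature
    return [[a / sum(row) for a in row] for row in adj]


def gcn_layer(frames, weight):
    """One graph-convolution step: ReLU(A_norm @ X @ W)."""
    adj = frame_adjacency(frames)
    n, d_in, d_out = len(frames), len(weight), len(weight[0])
    # Aggregate neighbor features: A_norm @ X
    agg = [[sum(adj[i][k] * frames[k][c] for k in range(n)) for c in range(d_in)]
           for i in range(n)]
    # Project with W and apply ReLU
    return [[max(sum(agg[i][c] * weight[c][o] for c in range(d_in)), 0.0)
             for o in range(d_out)]
            for i in range(n)]


# Toy input: three 2-dimensional "frame features" and an identity projection.
frames = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
updated = gcn_layer(frames, identity)
```

In this toy input, the first frame has no second-dimension component of its own, but after propagation it picks one up from the similar second frame. That is the sense in which a graph layer can surface relationships across frames.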
Pages: 1 / 1