Center-enhanced video captioning model with multimodal semantic alignment

Authors
Zhang, Benhui [1, 2]
Gao, Junyu [2, 3]
Yuan, Yuan [2 ]
Affiliations
[1] School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China
[2] School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi'an, 710072, China
[3] Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
Keywords
Compendex;
DOI
10.1016/j.neunet.2024.106744
Abstract
Video analysis