Pseudo-labeling with keyword refining for few-supervised video captioning

Cited by: 0
Authors
Li, Ping [1]
Wang, Tao [1]
Zhao, Xinkui [2]
Xu, Xianghua [1]
Song, Mingli [3]
Affiliations
[1] School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
[2] School of Software Technology, Zhejiang University, Ningbo, China
[3] College of Computer Science, Zhejiang University, Hangzhou, China
Funding
National Natural Science Foundation of China
Keywords
Semantics
DOI
10.1016/j.patcog.2024.111176
Abstract
Video captioning generates a sentence that describes the video content. Existing methods typically require many captions (e.g., 10 or 20) per video to train the model, which is costly. In this work, we explore the possibility of using only one or very few ground-truth sentences, and introduce a new task named few-supervised video captioning. Specifically, we propose a few-supervised video captioning framework that consists of a lexically constrained pseudo-labeling module and a keyword-refined captioning module. Unlike random sampling in natural language processing, which may cause invalid modifications (i.e., word edits), the former module guides the model to edit words through actions (e.g., copy, replace, insert, and delete) predicted by a pretrained token-level classifier, and then fine-tunes candidate sentences with a pretrained language model. The same module also employs repetition-penalized sampling to encourage the model to yield concise pseudo-labeled sentences with less repetition, and selects the most relevant sentences using a pretrained video-text model. Moreover, to keep semantic consistency between pseudo-labeled sentences and video content, we develop a transformer-based keyword refiner with a video-keyword gated fusion strategy that places more emphasis on relevant words. Extensive experiments on several benchmarks demonstrate the advantages of the proposed approach in both few-supervised and fully-supervised scenarios. © 2024 Elsevier Ltd
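The abstract mentions repetition-penalized sampling but does not give its formula. As a rough illustration only, the sketch below uses one common formulation (the one popularized by the CTRL language model), in which the logits of already-generated tokens are scaled down before sampling. Whether the paper uses this exact rule is an assumption, and the names `penalize_repeats`, `sample_next`, the toy vocabulary, and the penalty value 1.2 are all hypothetical.

```python
import math
import random


def penalize_repeats(logits, generated_ids, penalty=1.2):
    """Return a copy of logits with already-generated tokens made less likely.

    Assumed rule (CTRL-style), not confirmed by the abstract.
    """
    adjusted = list(logits)
    for tok in set(generated_ids):
        # Dividing a positive logit or multiplying a negative one both lower it.
        if adjusted[tok] > 0:
            adjusted[tok] /= penalty
        else:
            adjusted[tok] *= penalty
    return adjusted


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next(logits, generated_ids, penalty=1.2, rng=random):
    """Sample the next token id from the repetition-penalized distribution."""
    probs = softmax(penalize_repeats(logits, generated_ids, penalty))
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy example: hypothetical vocabulary and logits.
vocab = ["a", "man", "is", "cooking", "food"]
logits = [2.0, 1.5, 0.5, -0.5, 1.0]
history = [0, 1]  # "a man" has already been generated

before = softmax(logits)
after = softmax(penalize_repeats(logits, history))
next_id = sample_next(logits, history, rng=random.Random(0))
```

In this toy case the probabilities of "a" and "man" drop after the penalty while those of unused tokens rise, which is the behavior the abstract attributes to the sampling step: fewer repeated words in the pseudo-labeled sentences.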