Cross-domain personalized image captioning

Cited by: 0
Authors
Cuirong Long
Xiaoshan Yang
Changsheng Xu
Affiliations
[1] Hefei University of Technology
[2] Institute of Automation
[3] Chinese Academy of Sciences
[4] University of Chinese Academy of Sciences
Source
MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (45-46)
Keywords
Personalization; Image captioning; Domain adaptation
DOI
Not available
Abstract
Image captioning aims to translate an image into a complete and natural sentence, and it involves both computer vision and natural language processing. Although image captioning has achieved good results with the rapid development of deep neural networks, optimizing captioning models too heavily for evaluation metrics makes the generated descriptions too conservative in practical applications. It is therefore necessary to increase the diversity of the descriptions and to account for prior knowledge such as the user’s favorite vocabulary and writing style. In this paper, we study personalized image captioning, which generates sentences that describe the user’s own stories and feelings about life in the user’s preferred wording. Moreover, we propose cross-domain personalized image captioning (CDPIC) to learn domain-invariant captioning models that can be applied across different social media platforms. The proposed method flexibly models user interest by embedding the user ID as an interest vector. To the best of our knowledge, this is the first cross-domain personalized image captioning approach; it combines user interest modeling with a simple and effective domain-invariant constraint. The effectiveness of the proposed method is verified on datasets from the Instagram and Lookbook platforms.
Pages: 33333 - 33348
Page count: 15
Related papers
50 in total
  • [1] Cross-domain personalized image captioning
    Long, Cuirong
    Yang, Xiaoshan
    Xu, Changsheng
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (45-46) : 33333 - 33348
  • [2] Multitask Learning for Cross-Domain Image Captioning
    Yang, Min
    Zhao, Wei
    Xu, Wei
    Feng, Yabing
    Zhao, Zhou
    Chen, Xiaojun
    Lei, Kai
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2019, 21 (04) : 1047 - 1061
  • [3] Dual Learning for Cross-domain Image Captioning
    Zhao, Wei
    Xu, Wei
    Yang, Min
    Ye, Jianbo
    Zhao, Zhou
    Feng, Yabing
    Qiao, Yu
    [J]. CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017, : 29 - 38
  • [4] Cross-Domain Image Captioning with Discriminative Finetuning
    Dessi, Roberto
    Bevilacqua, Michele
    Gualdoni, Eleonora
    Carraz Rakotonirina, Nathanael
    Franzon, Francesca
    Baroni, Marco
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 6935 - 6944
  • [5] Discriminative Style Learning for Cross-Domain Image Captioning
    Yuan, Jin
    Zhu, Shuai
    Huang, Shuyin
    Zhang, Hanwang
    Xiao, Yaoqiang
    Li, Zhiyong
    Wang, Meng
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 1723 - 1736
  • [6] Learning Scene Graph for Better Cross-Domain Image Captioning
    Jia, Junhua
    Xin, Xiaowei
    Gao, Xiaoyan
    Ding, Xiangqian
    Pang, Shunpeng
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT III, 2024, 14427 : 121 - 137
  • [7] Cross-domain multi-style merge for image captioning
    Duan, Yiqun
    Wang, Zhen
    Li, Yi
    Wang, Jingya
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2023, 228
  • [8] Cross-Domain Image Captioning via Cross-Modal Retrieval and Model Adaptation
    Zhao, Wentian
    Wu, Xinxiao
    Luo, Jiebo
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 1180 - 1192
  • [9] Cross-Domain Modality Fusion for Dense Video Captioning
    Aafaq, Nayyer
    Mian, Ajmal
    Liu, Wei
    Akhtar, Naveed
    Shah, Mubarak
    [J]. IEEE Transactions on Artificial Intelligence, 2022, 3 (05) : 763 - 777
  • [10] Personalized image annotation via class-specific cross-domain learning
    Qian, Zhiming
    Zhong, Ping
    Wang, Runsheng
    [J]. SIGNAL PROCESSING-IMAGE COMMUNICATION, 2015, 34 : 61 - 71