Repurposing existing deep networks for caption and aesthetic-guided image cropping

Cited by: 6
Authors
Horanyi, Nora [1]
Xia, Kedi [2]
Yi, Kwang Moo [3]
Bojja, Abhishake Kumar [3]
Leonardis, Ales [1]
Chang, Hyung Jin [1]
Affiliations
[1] Univ Birmingham, Birmingham, W Midlands, England
[2] Zhejiang Univ, Hangzhou, Peoples R China
[3] Univ Victoria, Victoria, BC, Canada
Funding
Natural Sciences and Engineering Research Council of Canada; UK Engineering and Physical Sciences Research Council;
Keywords
Image cropping; Aesthetics; Deep network re-purposing; Image captioning; ALGORITHM;
DOI
10.1016/j.patcog.2021.108485
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a novel optimization framework that crops a given image based on a user description and aesthetics. Unlike existing image cropping methods, where one typically trains a deep network to regress to crop parameters or cropping actions, we propose to directly optimize for the cropping parameters by repurposing networks pre-trained on image captioning and aesthetic tasks, without any fine-tuning, thereby avoiding training a separate network. Specifically, we search for the best crop parameters that minimize a combined loss of the initial objectives of these networks. To make the optimization stable, we propose three strategies: (i) multi-scale bilinear sampling, (ii) annealing the scale of the crop region, thereby effectively reducing the parameter space, and (iii) aggregation of multiple optimization results. Through various quantitative and qualitative evaluations, we show that our framework can produce crops that are well aligned with intended user descriptions and aesthetically pleasing. (c) 2022 Elsevier Ltd. All rights reserved.
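The optimization loop described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the pre-trained captioning and aesthetic networks are replaced by a hypothetical stand-in loss (`toy_loss`, which simply prefers brighter crops), gradients are estimated by finite differences rather than backpropagation, sampling is done at a single resolution rather than multi-scale, the linear annealing schedule is an assumption, and taking the median over several runs is only one possible way to aggregate results. All function names and parameter values are illustrative.

```python
import numpy as np

def bilinear_crop(img, cx, cy, s, out=16):
    """Bilinearly sample an out x out crop centred at (cx, cy) with side s.
    All crop parameters are in normalized [0, 1] image coordinates."""
    h, w = img.shape
    t = (np.arange(out) + 0.5) / out - 0.5            # symmetric offsets
    xs = np.clip((cx + s * t) * (w - 1), 0, w - 1)    # clamp to the image
    ys = np.clip((cy + s * t) * (h - 1), 0, h - 1)
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx = xs - x0                                      # column weights
    fy = (ys - y0)[:, None]                           # row weights
    top = img[y0][:, x0] * (1 - fx) + img[y0][:, x1] * fx
    bot = img[y1][:, x0] * (1 - fx) + img[y1][:, x1] * fx
    return top * (1 - fy) + bot * fy

def toy_loss(crop):
    # Hypothetical stand-in for the combined captioning + aesthetic loss:
    # brighter crops score better. A real system would query frozen networks.
    return -crop.mean()

def optimize_crop(img, start, s_init=1.0, s_final=0.4,
                  steps=400, lr=0.05, eps=1e-2):
    """Gradient descent on the crop centre only; the scale follows a schedule."""
    c = np.array(start, dtype=float)
    for i in range(steps):
        # Assumed schedule: shrink linearly over the first half, then hold.
        frac = min(1.0, i / (steps / 2))
        s = s_init + frac * (s_final - s_init)
        grad = np.zeros(2)
        for k in range(2):                            # central differences
            d = np.zeros(2)
            d[k] = eps
            hi = toy_loss(bilinear_crop(img, *(c + d), s))
            lo = toy_loss(bilinear_crop(img, *(c - d), s))
            grad[k] = (hi - lo) / (2 * eps)
        c = np.clip(c - lr * grad, 0.0, 1.0)
    return c

def aggregate(img, n_runs=5, seed=0):
    """Run several optimizations from random starts; take the median centre."""
    rng = np.random.default_rng(seed)
    starts = rng.uniform(0.3, 0.7, size=(n_runs, 2))
    finals = np.array([optimize_crop(img, s0) for s0 in starts])
    return np.median(finals, axis=0)

# Synthetic image: one bright Gaussian blob centred at x = 0.65, y = 0.35.
g = np.linspace(0.0, 1.0, 64)
xx, yy = np.meshgrid(g, g)
image = np.exp(-((xx - 0.65) ** 2 + (yy - 0.35) ** 2) / (2 * 0.12 ** 2))
center = aggregate(image)
print("aggregated crop centre (cx, cy):", center)
```

In this sketch, fixing the scale to a schedule leaves only the two centre coordinates to optimize, which is one way to read "reducing the parameter space"; starting from a large crop also gives the toy loss a useful gradient even when the initial centre is far from the blob.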
Pages: 10