L-CoDe: Language-based Colorization Using Color-object Decoupled Conditions

Cited by: 0
Authors
Weng, Shuchen [1]
Wu, Hao [2]
Chang, Zheng [3]
Tang, Jiajun [1]
Li, Si [3]
Shi, Boxin [1, 4, 5, 6]
Affiliations
[1] Peking Univ, Sch Comp Sci, Beijing, Peoples R China
[2] Peking Univ, Sch Software & Microelect, Beijing, Peoples R China
[3] Peking Univ, Sch Artificial Intelligence, Beijing, Peoples R China
[4] Peking Univ, Inst Artificial Intelligence, Beijing, Peoples R China
[5] Beijing Acad Artificial Intelligence, Beijing, Peoples R China
[6] Peng Cheng Lab, Beijing, Peoples R China
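Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract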
Colorizing a grayscale image is inherently an ill-posed problem with multi-modal uncertainty. Language-based colorization offers a natural way of interaction to reduce such uncertainty via a user-provided caption. However, the color-object coupling and mismatch issues make the mapping from word to color difficult. In this paper, we propose L-CoDe, a Language-based Colorization network using color-object Decoupled conditions. A predictor for object-color corresponding matrix (OCCM) and a novel attention transfer module (ATM) are introduced to solve the color-object coupling problem. To deal with color-object mismatch that results in incorrect color-object correspondence, we adopt a soft-gated injection module (SIM). We further present a new dataset containing annotated color-object pairs to provide supervisory signals for resolving the coupling problem. Experimental results show that our approach outperforms state-of-the-art methods conditioned on captions.
Pages: 2677-2684
Page count: 8
Related papers
4 in total
  • [1] L-CoDer: Language-Based Colorization with Color-Object Decoupling Transformer
    Chang, Zheng
    Weng, Shuchen
    Li, Yu
    Li, Si
    Shi, Boxin
    COMPUTER VISION - ECCV 2022, PT XVIII, 2022, 13678 : 360 - 375
  • [2] L-CoIns: Language-based Colorization with Instance Awareness
    Chang, Zheng
    Weng, Shuchen
    Zhang, Peixuan
    Li, Yu
    Li, Si
    Shi, Boxin
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 19221 - 19230
  • [3] L-CAD: Language-based Colorization with Any-level Descriptions using Diffusion Priors
    Chang, Zheng
    Weng, Shuchen
    Zhang, Peixuan
    Li, Yu
    Li, Si
    Shi, Boxin
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [4] A Survey of Natural Language-Based Editing of Low-Code Applications Using Large Language Models
    Gorissen, Simon Cornelius
    Sauer, Stefan
    Beckmann, Wolf G.
    HUMAN-CENTERED SOFTWARE ENGINEERING, HCSE 2024, 2024, 14793 : 243 - 254