Improving fashion captioning via attribute-based alignment and multi-level language model

Cited by: 1
Authors
Tang, Yuhao [1]
Zhang, Liyan [1]
Yuan, Ye [1]
Chen, Zhixian [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing 211106, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Fashion; Image captioning; E-commerce;
DOI
10.1007/s10489-023-05167-2
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Fashion captioning aims to generate detailed and captivating descriptions based on a group of item images. It requires the model to precisely describe attribute details under the supervision of complex sentences. Existing image captioning methods typically focus on describing a single image and often struggle to capture fine-grained visual representations in the fashion domain. Furthermore, the presence of complex description noise and unbalanced word distribution in fashion datasets limits diverse sentence generation. To alleviate redundancy in raw images, we propose an Attribute-based Alignment Module (AAM). The AAM captures more content-related information to enhance visual representations. Based on this design, we demonstrate that fashion captioning can benefit greatly from grid features with detailed alignment, in contrast to previous success with dense features. To address the inherent word distribution imbalance, we introduce a more balanced corpus called Fashion-Style-27k, collected from various shopping websites. Additionally, we present a pre-trained Fashion Language Model (FLM) that integrates sentence-level and attribute-level language knowledge into the caption model. Experiments on the FACAD and Fashion-Gen datasets show that the proposed AAM-FLM outperforms existing methods. Descriptions in the two datasets differ considerably in length and style, ranging from 21-word detailed descriptions to 30-word template-based sentences, which demonstrates the generalization ability of the proposed model.
Pages: 30757-30777
Number of pages: 21