An End-to-End Attention-Based Neural Model for Complementary Clothing Matching

Cited by: 9
Authors
Liu, Jinhuan [1]
Song, Xuemeng [1]
Nie, Liqiang [1]
Gan, Tian [1]
Ma, Jun [1]
Affiliation
[1] Shandong Univ, Sch Comp Sci & Technol, Qingdao 266237, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
End-to-end; feature-level attention; complementary clothing matching;
DOI
10.1145/3368071
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In modern society, people tend to prefer fashionable and decent outfits that meet more than basic physiological needs. A proper outfit usually relies on good matching among the complementary fashion items (e.g., the top, bottom, and shoes) that compose it, which motivates us to investigate automatic complementary clothing matching. However, this is non-trivial due to the following challenges. First, the main challenge lies in how to accurately model the compatibility between complementary fashion items (e.g., the top and bottom) that come from heterogeneous spaces with multiple modalities (e.g., the visual modality and the textual modality). Second, since different features (e.g., color, style, and pattern) of fashion items may contribute differently to compatibility modeling, how to encode the confidence of different pairwise features presents a tough challenge. Third, how to jointly learn the latent representation of multi-modal data and the compatibility between complementary fashion items constitutes the last challenge. Toward this end, we present an end-to-end attention-based neural framework for compatibility modeling, in which a feature-level attention model adaptively learns the confidence of different pairwise features. Extensive experiments on a publicly available real-world dataset show the superiority of our model over state-of-the-art methods.
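The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea of feature-level attention, under stated assumptions: each item is represented by several feature vectors, each top/bottom feature pair gets a dot-product compatibility score, and softmax-normalized weights act as the "confidence" of each pair. The function names, the dot-product scoring, and the `attn_logits` input (a stand-in for whatever a learned attention network would output) are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def compatibility(top_feats, bottom_feats, attn_logits):
    """Attention-weighted compatibility between a top and a bottom.

    top_feats / bottom_feats: one vector per feature (hypothetical layout).
    attn_logits: one unnormalized confidence per feature pair; assumed to
    come from a learned attention module, passed in directly here.
    """
    # Per-feature pairwise compatibility (assumed: plain dot product).
    pair_scores = [dot(t, b) for t, b in zip(top_feats, bottom_feats)]
    # Normalize the confidences so they sum to 1.
    weights = softmax(attn_logits)
    # Overall score is the confidence-weighted sum of pairwise scores.
    score = sum(w * s for w, s in zip(weights, pair_scores))
    return score, weights


# Toy data: three feature pairs with two-dimensional vectors each.
top = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
bottom = [[1.0, 0.0], [0.5, -0.5], [0.0, 2.0]]

uniform_score, uniform_weights = compatibility(top, bottom, [0.0, 0.0, 0.0])
focused_score, focused_weights = compatibility(top, bottom, [0.0, 0.0, 10.0])
```

With equal logits every feature pair contributes equally; raising one logit shifts the overall score toward that pair's compatibility, which is the adaptive weighting the abstract describes at a high level.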
Pages: 16