e-CLIP: Large-Scale Vision-Language Representation Learning in E-commerce

Cited by: 8
Authors
Shin, Wonyoung [1]
Park, Jonghun [1]
Woo, Taekang [1]
Cho, Yongwoo [1]
Oh, Kwangjin [1]
Song, Hwanjun [2]
Affiliations
[1] NAVER Shopping, Seongnam, South Korea
[2] NAVER AI Res, Seongnam, South Korea
Keywords
Multimodal pre-training; Large-scale pre-training
DOI
10.1145/3511808.3557067
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Understanding vision and language representations of product content is vital for search and recommendation applications in e-commerce. As a backbone for online shopping platforms and inspired by the recent success in representation learning research, we propose a contrastive learning framework that aligns language and visual models using unlabeled raw product text and images. We present techniques we used to train large-scale representation learning models and share solutions that address domain-specific challenges. We study the performance using our pre-trained model as backbones for diverse downstream tasks, including category classification, attribute extraction, product matching, product clustering, and adult product recognition. Experimental results show that our proposed method outperforms the baseline in each downstream task regarding both single modality and multiple modalities.
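As a rough illustration of the kind of contrastive alignment the abstract describes, the following is a minimal sketch of a CLIP-style symmetric contrastive loss over paired image and text embeddings. It is an assumption for illustration only, not the authors' implementation: the function names, the temperature value of 0.07, and the toy two-dimensional embeddings are all hypothetical, and the record above does not specify the exact loss used.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_row(logits, target):
    """Cross-entropy of one row of logits against the index of the true pair."""
    top = max(logits)  # subtract the max for numerical stability
    log_sum_exp = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum_exp - logits[target]


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average of image-to-text and text-to-image cross-entropy losses.

    Row k of image_embs and row k of text_embs are assumed to describe the
    same product; every other row in the batch acts as a negative.
    The temperature default is a hypothetical choice, not taken from the paper.
    """
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    n = len(images)

    # Cosine similarity between every image and every text, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]

    # Image-to-text direction: each row should peak on the diagonal.
    loss_i2t = sum(cross_entropy_row(logits[k], k) for k in range(n)) / n

    # Text-to-image direction: same computation on the transposed matrix.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    loss_t2i = sum(cross_entropy_row(columns[k], k) for k in range(n)) / n

    return (loss_i2t + loss_t2i) / 2


# Toy batch: two products with 2-dimensional embeddings (hypothetical values).
aligned_images = [[1.0, 0.0], [0.0, 1.0]]
aligned_texts = [[1.0, 0.0], [0.0, 1.0]]
swapped_texts = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = symmetric_contrastive_loss(aligned_images, aligned_texts)
swapped_loss = symmetric_contrastive_loss(aligned_images, swapped_texts)
```

In this toy batch the loss is close to zero when each image embedding matches its own text embedding and large when the texts are swapped, which is the behavior a contrastive objective is meant to reward. In practice the embeddings would come from trained image and text encoders rather than hand-written vectors.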
Pages: 3484-3494
Page count: 11