Product Bundle Identification using Semi-Supervised Learning

Cited by: 12
Authors
Tzaban, Hen [1]
Guy, Ido [2]
Greenstein-Messica, Asnat [1]
Dagan, Arnon [2]
Rokach, Lior [1]
Shapira, Bracha [1]
Affiliations
[1] Ben Gurion Univ Negev, Beer Sheva, Israel
[2] eBay Res, Netanya, Israel
Keywords
electronic commerce; ensemble learning; product bundling; self-training; semi-supervised learning; NOISE
DOI
10.1145/3397271.3401128
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Many sellers on e-commerce platforms offer buyers product bundles, which package together two or more different items. The identification of such bundles is a necessary step to support a variety of related services, from recommendation to dynamic pricing. In this work, we present a comprehensive study of bundle identification on a large e-commerce website. Our analysis of bundle listings compared with non-bundle listings reveals several key differentiating characteristics, spanning the listing's title, image, and attributes. Building on this analysis, we experiment with a multi-modal classifier, which takes advantage of these characteristics as features. Our analysis also shows that a bundle indicator input by sellers tends to be highly noisy and carries only a weak signal. The bundle identification task therefore faces the challenge of having a small set of manually labeled clean examples and a larger set of noisy-labeled examples, in conjunction with class imbalance due to the relative scarcity of bundles. Our experiments with basic supervised classifiers, using the manually labeled and/or the noisy-labeled data for training, demonstrate only moderate performance. We therefore turn to a semi-supervised approach and propose GREED, a self-training ensemble-based algorithm with greedy model selection. Our evaluation over two different meta-categories shows superior performance of semi-supervised approaches for the bundle identification task, with GREED outperforming several semi-supervised alternatives. The combination of textual, image, and some metadata features is shown to yield the best performance, reaching an AUC of 0.89 and 0.92 for the two meta-categories, respectively.
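The abstract names the ingredients of the approach (self-training, an ensemble, greedy model selection) without giving details. The following is a generic sketch of how those three ideas can fit together; it is not the authors' GREED algorithm. Everything in it is an assumption made for illustration: the nearest-centroid base models, the per-modality feature groups, the confidence threshold, and the use of validation accuracy (rather than AUC) as the selection criterion. It also ignores class imbalance and label noise, which the paper treats as central.

```python
def fit_centroids(X, y, cols):
    """Nearest-centroid model on the feature columns `cols` (needs both classes in y)."""
    cents = {}
    for label in (0, 1):
        rows = [[x[c] for c in cols] for x, t in zip(X, y) if t == label]
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return cols, cents


def score(model, x):
    """Pseudo-probability of class 1, from the distances to the two centroids."""
    cols, cents = model
    d0, d1 = (
        sum((x[c] - m) ** 2 for c, m in zip(cols, cents[label])) ** 0.5
        for label in (0, 1)
    )
    return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)


def ensemble_score(models, x):
    """Average the member scores."""
    return sum(score(m, x) for m in models) / len(models)


def accuracy(models, X, y):
    hits = sum((ensemble_score(models, x) >= 0.5) == bool(t) for x, t in zip(X, y))
    return hits / len(y)


def greedy_select(candidates, X_val, y_val):
    """Forward selection: add the candidate that most improves validation accuracy."""
    chosen, best = [], -1.0
    while len(chosen) < len(candidates):
        trials = [
            (accuracy([candidates[j] for j in chosen + [i]], X_val, y_val), -i)
            for i in range(len(candidates)) if i not in chosen
        ]
        acc, neg_i = max(trials)  # ties go to the lowest candidate index
        if acc <= best:
            break  # no candidate improves the ensemble; stop growing it
        chosen.append(-neg_i)
        best = acc
    return [candidates[i] for i in chosen]


def self_train(X_lab, y_lab, X_unl, X_val, y_val, groups, rounds=3, threshold=0.8):
    """Self-training loop: fit, select greedily, pseudo-label confident examples."""
    X_lab, y_lab, X_unl = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(rounds):
        candidates = [fit_centroids(X_lab, y_lab, g) for g in groups]
        models = greedy_select(candidates, X_val, y_val)
        remaining = []
        for x in X_unl:
            p = ensemble_score(models, x)
            if max(p, 1 - p) >= threshold:  # confident: adopt the pseudo-label
                X_lab.append(x)
                y_lab.append(int(p >= 0.5))
            else:
                remaining.append(x)
        if len(remaining) == len(X_unl):
            break  # nothing new was labeled; further rounds would not change anything
        X_unl = remaining
    # Final fit on the clean labels plus all adopted pseudo-labels.
    candidates = [fit_centroids(X_lab, y_lab, g) for g in groups]
    return greedy_select(candidates, X_val, y_val)


# Toy data (hypothetical): column 0 stands in for a title feature, column 1 for an image feature.
groups = [(0,), (1,), (0, 1)]
X_lab, y_lab = [[0.0, 0.1], [1.0, 0.9]], [0, 1]
X_unl = [[0.1, 0.0], [0.9, 1.0], [0.2, 0.1], [0.8, 0.9]]
X_val, y_val = [[0.05, 0.2], [0.95, 0.8]], [0, 1]
models = self_train(X_lab, y_lab, X_unl, X_val, y_val, groups)
print(ensemble_score(models, [0.0, 0.0]) < 0.5, ensemble_score(models, [1.0, 1.0]) > 0.5)
```

In this sketch the small clean set plays two roles: part of it seeds training, and a held-out part drives the greedy selection, so pseudo-labels never influence which models are kept. A setup closer to the abstract would score candidates by AUC and treat the seller-provided bundle indicator as a noisy label source rather than leaving those examples unlabeled.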
Pages: 791-800
Page count: 10