Product Bundle Identification using Semi-Supervised Learning

Cited by: 12
Authors
Tzaban, Hen [1]
Guy, Ido [2]
Greenstein-Messica, Asnat [1]
Dagan, Arnon [2]
Rokach, Lior [1]
Shapira, Bracha [1]
Affiliations
[1] Ben Gurion Univ Negev, Beer Sheva, Israel
[2] eBay Res, Netanya, Israel
Keywords
electronic commerce; ensemble learning; product bundling; self-training; semi-supervised learning; NOISE
DOI
10.1145/3397271.3401128
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Many sellers on e-commerce platforms offer buyers product bundles, which package together two or more different items. Identifying such bundles is a necessary step to support a variety of related services, from recommendation to dynamic pricing. In this work, we present a comprehensive study of bundle identification on a large e-commerce website. Our analysis of bundle listings compared to non-bundle listings reveals several key differentiating characteristics, spanning the listing's title, image, and attributes. Building on this analysis, we experiment with a multi-modal classifier that uses these characteristics as features. Our analysis also shows that a bundle indicator input by sellers tends to be highly noisy and carries only a weak signal. The bundle identification task therefore faces the challenge of having a small set of manually labeled clean examples and a larger set of noisily labeled examples, in conjunction with class imbalance due to the relative scarcity of bundles. Our experiments with basic supervised classifiers, using the manually labeled and/or the noisily labeled data for training, demonstrate only moderate performance. We therefore turn to a semi-supervised approach and propose GREED, a self-training, ensemble-based algorithm with greedy model selection. Our evaluation over two different meta-categories shows that semi-supervised approaches perform better on the bundle identification task, with GREED outperforming several semi-supervised alternatives. The combination of textual, image, and some metadata features is shown to yield the best performance, reaching an AUC of 0.89 and 0.92 for the two meta-categories, respectively.
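The abstract names self-training as the basis of the proposed approach but gives no algorithmic detail. The following is a minimal sketch of a generic self-training loop only, not of GREED itself: it omits the ensemble and the greedy model selection, and it swaps in a toy nearest-centroid scorer as the base classifier. The function names, the 2-D features, the 0.8 confidence threshold, and the sample data are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Labeled = Tuple[Point, int]  # (features, label), label 1 = bundle, 0 = non-bundle


def centroid(points: List[Point]) -> Point:
    """Mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def dist(a: Point, b: Point) -> float:
    """Euclidean distance between two 2-D points."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def score_class1(x: Point, c0: Point, c1: Point) -> float:
    """Pseudo-probability of class 1: closer to c1 means a higher score."""
    d0, d1 = dist(x, c0), dist(x, c1)
    if d0 + d1 == 0:
        return 0.5
    return d0 / (d0 + d1)


def self_train(labeled: List[Labeled], unlabeled: List[Point],
               threshold: float = 0.8, max_rounds: int = 10):
    """Generic self-training: fit, pseudo-label confident points, refit.

    Assumes `labeled` contains at least one example of each class.
    Returns the grown labeled set and the points left unlabeled.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        # "Fit" the toy base model on the current labeled set.
        c0 = centroid([x for x, y in labeled if y == 0])
        c1 = centroid([x for x, y in labeled if y == 1])
        confident, rest = [], []
        for x in pool:
            p1 = score_class1(x, c0, c1)
            if p1 >= threshold:
                confident.append((x, 1))
            elif p1 <= 1 - threshold:
                confident.append((x, 0))
            else:
                rest.append(x)
        if not confident:
            break  # nothing confident enough to add; stop early
        labeled.extend(confident)
        pool = rest
    return labeled, pool


# Hypothetical data: two clean seed labels and five unlabeled listings.
seed = [((0.0, 0.0), 0), ((10.0, 10.0), 1)]
unlabeled = [(1.0, 1.0), (9.0, 9.0), (0.0, 1.0), (10.0, 9.0), (5.0, 5.0)]
final_labeled, remaining = self_train(seed, unlabeled)
```

In this toy run the four points near a seed are pseudo-labeled in the first round, while the point midway between the two classes never reaches the threshold and stays unlabeled. A noisy or imbalanced setting like the one the abstract describes would need a stronger base model and a more careful selection rule than this fixed threshold.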
Pages: 791-800
Page count: 10