SAWTab: Smoothed Adaptive Weighting for Tabular Data in Semi-supervised Learning

Times cited: 0
Authors
Gharasuie, Morteza Mohammady [1]
Wang, Fengjiao [2]
Sharif, Omar [1]
Mukkamala, Ravi [1]
Affiliations
[1] Old Dominion Univ, Norfolk, VA 23529 USA
[2] Univ Utah, Salt Lake City, UT 84112 USA
Keywords
Semi-supervised learning; Feature representation; Pseudo-label; Tabular domain; Adaptive weighting
DOI
10.1007/978-981-97-2259-4_24
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Self-supervised and semi-supervised learning (SSL) on tabular data is an understudied topic. Despite some attempts, two major challenges remain: (1) tabular datasets are often imbalanced; (2) the one-hot encoding used by existing methods becomes inefficient for high-cardinality categorical features. To address these challenges, we propose SAWTab, which uses a target encoding method, Conditional Probability Representation (CPR), to represent categorical features efficiently in the input space. We improve this representation by incorporating unlabeled samples through pseudo-labels. Furthermore, we propose a Smooth Adaptive Weighting mechanism in the target encoding to mitigate the effect of noisy and biased pseudo-labels. Experimental results on various datasets and comparisons with existing frameworks show that SAWTab yields the best test accuracy on all datasets. We find that pseudo-labels can help improve the input space representation in the SSL setting, which enhances the generalization of the learning algorithm.
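The abstract does not state the CPR formula or the weighting rule, so the following is only a generic sketch of the underlying idea: a target encoding that maps each category to smoothed class probabilities, with pseudo-labeled samples contributing less than labeled ones. The function name, the `alpha` smoothing strength, and the per-sample `weights` are all assumptions made for illustration and should not be read as the authors' implementation.

```python
from collections import defaultdict

def smoothed_target_encoding(categories, labels, weights, n_classes, alpha=1.0):
    """Hypothetical helper: estimate P(class | category) from weighted labels.

    weights: 1.0 for true labels, a lower value (e.g. model confidence)
             for pseudo-labels. This scheme is an assumption.
    alpha:   smoothing strength that pulls rare categories toward the prior.
    """
    # Weighted class prior over all samples.
    prior = [0.0] * n_classes
    for y, w in zip(labels, weights):
        prior[y] += w
    total = sum(prior)
    prior = [p / total for p in prior]

    # Weighted class counts per category value.
    counts = defaultdict(lambda: [0.0] * n_classes)
    for c, y, w in zip(categories, labels, weights):
        counts[c][y] += w

    # Shrink each category's class frequencies toward the prior.
    encoding = {}
    for c, cnt in counts.items():
        n = sum(cnt)
        encoding[c] = [(cnt[k] + alpha * prior[k]) / (n + alpha)
                       for k in range(n_classes)]
    return encoding

# Toy data: the last two samples carry pseudo-labels with weight 0.5.
categories = ["a", "a", "b", "b", "b"]
labels = [1, 1, 0, 1, 0]
weights = [1.0, 1.0, 1.0, 0.5, 0.5]
enc = smoothed_target_encoding(categories, labels, weights, n_classes=2)
```

Each category is replaced by a vector of length `n_classes` rather than a one-hot vector whose length grows with the number of category values, which is the efficiency argument the abstract makes for high-cardinality features.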
Pages: 316-328
Number of pages: 13