MSLR: A Self-supervised Representation Learning Method for Tabular Data Based on Multi-scale Ladder Reconstruction

Cited by: 0
Authors
Weng, Xutao [1]
Song, Hong [1]
Lin, Yucong [2]
Zhang, Xi [1]
Liu, Bowen [3]
Wu, You [3]
Yang, Jian [2]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China
[2] Beijing Inst Technol, Sch Opt & Photon, Beijing 100081, Peoples R China
[3] Beijing Inst Technol, Sch Med Technol, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Tabular data; Representation learning; Self-supervised learning; Binning; Multi-scale
DOI
10.1016/j.ins.2024.120108
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Tabular data are widely used for prediction tasks, but they often suffer from the curse of dimensionality and from noise, which degrade the performance and robustness of prediction models. Self-supervised representation learning has emerged as a promising technique to overcome these challenges, but most existing methods are designed for images, text, and other modalities rather than tabular data. In this study, we propose a novel self-supervised representation learning method for tabular data based on multi-scale ladder reconstruction (MSLR). The method learns low-dimensional, noise-resistant representations, thereby improving prediction performance across various tabular datasets. The idea of MSLR is to employ a binning method to generate a sequence of fuzzy data with different noise scales, and then to train a neural network to recover the raw data from the most corrupted data in a circular manner. This process allows MSLR to learn fine-grained changes caused by noise while maintaining consistency and similarity at a coarse granularity. The proposed method is evaluated on five real-world datasets (MIMIC-IV, Thyroid, Heart, Pima, and Adult) and compared with several baselines. The experimental results on downstream prediction tasks show that MSLR is robust to noisy data and outperforms the baseline methods.
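The binning step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the binning scheme, so equal-width binning with bin-center replacement is assumed here, and the names `bin_to_centers`, `multiscale_ladder`, `ladder_pairs`, and the `bin_counts` parameter are hypothetical. The sketch only shows how fewer bins yield a coarser (more corrupted) copy of the data and how successive copies could be paired as reconstruction inputs and targets; the network and training loop are omitted.

```python
import numpy as np


def bin_to_centers(x, n_bins):
    """Equal-width binning per column; each value is replaced by its bin center.

    Assumption: equal-width bins. The paper may use a different binning scheme.
    """
    x = np.asarray(x, dtype=float)
    lo = x.min(axis=0)
    hi = x.max(axis=0)
    width = (hi - lo) / n_bins
    # Constant columns have zero width; use 1.0 to avoid division by zero.
    safe_width = np.where(width == 0, 1.0, width)
    idx = np.floor((x - lo) / safe_width).astype(int)
    # The column maximum would land in bin n_bins, so clip it into the last bin.
    idx = np.clip(idx, 0, n_bins - 1)
    centers = lo + (idx + 0.5) * safe_width
    # Leave constant columns unchanged.
    return np.where(hi == lo, x, centers)


def multiscale_ladder(x, bin_counts=(2, 4, 8)):
    """Return fuzzy copies of x, ordered from coarsest to finest."""
    return [bin_to_centers(x, k) for k in sorted(bin_counts)]


def ladder_pairs(x, bin_counts=(2, 4, 8)):
    """Pair each level with the next finer one; the last target is the raw data.

    Assumption: this step-by-step pairing is one plausible reading of
    "ladder reconstruction", not a confirmed detail of MSLR.
    """
    levels = multiscale_ladder(x, bin_counts) + [np.asarray(x, dtype=float)]
    return list(zip(levels[:-1], levels[1:]))
```

Calling `ladder_pairs(x)` on an array of shape `(n_samples, n_features)` returns a list of `(input, target)` arrays of the same shape, which a reconstruction network could consume one rung at a time.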
Pages: 14
Related papers
50 in total
  • [1] SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning
    Ucar, Talip
    Hajiramezanali, Ehsan
    Edwards, Lindsay
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [2] Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning
    Jin, Ming
    Zheng, Yizhen
    Li, Yuan-Fang
    Gong, Chen
    Zhou, Chuan
    Pan, Shirui
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 1477 - 1483
  • [3] Self-supervised graph representation learning using multi-scale subgraph views contrast
    Chen, Lei
    Huang, Jin
    Li, Jingjing
    Cao, Yang
    Xiao, Jing
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (15): : 12559 - 12569
  • [4] Self-supervised graph representation learning using multi-scale subgraph views contrast
    Lei Chen
    Jin Huang
    Jingjing Li
    Yang Cao
    Jing Xiao
    Neural Computing and Applications, 2022, 34 : 12559 - 12569
  • [5] Self-supervised Multi-scale Consistency for Weakly Supervised Segmentation Learning
    Valvano, Gabriele
    Leo, Andrea
    Tsaftaris, Sotirios A.
    DOMAIN ADAPTATION AND REPRESENTATION TRANSFER, AND AFFORDABLE HEALTHCARE AND AI FOR RESOURCE DIVERSE GLOBAL HEALTH (DART 2021), 2021, 12968 : 14 - 24
  • [6] Progressive Multi-scale Self-supervised Learning for Speech Recognition
    Wan, Genshun
    Chen, Hang
    Liu, Tan
    Wang, Chenxi
    Pan, Jia
    Ye, Zhongfu
    2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 978 - 982
  • [7] Self-Supervised Graph Representation Learning Method Based on Data and Feature Augmentation
    Xu, Yunfeng
    Fan, Hexun
    Computer Engineering and Applications, 2024, 60 (17) : 148 - 157
  • [8] Multi-scale self-supervised representation learning with temporal alignment for multi-rate time series modeling
    Chen, Jiawei
    Song, Pengyu
    Zhao, Chunhui
    PATTERN RECOGNITION, 2024, 145
  • [9] A RECONSTRUCTION-BASED FEATURE ADAPTATION FOR ANOMALY DETECTION WITH SELF-SUPERVISED MULTI-SCALE AGGREGATION
    Zuo, Zuo
    Wu, Zongze
    Chen, Badong
    Zhong, Xiaopin
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 5840 - 5844
  • [10] Self-Supervised Tabular Data Anomaly Detection Method Based on Knowledge Enhancement
    Xiaoyu, Gao
    Xiaoyong, Zhao
    Lei, Wang
    Computer Engineering and Applications, 60 (10): : 140 - 147