Feature-based augmentation and classification for tabular data

Cited by: 9
Authors
Sathianarayanan, Balachander [1]
Samant, Yogesh Chandra Singh [1]
Guruprasad, Prahalad S. Conjeepuram [1]
Hariharan, Varshin B. [1]
Manickam, Nirmala Devi [1]
Affiliation
[1] Amrita Vishwa Vidyapeetham, Amrita Sch Engn, Coimbatore, Tamil Nadu, India
Keywords
Classification (of information) - Learning systems;
DOI
10.1049/cit2.12123
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Generating synthetic samples for tabular data is a strenuous task. In most cases, the columns (features) of a dataset do not follow an ideal distribution function. The objective of the proposed algorithm, the Histogram Augmentation Technique (HAT), is to generate a dataset whose distribution is similar to that of the original dataset. The augmentation is performed column by column, with separate algorithms designed for continuous and discrete columns. Humans also rely on the features of an object for interpretation: when making a judgement, they notice prominent features and use them to characterise the perceived object. Conventional machine learning classifiers, however, are designed and trained on the basis of samples. Taking features as the basis for classification instead, this work attempts a Feature Importance Classifier (FIC). FIC treats each feature independently of the others and ranks the features by their dependence on the class label. FIC was found to have the highest accuracy, improving accuracy by 5.54% on average compared with the other classifiers. The proposed algorithms were evaluated on five datasets and compared with two augmentation algorithms and four state-of-the-art ML classification algorithms.
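The column-wise idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' HAT algorithm, whose details are not given here. It only assumes one plausible reading: continuous columns are resampled from a histogram of the observed values, and discrete columns from their empirical frequencies. The function names, the bin count, and the example data are all hypothetical.

```python
import numpy as np


def augment_continuous(column, n_samples, bins=10, rng=None):
    """Draw synthetic values whose histogram approximates that of `column`.

    Assumption (not from the paper): a bin is chosen in proportion to its
    observed count, then a value is placed uniformly inside that bin.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts, edges = np.histogram(column, bins=bins)
    probs = counts / counts.sum()
    # Pick a bin index for every new sample according to observed frequency.
    idx = rng.choice(len(counts), size=n_samples, p=probs)
    # Sample uniformly between the left and right edge of each chosen bin.
    return rng.uniform(edges[idx], edges[idx + 1])


def augment_discrete(column, n_samples, rng=None):
    """Draw synthetic categories in proportion to their observed frequency."""
    rng = np.random.default_rng() if rng is None else rng
    values, counts = np.unique(column, return_counts=True)
    return rng.choice(values, size=n_samples, p=counts / counts.sum())


# Hypothetical example data: one continuous and one discrete column.
rng = np.random.default_rng(0)
ages = rng.normal(40, 10, size=500)
labels = rng.choice(["a", "b", "c"], size=500, p=[0.6, 0.3, 0.1])

synthetic_ages = augment_continuous(ages, 1000, bins=20, rng=rng)
synthetic_labels = augment_discrete(labels, 1000, rng=rng)
```

Because each column is sampled on its own, this sketch reproduces the marginal distributions only and ignores any dependence between columns or on the class label. Whether and how HAT handles such dependence is not stated in the abstract.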
Pages: 481-491
Number of pages: 11