LbR: A New Regression Architecture for Automated Feature Engineering

Cited by: 1
Authors
Wang, Meng [1]
Ding, Zhijun [1]
Pan, Meiqin [2]
Affiliations
[1] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China
[2] Shanghai Int Studies Univ, Sch Business & Management, Shanghai 200083, Peoples R China
Keywords
automatic feature engineering; label; regression; feature pairs; correlations
DOI
10.1109/ICDMW51313.2020.00066
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, machine learning has developed rapidly and has been widely applied in fields such as finance and medical treatment. Many studies have shown that feature engineering is the most important part of machine learning and the most creative part of data science. However, traditional feature engineering often requires the participation of experienced domain experts and is very time-consuming. Automatic feature engineering has therefore emerged, aiming to improve model performance by automatically generating highly informative features without expert domain knowledge. Existing methods, however, generate new features by applying the same predefined set of operators to every dataset, ignoring the diversity of datasets, which leaves room for improvement in performance. In this paper, we propose a method named LbR (Label based Regression), which fully mines correlations between feature pairs and then selects highly discriminative feature pairs to generate informative features. Extensive experiments show that LbR achieves better performance and efficiency than other methods across different datasets and machine learning models.
Pages: 432-439
Number of pages: 8