Automated Machine Learning: State-of-The-Art and Open Challenges

Authors
Elshawi, Radwa [1]
Sakr, Sherif [1]
Affiliations
[1] Univ Tartu, Data Syst Grp, Tartu, Estonia
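Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract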
Machine learning techniques and algorithms are now employed in almost every application domain (e.g., financial applications, advertising, recommendation systems, user behavior analytics). In practice, they play a crucial role in harnessing the massive amounts of data produced every day in our digital world. Building a high-quality machine learning model is an iterative, complex, and time-consuming process that involves trying different algorithms and techniques and requires experience in tuning their hyper-parameters effectively. In particular, conducting this process efficiently requires solid knowledge of, and experience with, the various techniques that can be employed. With the continuous and vast increase in the amount of data in our digital world, it has been acknowledged that the number of knowledgeable data scientists cannot scale to address these challenges. Thus, there is a crucial need to automate the process of building good machine learning models. In the last few years, several techniques and frameworks have been introduced to tackle the challenge of automating Combined Algorithm Selection and Hyper-parameter tuning (CASH) in the machine learning domain [1]. The main aim of these techniques is to reduce the role of the human in the loop and to fill the gap for non-expert machine learning users by playing the role of the domain expert. In this tutorial, we aim to present a comprehensive survey of the state-of-the-art efforts in tackling the CASH problem. In addition, we highlight research on automating the other steps of the full machine learning pipeline (AutoML), from data understanding to model deployment. Furthermore, we provide comprehensive coverage of the various tools and frameworks that have been introduced in this domain.
Finally, we discuss some of the research directions and open challenges that need to be addressed in order to achieve the vision and goals of the AutoML process. This tutorial is intended to benefit researchers and system designers in the broad area of machine learning. It would benefit both designers and users of automated and interactive machine learning systems, since a survey and an in-depth understanding of the current systems are essential both for choosing an appropriate system and for designing an effective one. The tutorial does not require any knowledge of automated machine learning techniques, but a basic understanding of the machine learning pipeline is required. After attending this tutorial, the audience will have:
- An overview of the machine learning pipeline (10 min.).
- A good understanding of the challenges of implementing efficient and high-quality machine learning pipelines (10 min.).
- A comprehensive review of the state-of-the-art in the domain of automated combined algorithm selection and hyper-parameter tuning (25 min.).
- A comprehensive review of the state-of-the-art of the centralized, distributed and interactive AutoML frameworks (25 min.).
- Highlights of potential research directions to improve the state-of-the-art and support the efforts towards achieving the broad vision of AutoML (10 min.).
- A demo of our prototype iSmartML, an interactive and user-guided framework for automated machine learning (10 min.).
The tutorial is timely and relevant for the data management and machine learning research communities due to the rapid growth in the applications of machine learning in almost every application domain. The increasing momentum for developing AutoML frameworks would enrich the discussion of potential directions to improve the usability and wide acceptance of these tools among data scientists and domain experts.
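The CASH problem named in the abstract can be illustrated with a small sketch: the search picks an algorithm and its hyper-parameters jointly and keeps the pair with the best validation score. The version below is only an assumption-laden toy, not a method from the tutorial. It uses plain random search, a synthetic 1-D dataset, and two hypothetical learners (`fit_threshold`, `fit_knn`); all names are invented for illustration.

```python
import random

# Toy 1-D binary classification data (hypothetical): label is 1 when x > 0.5.
def make_data(n, rng):
    xs = [rng.random() for _ in range(n)]
    return [(x, int(x > 0.5)) for x in xs]

# Two hypothetical "algorithms", each with its own hyper-parameter.
def fit_threshold(train, threshold):
    # Ignores the training data; predicts 1 above a fixed threshold.
    return lambda x: int(x > threshold)

def fit_knn(train, k):
    # Majority vote among the k nearest training points (k is odd).
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        votes = sum(label for _, label in nearest)
        return int(2 * votes > k)
    return predict

# Joint search space: algorithm name -> (fit function, hyper-parameter sampler).
SEARCH_SPACE = {
    "threshold": (fit_threshold, lambda rng: {"threshold": rng.uniform(0.0, 1.0)}),
    "knn": (fit_knn, lambda rng: {"k": rng.choice([1, 3, 5, 7])}),
}

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

def cash_random_search(train, valid, n_trials, seed=0):
    # Each trial samples an algorithm AND its hyper-parameters together,
    # then scores the fitted model on held-out validation data.
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        name = rng.choice(sorted(SEARCH_SPACE))
        fit, sample = SEARCH_SPACE[name]
        params = sample(rng)
        score = accuracy(fit(train, **params), valid)
        if best is None or score > best[0]:
            best = (score, name, params)
    return best

rng = random.Random(42)
train, valid = make_data(100, rng), make_data(50, rng)
best_score, best_name, best_params = cash_random_search(train, valid, n_trials=30)
print(best_name, best_params, round(best_score, 3))
```

Random search is used here only because it is the simplest baseline; more sample-efficient strategies can be substituted for the sampling step without changing the surrounding loop.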
Pages: 627-629
Page count: 3