Automated machine learning tool: The first stop for data science and statistical model building

Cited by: 0
Authors
Gopagoni D. [1]
Lakshmi P.V. [1]
Affiliation
[1] Department of Computer Science and Engineering, GIT GITAM (Deemed to be University), Vishakhapatnam, Andhra Pradesh
Keywords
Artificial neural networks; Automated machine learning; Drug design; K-means clustering; Market analysis; Naive Bayes classification; QSAR; QSPR; R program; Regression models; Shiny web app; Supervised learning; Support vector machines
DOI
10.14569/ijacsa.2020.0110253
Abstract
Machine learning techniques are designed to derive knowledge from existing data. Increased computational power and the use of natural language processing and image processing methods have made it easy to create rich data. Good domain knowledge is required to build useful models. Uncertainty remains around choosing the right sample data, reducing the number of variables, and selecting a statistical algorithm. A suitable statistical method coupled with explanatory variables is critical for model building and analysis, and there are multiple choices for each of these parameters. An automated system that helps scientists select an appropriate data set and learning algorithm would be very useful. In this study, a freely available web-based platform named the automated machine learning tool (AMLT) is developed. AMLT automates the entire model-building process. It is equipped with the most commonly used variable selection methods and with statistical methods for both supervised and unsupervised learning. AMLT can also perform clustering. AMLT uses statistical measures such as R² to rank the models and performs automatic test set validation. The tool is validated for connectivity and capability by reproducing two published works. © Science and Information Organization.
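The ranking step mentioned in the abstract (scoring candidate models by R² on a held-out test set) can be illustrated with a minimal sketch. This is not the AMLT implementation, which the record describes only as an R/Shiny web app; the function names (`r_squared`, `fit_mean`, `fit_line`, `rank_models`) and the toy data are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: NOT the AMLT code. All names and data are hypothetical.
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Candidate: ordinary least squares with a single predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def rank_models(fitters, train, test):
    """Fit each candidate on the training split, score it on the test split,
    and return (name, R^2) pairs sorted from best to worst."""
    x_tr, y_tr = train
    x_te, y_te = test
    scores = []
    for name, fit in fitters.items():
        model = fit(x_tr, y_tr)
        preds = [model(x) for x in x_te]
        scores.append((name, r_squared(y_te, preds)))
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Hypothetical toy data: a roughly linear relationship.
train = ([1, 2, 3, 4, 5, 6], [2.1, 3.9, 6.2, 7.8, 10.1, 12.0])
test = ([7, 8, 9], [14.1, 15.9, 18.2])

ranking = rank_models({"mean_baseline": fit_mean, "linear": fit_line}, train, test)
for name, score in ranking:
    print(f"{name}: test R^2 = {score:.3f}")
```

Scoring on a held-out split rather than on the training data is what makes the ranking a form of test set validation; a test-set R² below zero indicates a candidate that predicts worse than the test-set mean.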
Pages: 410 - 418
Page count: 8
Related papers
50 in total
  • [31] Statistical Evaluation of Machine Learning for Vibration Data
    Myren, Samuel
    Parikh, Nidhi
    Flynn, Garrison
    Higdon, Dave
    Casleton, Emily
    DATA SCIENCE IN ENGINEERING, VOL. 10, IMAC 2024, 2025, : 7 - 18
  • [32] Chemical SuperLearner (ChemSL)- An automated machine learning framework for building physical and chemical properties model
    Mohan, Balaji
    Chang, Junseok
    CHEMICAL ENGINEERING SCIENCE, 2024, 294
  • [33] Data Science and Machine Learning: Mathematical and Statistical Methods (vol 49, pg 2094, 2020)
    Rockloev, Joacim
    Gayle, Albert A.
    INTERNATIONAL JOURNAL OF EPIDEMIOLOGY, 2020, 49 (06) : 2096 - 2096
  • [34] Building automated survey coders via interactive machine learning
    Moreo, Alejandro
    Esuli, Andrea
    Sebastiani, Fabrizio
    INTERNATIONAL JOURNAL OF MARKET RESEARCH, 2019, 61 (04) : 408 - 429
  • [35] Machine learning for data mining, data science and data analytics
    Radhakrishna, Vangipuram
    Reddy, Gali Suresh
    Kumar, Gunupudi Rajesh
    Rao, Dammavalam Srinivasa
    Recent Advances in Computer Science and Communications, 2021, 14 (05) : 1356 - 1357
  • [36] Encoding dissimilarity data for statistical model building
    Wahba, Grace
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2010, 140 (12) : 3580 - 3596
  • [37] Effective Data Science Leadership Based on Text Mining and Machine Learning Model
    Sun, Yuandong
    Zhao, Xinyue
    APPLICATIONS OF DECISION SCIENCE IN MANAGEMENT, ICDSM 2022, 2023, 260 : 181 - 193
  • [38] Machine Learning: Deepest Learning as Statistical Data Assimilation Problems
    Abarbanel, Henry D., I
    Rozdeba, Paul J.
    Shirman, Sasha
    NEURAL COMPUTATION, 2018, 30 (08) : 2025 - 2055
  • [39] Automated Machine Learning Tool for Mechanical- and Plant Engineering
    Kalla, Horst
    ATP MAGAZINE, 2019, (6-7): : 28 - 29
  • [40] Nursing Orientation to Data Science and Machine Learning
    O'Brien, Roxanne L.
    O'Brien, Matt W.
    AMERICAN JOURNAL OF NURSING, 2021, 121 (04) : 32 - 39