Automated machine learning tool: The first stop for data science and statistical model building

Cited by: 0
Authors
Gopagoni D. [1]
Lakshmi P.V. [1]
Affiliation
[1] Department of Computer Science and Engineering, GIT GITAM (Deemed to be University), Vishakhapatnam, Andhra Pradesh
Keywords
Artificial neural networks; Automated machine learning; Drug design; K-means clustering; Market analysis; Naive Bayes classification; QSAR; QSPR; R program; Regression models; Shiny web app; Supervised learning; Support vector machines
DOI
10.14569/ijacsa.2020.0110253
Abstract
Machine learning techniques are designed to derive knowledge from existing data. Increased computational power and the use of natural language processing and image processing methods have made it easy to create rich data sets. Good domain knowledge is required to build useful models. Uncertainty remains around choosing the right sample data, reducing variables, and selecting a statistical algorithm. A suitable statistical method coupled with explanatory variables is critical for model building and analysis. There are multiple choices for each of these parameters. An automated system that helps scientists select an appropriate data set together with a learning algorithm would be very useful. A freely available web-based platform, named automated machine learning tool (AMLT), is developed in this study. AMLT automates the entire model building process. AMLT is equipped with the most commonly used variable selection methods and statistical methods for both supervised and unsupervised learning. AMLT can also perform clustering. AMLT uses statistical measures such as R² to rank the models and performs automatic test set validation. The tool is validated for connectivity and capability by reproducing two published works. © Science and Information Organization.
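The abstract's idea of ranking candidate models by R² on a held-out test set can be illustrated with a minimal sketch. This is not AMLT's code (the record indicates the tool is built with R and Shiny); the function names, the synthetic data, and the choice of one-variable linear models as candidates are all assumptions made for illustration.

```python
import random


def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def rank_models(columns, target, test_fraction=0.25, seed=0):
    """Fit one linear model per candidate variable on a training split,
    score each on the held-out test split, and sort by test-set R²."""
    rng = random.Random(seed)
    indices = list(range(len(target)))
    rng.shuffle(indices)
    n_test = max(2, int(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    results = []
    for name, values in columns.items():
        a, b = fit_simple_linear(
            [values[i] for i in train_idx],
            [target[i] for i in train_idx],
        )
        preds = [a + b * values[i] for i in test_idx]
        results.append((name, r_squared([target[i] for i in test_idx], preds)))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Synthetic data (hypothetical): the target depends on "signal" only.
data_rng = random.Random(42)
n_rows = 200
signal = [data_rng.uniform(0, 10) for _ in range(n_rows)]
noise = [data_rng.uniform(0, 10) for _ in range(n_rows)]
target = [3.0 + 2.0 * s + data_rng.gauss(0, 1.0) for s in signal]

ranking = rank_models({"signal": signal, "noise": noise}, target)
for name, score in ranking:
    print(f"{name}: test R^2 = {score:.3f}")
```

Scoring on data the model was not fitted to is what keeps the ranking from simply rewarding overfitting; a fuller pipeline would also compare different algorithms and variable subsets, as the abstract describes.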
Pages: 410-418
Number of pages: 8