Parallel Random Forest with IPython Cluster

Cited by: 0
Authors
Limprasert, Wasit [1]
Affiliation
[1] Thammasat Univ, Dept Comp Sci, Fac Sci & Technol, Pathum Thani, Thailand
Keywords
Parallel Algorithm; Random Forest; IPython; Classification
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent research studies require analytic tools capable of interpreting patterns and finding hidden knowledge in huge amounts of data. Random Forest, an ensemble-tree classifier based on the bagging method, is one of many well-known classifiers for finding hidden models in data. The classifier has been applied to recognize various kinds of data, e.g. human pose from depth images, plankton images, and time-series patterns. In this paper, an optimized parallel Random Forest has been designed and implemented on IPython, an interactive Python environment with parallelization functionality that is convenient to deploy on most computing platforms. The implementation shows 80% CPU utilization when training on 10^7 samples in 12 hours on an EC2 cluster with 32 cores. This demonstrates the implementation's capability to analyze large amounts of data.
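The bagging idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: each "tree" is reduced to a one-split decision stump, a standard-library thread pool stands in for the IPython cluster engines, and the names `train_stump`, `train_forest`, and `predict` are hypothetical. The point it shows is that every tree is trained on its own bootstrap sample, so the per-tree tasks are independent and can be mapped over workers.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def train_stump(task):
    """Fit a one-split tree (decision stump) on a bootstrap sample."""
    data, seed = task
    rng = random.Random(seed)
    # Bagging: draw len(data) rows with replacement.
    sample = [rng.choice(data) for _ in data]
    best = None
    for feature in range(len(sample[0][0])):
        for row, _ in sample:
            threshold = row[feature]
            left = [y for x, y in sample if x[feature] <= threshold]
            right = [y for x, y in sample if x[feature] > threshold]
            # Score a split by how many rows the majority labels cover.
            score = sum(
                Counter(side).most_common(1)[0][1]
                for side in (left, right)
                if side
            )
            if best is None or score > best[0]:
                left_label = Counter(left).most_common(1)[0][0]
                right_label = (
                    Counter(right).most_common(1)[0][0] if right else left_label
                )
                best = (score, feature, threshold, left_label, right_label)
    return best[1:]


def train_forest(data, n_trees=25, workers=4):
    """Train independent stumps in parallel, one bootstrap seed per tree."""
    tasks = [(data, seed) for seed in range(n_trees)]
    # Assumption: a thread pool replaces the cluster's parallel map here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(train_stump, tasks))


def predict(forest, x):
    """Classify x by majority vote over all stumps."""
    votes = Counter(
        left if x[feature] <= threshold else right
        for feature, threshold, left, right in forest
    )
    return votes.most_common(1)[0][0]


# Toy data (made up for illustration): two well-separated classes.
data = [((i / 10, 0.0), 0) for i in range(10)]
data += [((2 + i / 10, 1.0), 1) for i in range(10)]
forest = train_forest(data)
```

Calling `predict(forest, (0.0, 0.0))` returns the majority vote of the 25 stumps for that point. Because the tasks share no state, the same task list could in principle be handed to any parallel map, including one backed by remote engines.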
Pages: 62-67
Page count: 6
Related papers
50 in total
  • [41] Infrared Image Super-Resolution with Parallel Random Forest
    Xiaomin Yang
    Wei Wu
    Binyu Yan
    Huiqian Wang
    Kai Zhou
    Kai Liu
    International Journal of Parallel Programming, 2018, 46 : 838 - 858
  • [42] Optimization of parallel random forest algorithm based on distance weight
    Wang, Qinge
    Chen, Huihua
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 39 (02) : 1951 - 1963
  • [43] QUANTITY FORECAST OF ADMINISTRATIVE ITEMS BASED ON PARALLEL RANDOM FOREST
    Zhong, Linxia
    Wan, Wanggen
    Luo, Ziyue
    Zhang, Xiaodong
    4TH INTERNATIONAL CONFERENCE ON SMART AND SUSTAINABLE CITY (ICSSC 2017), 2017, : 51 - 55
  • [44] Infrared Image Super-Resolution with Parallel Random Forest
    Yang, Xiaomin
    Wu, Wei
    Yan, Binyu
    Wang, Huiqian
    Zhou, Kai
    Liu, Kai
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2018, 46 (05) : 838 - 858
  • [45] Wind Power Forecasting Using Parallel Random Forest Algorithm
    Natarajan, V. Anantha
    Kumari, N. Sandhya
    SOFT COMPUTING FOR PROBLEM SOLVING, SOCPROS 2018, VOL 1, 2020, 1048 : 209 - 224
  • [46] DistForest: A Parallel Random Forest Training Framework based on Supercomputer
    Wang, Chenxu
    Cai, Tingting
    Suo, Guang
    Lu, Yutong
    Zhou, Enqiang
    IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2018, : 196 - 204
  • [47] Quantity forecast of administrative items based on parallel random forest
    Zhong, Linxia
    Wan, Wanggen
    Luo, Ziyue
    Zhang, Xiaodong
    4th International Conference on Smart and Sustainable City, ICSSC 2017, 2017, 2018-January
  • [48] Study on the use of a combination of IPython Notebook and an industry-standard package in educating a CFD course
    Seddighi, Mehdi
    Allanson, David
    Rothwell, Glynn
    Takrouri, Khaled
    COMPUTER APPLICATIONS IN ENGINEERING EDUCATION, 2020, 28 (04) : 952 - 964
  • [49] Teaching Parallel Computing and Dependence Analysis with Python
    Watkinson, Neftali
    Shivam, Aniket
    Nicolau, Alexandru
    Veidenbaum, Alexander V.
    2019 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2019, : 320 - 325
  • [50] A Python upgrade to the GooFit package for parallel fitting
    Schreiner, Henry
    Pandey, Himadri
    Sokoloff, Michael D.
    Hittle, Bradley
    Tomko, Karen
    Hasse, Christoph
    23RD INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY AND NUCLEAR PHYSICS (CHEP 2018), 2019, 214