MapReduce Distributed Highly Random Fuzzy Forest for Noisy Big Data

Cited by: 0
Authors
Mustafic, Faruk [1 ]
Xiong, Ning [1 ]
Herrera, Francisco [2 ]
Gallego, Sergio Ramírez [2 ]
Affiliations
[1] Malardalen Univ, Hgsk Pl 1, S-72123 Vasteras, Sweden
[2] Univ Granada, Ave Hosp,S-N, Granada 18010, Spain
Funding
Swedish Research Council;
Keywords
random forest; fuzzy decision tree; highly random fuzzy forest; noisy Big Data; attribute noise; CLASSIFICATION;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The volume of available data keeps growing, and such data often contain noise; we call them noisy Big Data. There is an increasing need for learning methods that can handle noisy Big Data in classification tasks. In this paper we propose a highly random fuzzy forest algorithm for learning an ensemble of fuzzy decision trees from a big data set contaminated with attribute noise. We also present a distributed version of the proposed learning algorithm implemented in the MapReduce framework. Experimental results demonstrate that the proposed algorithm is faster and more accurate than the state-of-the-art approach, particularly in the presence of attribute noise.
Pages: 560 - 567
Page count: 8
Related papers
50 in total
  • [21] Fuzzy rule based classification systems for big data with MapReduce: granularity analysis
    Fernandez, Alberto
    del Rio, Sara
    Bawakid, Abdullah
    Herrera, Francisco
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2017, 11 (04) : 711 - 730
  • [22] Spatiotemporal data partitioning for distributed random forest algorithm: Air quality prediction using imbalanced big spatiotemporal data on spark distributed framework
    Asgari, Marjan
    Yang, Wanhong
    Farnaghi, Mahdi
    ENVIRONMENTAL TECHNOLOGY & INNOVATION, 2022, 27
  • [23] Probabilistic Random Forest: A Machine Learning Algorithm for Noisy Data Sets
    Reis, Itamar
    Baron, Dalya
    Shahaf, Sahar
    ASTRONOMICAL JOURNAL, 2019, 157 (01):
  • [24] Distributed Random Forest for Predicting Forest Wildfires Based on Weather Data
    Damasevicius, Robertas
    Maskeliunas, Rytis
    ADVANCED NETWORK TECHNOLOGIES AND INTELLIGENT COMPUTING, ANTIC 2023, PT II, 2024, 2091 : 305 - 320
  • [25] Random Sample Partition: A Distributed Data Model for Big Data Analysis
    Salloum, Salman
    Huang, Joshua Zhexue
    He, Yulin
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2019, 15 (11) : 5846 - 5854
  • [26] MapReduce: Simplified Data Analysis of Big Data
    Maitrey, Seema
    Jha, C. K.
    3RD INTERNATIONAL CONFERENCE ON RECENT TRENDS IN COMPUTING 2015 (ICRTC-2015), 2015, 57 : 563 - 571
  • [27] Distributed Pattern Matching and Document Analysis in Big Data using Hadoop MapReduce Model
    Ramya, A. V.
    Sivasankar, E.
    2014 INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING (PDGC), 2014, : 312 - 317
  • [28] Distributed SPARQL over Big RDF Data: A Comparative Analysis using Presto and MapReduce
    Mammo, Mulugeta
    Bansal, Srividya K.
    2015 IEEE INTERNATIONAL CONGRESS ON BIG DATA - BIGDATA CONGRESS 2015, 2015, : 33 - 40
  • [29] Hybrid Parallel Linguistic Fuzzy Rules with Canopy MapReduce for Big Data Classification in Cloud
    Vennila, V.
    Kannan, A. Rajiv
    INTERNATIONAL JOURNAL OF FUZZY SYSTEMS, 2019, 21 (03) : 809 - 822
  • [30] On the use of MapReduce to build Linguistic Fuzzy Rule Based Classification Systems for Big Data
    Lopez, Victoria
    del Rio, Sara
    Manuel Benitez, Jose
    Herrera, Francisco
    2014 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2014, : 1905 - 1912