Pre-Processing Methods of Data Mining

Cited by: 0
Authors
Saleem, Asma [1]
Asif, Khadim Hussain [1]
Ali, Ahmad [2]
Awan, Shahid Mahmood [3]
AlGhamdi, Mohammed A. [4]
Affiliations
[1] Univ Engn & Technol, Dept Comp Sci & Engn, Lahore, Pakistan
[2] COMSATS Inst Informat Technol, Dept Biosci, Sahiwal, Pakistan
[3] Univ Engn & Technol, Al Khawarizmi Inst Comp Sci, Lahore, Pakistan
[4] Umm Al Qura Univ, Inst Innovat & Entrepreneurship, Mecca, Saudi Arabia
Source
2014 IEEE/ACM 7TH INTERNATIONAL CONFERENCE ON UTILITY AND CLOUD COMPUTING (UCC) | 2014
Keywords
data pre-processing; data mining; outliers; missing values
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The generation, handling, and processing of data have emerged as the most reliable source of understanding and discovery of new facts, knowledge, and products in the natural and material sciences. Applying the most efficient techniques to statistical or bioinformatics problems has therefore become routine practice in research and industry. In practice, large datasets are likely to contain inconsistencies and anomalies of all kinds, which obscure the real outcomes of practical problems. For accurate data mining, computer-based data pre-processing techniques help the data being processed conform to normal structures, which in turn considerably improves the performance of machine learning algorithms. In this process, accurately identifying outliers and extreme values and filling gaps pose formidable challenges. Multiple methodologies have therefore been developed to detect these deviating or inconsistent values, called outliers. The data pre-processing techniques discussed in this paper could offer the most suitable solutions for handling missing values and outliers in all kinds of large datasets, such as electric load and weather datasets.
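This record does not state which specific methods the paper evaluates, so the following is only an illustrative sketch of the two tasks the abstract names: flagging outliers and filling gaps. It assumes the interquartile-range (IQR) rule for outliers and linear interpolation for missing values; both are common choices, not necessarily the authors'. The function names, the threshold k = 1.5, and the sample load readings are hypothetical.

```python
from statistics import quantiles


def iqr_outlier_mask(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]; None entries are never flagged."""
    observed = [v for v in values if v is not None]
    q1, _, q3 = quantiles(observed, n=4)  # quartiles of the observed values
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v is not None and not (low <= v <= high) for v in values]


def fill_gaps_linear(values):
    """Fill None entries by linear interpolation between the nearest observed neighbours.

    Gaps at either end take the nearest observed value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("cannot fill a series with no observed values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def clean_series(values, k=1.5):
    """Treat flagged outliers as missing, then fill every gap by interpolation."""
    mask = iqr_outlier_mask(values, k)
    masked = [None if flagged else v for v, flagged in zip(values, mask)]
    return fill_gaps_linear(masked)


# Hypothetical hourly load readings with one gap and one spike.
load = [100.0, 102.0, None, 106.0, 950.0, 110.0, 112.0, 111.0, 109.0, 107.0]
cleaned = clean_series(load)
```

Replacing outliers with interpolated values is one design choice among several; depending on the dataset it may be preferable to keep the flagged readings and only mark them for review.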
Pages: 451-456
Number of pages: 6