Pre-Processing Methods of Data Mining

Cited by: 0
Authors
Saleem, Asma [1 ]
Asif, Khadim Hussain [1 ]
Ali, Ahmad [2 ]
Awan, Shahid Mahmood [3 ]
AlGhamdi, Mohammed A. [4 ]
Affiliations
[1] Univ Engn & Technol, Dept Comp Sci & Engn, Lahore, Pakistan
[2] COMSATS Inst Informat Technol, Dept Biosci, Sahiwal, Pakistan
[3] Univ Engn & Technol, Al Khawarizmi Inst Comp Sci, Lahore, Pakistan
[4] Umm Al Qura Univ, Inst Innovat & Entrepreneurship, Mecca, Saudi Arabia
Source
2014 IEEE/ACM 7TH INTERNATIONAL CONFERENCE ON UTILITY AND CLOUD COMPUTING (UCC) | 2014
Keywords
data pre-processing; data mining; outliers; missing values;
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Data generation, handling, and processing have emerged as the most reliable sources for understanding and discovering new facts, knowledge, and products in the natural and material sciences. The use of the most efficient techniques in statistical or bioinformatics settings has therefore become routine practice in research and industry. In practice, large datasets are likely to contain inconsistencies and anomalies of all kinds that obscure the real outcomes of practical problems. For accurate data mining, computer-based data pre-processing techniques offer solutions that help the data conform to normal structures, which in turn considerably improves the performance of machine learning algorithms. In this process, accurately determining outliers and extreme values and filling gaps pose formidable challenges. Multiple methodologies have therefore been developed to detect these deviating or inconsistent values, called outliers. The data pre-processing techniques discussed in this paper could offer suitable solutions for handling missing values and outliers in large datasets of many kinds, such as electric load and weather datasets.
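The abstract does not name the specific techniques the paper evaluates, so the following is only a minimal illustrative sketch of the two tasks it mentions: detecting outliers and filling missing values. It assumes an interquartile-range (IQR) rule for outliers and linear interpolation for gaps, two common choices that are not confirmed as the authors' methods. The function names, the `k=1.5` threshold, and the hourly load readings are all hypothetical.

```python
import math
import statistics


def iqr_outlier_mask(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]. NaN entries are never flagged."""
    present = [v for v in values if not math.isnan(v)]
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(not math.isnan(v)) and (v < low or v > high) for v in values]


def fill_gaps_linear(values):
    """Replace NaN entries by linear interpolation between the nearest known
    neighbours; leading or trailing gaps copy the nearest known value."""
    known = [i for i, v in enumerate(values) if not math.isnan(v)]
    if not known:
        raise ValueError("cannot fill a series with no known values")
    filled = list(values)
    for i, v in enumerate(values):
        if not math.isnan(v):
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def preprocess(values, k=1.5):
    """Treat flagged outliers as missing, then fill every gap."""
    mask = iqr_outlier_mask(values, k)
    cleaned = [math.nan if flagged else v for v, flagged in zip(values, mask)]
    return fill_gaps_linear(cleaned), mask


# Hypothetical hourly electric load readings: one gap (NaN) and one spike.
hourly_load = [310.0, 305.0, math.nan, 298.0, 950.0, 301.0, 307.0, 303.0]
filled_load, outlier_mask = preprocess(hourly_load)
print(outlier_mask)
print(filled_load)
```

Treating a flagged outlier as a missing value and then interpolating is one design choice among several; capping the value at the IQR bound or dropping the record are alternatives, and which is appropriate depends on the dataset.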
Pages: 451-456
Page count: 6
Related papers
50 in total
  • [21] A survey on pre-processing techniques: Relevant issues in the context of environmental data mining
    Gibert, Karina
    Sanchez-Marre, Miquel
    Izquierdo, Joaquin
    AI COMMUNICATIONS, 2016, 29 (06) : 627 - 663
  • [22] Efficient Management of Web Data by Applying Web Mining Pre-processing Methodologies
    Kaur, Jaswinder
    Garg, Kanwal
    SOFTWARE ENGINEERING (CSI 2015), 2019, 731 : 115 - 122
  • [23] On the existence and significance of data pre-processing biases in web-usage mining
    Zheng, ZQ
    Padmanabhan, B
    Kimbrough, SO
    INFORMS JOURNAL ON COMPUTING, 2003, 15 (02) : 148 - 170
  • [24] Data mining algorithm for pre-processing biopharmaceutical drug product manufacturing records
    Casola, Gioele
    Siegmund, Christian
    Mattern, Markus
    Sugiyama, Hirokazu
    COMPUTERS & CHEMICAL ENGINEERING, 2019, 124 : 253 - 269
  • [25] On Pre-processing Algorithms for Data Stream
    Duda, Piotr
    Jaworski, Maciej
    Pietruczuk, Lena
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, PT II, 2012, 7268 : 56 - 63
  • [26] Kurtosis removal for data pre-processing
    Loperfido, Nicola
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2023, 17 (01) : 239 - 267
  • [27] Intelligent assistance for data pre-processing
    Bilalli, Besim
    Abello, Alberto
    Aluja-Banet, Tomas
    Wrembel, Robert
    COMPUTER STANDARDS & INTERFACES, 2018, 57 : 101 - 109
  • [28] A NEW METHOD FOR DATA PRE-PROCESSING
    RAISINGHANI, SC
    BILIMORIA, KD
    JOURNAL OF GUIDANCE CONTROL AND DYNAMICS, 1984, 7 (02) : 255 - 256
  • [30] Pre-processing VDIF Data in FPGA
    Gan, Jiangying
    Xu, Zhijun
    2018 PROGRESS IN ELECTROMAGNETICS RESEARCH SYMPOSIUM (PIERS-TOYAMA), 2018, : 723 - 728