Pre-Processing Methods of Data Mining

Cited by: 0
Authors
Saleem, Asma [1]
Asif, Khadim Hussain [1]
Ali, Ahmad [2]
Awan, Shahid Mahmood [3]
AlGhamdi, Mohammed A. [4]
Affiliations
[1] Univ Engn & Technol, Dept Comp Sci & Engn, Lahore, Pakistan
[2] COMSATS Inst Informat Technol, Dept Biosci, Sahiwal, Pakistan
[3] Univ Engn & Technol, Al Khawarizmi Inst Comp Sci, Lahore, Pakistan
[4] Umm Al Qura Univ, Inst Innovat & Entrepreneurship, Mecca, Saudi Arabia
Source
2014 IEEE/ACM 7TH INTERNATIONAL CONFERENCE ON UTILITY AND CLOUD COMPUTING (UCC) | 2014
Keywords
data pre-processing; data mining; outliers; missing values
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Generating, handling and processing data has become one of the most reliable routes to understanding and discovering new facts, knowledge and products in the natural and material sciences. Applying efficient techniques in statistical and bioinformatics settings has therefore become routine practice in research and industry. In practice, large datasets are likely to contain inconsistencies and anomalies of all kinds that obscure the real outcomes for practical problems. For accurate data mining, computer-based data pre-processing techniques help the data conform to normal structures, which in turn considerably improves the performance of machine learning algorithms. In this process, accurately identifying outliers and extreme values and filling gaps pose formidable challenges. Multiple methodologies have therefore been developed to detect these deviating or inconsistent values, called outliers. The data pre-processing techniques discussed in this paper could offer suitable solutions for handling missing values and outliers in all kinds of large datasets, such as electric load and weather datasets.
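As a rough illustration of the two tasks named in the abstract (detecting outliers and filling gaps), the following Python sketch flags outliers with the interquartile-range rule and fills missing readings by linear interpolation. The abstract does not say which methods the paper uses, so both techniques are assumptions chosen for illustration; the function names, the multiplier k = 1.5 and the hourly load readings are hypothetical.

```python
from statistics import quantiles


def iqr_outlier_mask(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]; None entries are never flagged."""
    observed = [v for v in values if v is not None]
    q1, _, q3 = quantiles(observed, n=4)  # quartiles of the observed readings
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v is not None and not (low <= v <= high) for v in values]


def fill_gaps_linear(values):
    """Fill None entries by linear interpolation between the nearest observed
    neighbours; gaps at either edge copy the nearest observed value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no observed values to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def clean_series(values, k=1.5):
    """Treat flagged outliers as missing, then fill every gap."""
    mask = iqr_outlier_mask(values, k)
    gapped = [None if flagged else v for v, flagged in zip(values, mask)]
    return fill_gaps_linear(gapped)


# Hypothetical hourly electric-load readings: one gap and one implausible spike.
hourly_load = [410.0, 415.0, None, 425.0, 4300.0, 435.0, 440.0, 445.0]
cleaned = clean_series(hourly_load)
```

Treating a flagged outlier as a missing value lets a single gap-filling step repair both problems. A real load or weather series would likely need a method that also accounts for seasonality.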
Pages: 451-456
Number of pages: 6
Related papers
50 in total
  • [41] Pre-processing methods for handwritten arabic documents
    Farooq, F
    Govindaraju, V
    Perrone, M
    EIGHTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS 1 AND 2, PROCEEDINGS, 2005, : 267 - 271
  • [42] Advances in Data Pre-Processing Methods for Distributed Fiber Optic Strain Sensing
    Richter, Bertram
    Ulbrich, Lisa
    Herbers, Max
    Marx, Steffen
    SENSORS, 2024, 24 (23)
  • [43] Evaluation of pre-processing methods for the prediction of cattle behaviour from accelerometer data
    Riaboff, L.
    Aubin, S.
    Bedere, N.
    Couvreur, S.
    Madouasse, A.
    Goumand, E.
    Chauvin, A.
    Plantier, G.
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2019, 165
  • [44] Effective Pre-processing Methods with DTG Big Data by Using MapReduce Techniques
    Cho, Wonhee
    Choi, Eunmi
    ADVANCES IN COMPUTER SCIENCE AND UBIQUITOUS COMPUTING, 2017, 421 : 389 - 395
  • [45] Optimisation of mobile intelligent terminal data pre-processing methods for crowd sensing
    Huang, Min
    Zeng, Yuefan
    Chen, Lina
    Sun, Bo
    CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2018, 3 (02) : 101 - 113
  • [46] METHODS OF PRE-PROCESSING TEXT DATA IN THE TASK OF ANALYZING THE EMOTIONAL STATE OF USERS
    Savenkov, Pavel Anatolyevich
    Voloshko, Anna Gennadievna
    Ivutin, Alexey Nikolaevich
    PROCEEDINGS OF THE TULA STATES UNIVERSITY-SCIENCES OF EARTH, 2024, 2
  • [47] Improved Dequantization and Normalization Methods for Tabular Data Pre-Processing in Smart Buildings
    Das, Hari Prasanna
    Spanos, Costas J.
    PROCEEDINGS OF THE 2022 THE 9TH ACM INTERNATIONAL CONFERENCE ON SYSTEMS FOR ENERGY-EFFICIENT BUILDINGS, CITIES, AND TRANSPORTATION, BUILDSYS 2022, 2022, : 168 - 177
  • [48] Interdependencies in data pre-processing, training methods and neural network topology generation
    Rudolph, S
    Brückner, S
    APPLICATIONS AND SCIENCE OF COMPUTATIONAL INTELLIGENCE V, 2002, 4739 : 98 - 107
  • [49] Influence of Pre-Processing of Data Within the Performance of the Profiling Models of Clients Developed with Tools of Data Mining
    Fuentes Alarcon, Paulo Cesar
    Rojas Martinez, Sandra Liliana
    2016 IEEE 11TH COLOMBIAN COMPUTING CONFERENCE (CCC), 2016,
  • [50] Online calibration and pre-processing of TAMA data
    Tatsumi, D
    Tsunesada, Y
    CLASSICAL AND QUANTUM GRAVITY, 2004, 21 (05) : S451 - S456