Detection of outliers in high-dimensional data using nu-support vector regression

Cited by: 10
Authors
Mohammed Rashid, Abdullah [1]
Midi, Habshah [1, 2]
Dhhan, Waleed [3, 4]
Arasan, Jayanthi [2]
Affiliations
[1] Univ Putra Malaysia, Inst Math Res, Serdang 43400, Malaysia
[2] Univ Putra Malaysia, Dept Math, Fac Sci, Serdang, Malaysia
[3] Nawroz Univ NZU, Ctr Sci Res, Duhok, Iraq
[4] Babylon Governorate, Babylon Housing Dept, Babylon, Iraq
Keywords
High-dimensional data; outliers; robustness; statistical learning theory; support vector regression; high leverage points; identification
DOI
10.1080/02664763.2021.1911965
Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Support vector regression (SVR) is gaining popularity for outlier detection and classification in high-dimensional data (HDD) because the technique does not require the data to be of full rank. In real applications, most data are high-dimensional. Classification of high-dimensional data is needed in the applied sciences in particular, because it is important to discriminate cancerous cells from non-cancerous cells. It is also imperative to identify outliers before constructing a model of the relationship between the dependent and independent variables, to avoid misleading interpretations of the model fit. The standard SVR and the mu-epsilon-SVR are able to detect outliers; however, they are computationally expensive. The fixed parameters support vector regression (FP-epsilon-SVR) was put forward to remedy this issue. However, the FP-epsilon-SVR, which uses epsilon-SVR, is not very successful in identifying outliers. In this article, we propose an alternative method for detecting outliers that employs nu-SVR. The merit of the proposed method is confirmed by three real examples and a Monte Carlo simulation. The results show that the proposed nu-SVR method is very successful in identifying outliers under a variety of situations, and with less computational running time.
Pages: 2550-2569
Number of pages: 20
Related papers
50 in total
  • [1] An Incremental Dual nu-Support Vector Regression Algorithm
    Yu, Hang
    Lu, Jie
    Zhang, Guangquan
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2018, PT II, 2018, 10938 : 520 - 531
  • [2] An Efficient Estimation and Classification Methods for High Dimensional Data Using Robust Iteratively Reweighted SIMPLS Algorithm Based on nu-Support Vector Regression
    Rashid, Abdullah Mohammed
    Midi, Habshah
    Slwabi, Waleed Dhhan
    Arasan, Jayanthi
    [J]. IEEE ACCESS, 2021, 9 : 45955 - 45967
  • [3] Multiple outliers detection in sparse high-dimensional regression
    Wang, Tao
    Li, Qun
    Chen, Bin
    Li, Zhonghua
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2018, 88 (01) : 89 - 107
  • [4] Grid resource prediction approach based on Nu-Support Vector Regression
    Che, Xi-Long
    Hu, Liang
    [J]. PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008 : 778 - +
  • [5] Applicability of a Nu-Support Vector Regression Model for the Completion of Missing Data in Hydrological Time Series
    Langhammer, Jakub
    Cesak, Julius
    [J]. WATER, 2016, 8 (12)
  • [6] Parameter selection in time series prediction based on nu-support vector regression
    胡亮
    [J]. High Technology Letters, 2009, 15 (04) : 337 - 342
  • [7] IMPROVED nu-SUPPORT VECTOR REGRESSION ALGORITHM BASED ON THE PRINCIPAL COMPONENT ANALYSIS
    Rashid, Abdullah Mohammed
    Midi, Habshah
    [J]. ECONOMIC COMPUTATION AND ECONOMIC CYBERNETICS STUDIES AND RESEARCH, 2023, 57 (02) : 41 - 56
  • [8] An Enhanced MEMS Error Modeling Approach Based on Nu-Support Vector Regression
    Bhatt, Deepak
    Aggarwal, Priyanka
    Bhattacharya, Prabir
    Devabhaktuni, Vijay
    [J]. SENSORS, 2012, 12 (07) : 9448 - 9466
  • [9] A Nu-support vector regression based system for grid resource monitoring and prediction
    Hu, Liang
    Che, Xi-Long
    [J]. Zidonghua Xuebao/ Acta Automatica Sinica, 2010, 36 (01) : 139 - 146
  • [10] Cluster PCA for outliers detection in high-dimensional data
    Stefatos, George
    Ben Hamza, A.
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, VOLS 1-8, 2007 : 3961 - 3966