Analysis of Feature Selection and Extraction Algorithm for Loan Data: A Big Data Approach

Authors
Attigeri, Girija [1]
Pai, Manohara M. M. [1]
Pai, Radhika M. [1]
Affiliations
[1] Manipal Univ, Dept Informat & Commun Technol, Manipal Inst Technol, Manipal, Karnataka, India
Keywords
Classification; Financial big data; Feature selection and extraction; Support Vector Machine; Logistic regression
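DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract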
Fraudulent activities in financial institutions can damage a country's economic system. These activities can be identified using clustering and classification algorithms, whose effectiveness depends on the quality of the input data. Financial data comes from various sources and in various forms, such as financial statements, stakeholders' activities and others. Taken together, this data is vast, unstructured big data. Hence, parallel distributed pre-processing is important for improving the quality of the data. The objective of this work is dimensionality reduction, using a feature selection and extraction algorithm, for a large volume of financial data. In this paper an attempt is made to understand the implications of a feature extraction and transformation algorithm, Principal Feature Analysis, for financial data. The effect of the reduced dimension is studied on various classification algorithms for financial loan data. A parallel and distributed implementation is carried out on the IBM Bluemix cloud platform with a Spark notebook. The results show that reducing the number of features significantly improves execution time without compromising accuracy.
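The feature reduction step named in the abstract can be illustrated with a minimal single-machine sketch of one common formulation of Principal Feature Analysis: compute the leading principal axes, cluster the per-feature loading vectors, and keep the feature closest to each cluster centre. This is not the authors' Spark implementation. The function name `principal_feature_analysis`, its parameters, the column standardization, and the farthest-first k-means initialisation are all assumptions made for illustration.

```python
import numpy as np


def principal_feature_analysis(X, n_components, n_features, n_iter=50):
    """Return sorted indices of the original features kept (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    if not 1 <= n_features <= X.shape[1]:
        raise ValueError("n_features must be between 1 and the number of columns")

    # Assumption: standardize columns so large-scale features do not dominate.
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Z = (X - X.mean(axis=0)) / std

    # Rows of vt are the principal axes; keep the leading n_components.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    A = vt[:n_components].T  # shape (d, q): one loading vector per feature

    # Deterministic farthest-first initialisation of the cluster centres.
    chosen = [int(np.argmax(np.linalg.norm(A, axis=1)))]
    while len(chosen) < n_features:
        gap = np.linalg.norm(A[:, None, :] - A[chosen][None, :, :], axis=2)
        chosen.append(int(np.argmax(gap.min(axis=1))))
    centers = A[chosen].copy()

    # Plain k-means on the loading vectors.
    for _ in range(n_iter):
        dists = np.linalg.norm(A[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_features):
            if np.any(labels == k):
                centers[k] = A[labels == k].mean(axis=0)

    # Final assignment, then keep the feature nearest each cluster centre.
    dists = np.linalg.norm(A[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    selected = []
    for k in range(n_features):
        members = np.flatnonzero(labels == k)
        if members.size:
            selected.append(int(members[dists[members, k].argmin()]))
    return sorted(selected)
```

Unlike a PCA projection, the output is a list of original column indices, so a downstream classifier such as logistic regression or an SVM is trained on `X[:, selected]` and the retained features stay interpretable.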
Pages: 2147-2151
Page count: 5