Textual data dimensionality reduction - a deep learning approach

Cited by: 5
Authors
Kushwaha, Neetu [1]
Pant, Millie [1]
Affiliations
[1] Indian Inst Technol Roorkee, Dept ASE, Roorkee 247667, Uttar Pradesh, India
Keywords
Autoencoder; Clustering; Feature extraction; Dimensionality reduction; INTEGRATING FEATURE-SELECTION; BPSO;
DOI
10.1007/s11042-018-6900-x
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The growth of the Internet has produced a high volume of natural language textual data. Such data can be sparse and may contain uninformative features that increase its dimensionality. This high dimensionality, in turn, decreases the efficiency of text mining tasks such as clustering. Transforming high-dimensional data into a lower-dimensional representation is therefore an important pre-processing step before clustering. In this paper, a dimensionality reduction method based on a deep autoencoder neural network, named DRDAE, is proposed to provide optimized and robust features for text clustering. DRDAE selects a less correlated and salient feature space from the high-dimensional feature space. To evaluate the proposed algorithm, k-means is used to cluster text documents. The proposed method is tested on five benchmark text datasets. Simulation results demonstrate that the proposed algorithm clearly outperforms other conventional dimensionality reduction methods in the literature in terms of the RI measure.
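The abstract reports clustering quality in terms of an RI measure. Assuming RI stands for the Rand index (the usual reading in clustering evaluation; the record itself does not expand the abbreviation), the following minimal Python sketch shows how such a score could be computed from ground-truth classes and cluster assignments. The function name `rand_index` and the sample labels are illustrative and are not taken from the paper.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which two labelings agree.

    A pair counts as an agreement when both labelings place the two
    items together, or both place them apart.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        raise ValueError("at least two items are needed to form a pair")

    agreements = 0
    total_pairs = 0
    for i, j in combinations(range(n), 2):
        same_in_true = labels_true[i] == labels_true[j]
        same_in_pred = labels_pred[i] == labels_pred[j]
        if same_in_true == same_in_pred:
            agreements += 1
        total_pairs += 1
    return agreements / total_pairs


# Hypothetical example: four documents, two true classes.
truth = [0, 0, 1, 1]
clusters = [1, 1, 0, 0]  # same partition, cluster ids swapped
score = rand_index(truth, clusters)
```

Because the score depends only on which items are grouped together, relabeling the clusters does not change it. This matters for k-means, whose cluster ids are arbitrary.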
Pages: 11039-11050
Number of pages: 12
Related papers
50 in total
  • [1] Microblog Dimensionality Reduction-A Deep Learning Approach
    Xu, Lei
    Jiang, Chunxiao
    Ren, Yong
    Chen, Hsiao-Hwa
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (07) : 1779 - 1789
  • [2] Textual data dimensionality reduction - a deep learning approach
    Neetu Kushwaha
    Millie Pant
    [J]. Multimedia Tools and Applications, 2020, 79 : 11039 - 11050
  • [3] Dimensionality Reduction in Data Summarization Approach to Learning Relational Data
    Kheau, Chung Seng
    Alfred, Rayner
    Keng, Lau Hui
    [J]. INTELLIGENT INFORMATION AND DATABASE SYSTEMS (ACIIDS 2013), PT I,, 2013, 7802 : 166 - 175
  • [4] Data Imputation and Dimensionality Reduction Using Deep Learning in Industrial Data
    Zhou, Zhihong
    Mo, Jiao
    Shi, Yijie
    [J]. PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2017, : 2329 - 2333
  • [5] A manifold learning approach to dimensionality reduction for modeling data
    Turchetti, Claudio
    Falaschetti, Laura
    [J]. INFORMATION SCIENCES, 2019, 491 : 16 - 29
  • [6] An efficient approach for textual data classification using deep learning
    Alqahtani, Abdullah
    Khan, Habib Ullah
    Alsubai, Shtwai
    Sha, Mohemmed
    Almadhor, Ahmad
    Iqbal, Tayyab
    Abbas, Sidra
    [J]. FRONTIERS IN COMPUTATIONAL NEUROSCIENCE, 2022, 16
  • [7] Deep learning approach based on dimensionality reduction for designing electromagnetic nanostructures
    Yashar Kiarashinejad
    Sajjad Abdollahramezani
    Ali Adibi
    [J]. npj Computational Materials, 6
  • [8] Deep learning approach based on dimensionality reduction for designing electromagnetic nanostructures
    Kiarashinejad, Yashar
    Abdollahramezani, Sajjad
    Adibi, Ali
    [J]. NPJ COMPUTATIONAL MATERIALS, 2020, 6 (01)
  • [9] Automated English Speech Recognition Using Dimensionality Reduction with Deep Learning Approach
    Yu, Jing
    Ye, Nianhua
    Du, Xueqin
    Han, Lu
    [J]. WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022
  • [10] Dimensionality Reduction Approach for Genotypic Data
    Al-Husain, Luluah
    Hafez, Alaaeldin M.
    [J]. 2015 IEEE CONFERENCE ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY (CIBCB), 2015, : 202 - 206