Denoising Autoencoder as an Effective Dimensionality Reduction and Clustering of Text Data

Cited by: 12
Authors
Leyli-Abadi, Milad [1]
Labiod, Lazhar [1]
Nadif, Mohamed [1]
Affiliations
[1] Paris Descartes Univ, LIPADE, F-75006 Paris, France
Keywords
Auto-encoder; Deep learning; Cosine similarity; Neighborhood; Document clustering; Unsupervised learning; Dimensionality reduction; FRAMEWORK
DOI
10.1007/978-3-319-57529-2_62
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep learning methods are widely used in vision and face recognition; however, there is a real lack of application of such methods in the field of text data. In this context, the data is often represented by a sparse, high-dimensional document-term matrix. To deal with such data matrices, we present in this paper a new denoising auto-encoder for dimensionality reduction, in which each document is affected not only by its own information but also by the information from its neighbors according to the cosine similarity measure. It turns out that the proposed auto-encoder can discover low-dimensional embeddings and, as a result, reveal the underlying effective manifold structure. The visual representation of these embeddings suggests the suitability of clustering the set of documents with the Expectation-Maximization algorithm for Gaussian mixture models. On real-world datasets, the relevance of the presented auto-encoder for visualisation and document clustering is shown by a comparison with five widely used unsupervised dimensionality reduction methods, including the classic auto-encoder.
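The neighborhood idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper name `neighbor_weighted_rows`, the choice `k=1`, the weighting scheme (own weight 1, neighbor weight equal to its cosine similarity), and the toy matrix are all assumptions made for illustration. The sketch only shows how rows of a document-term matrix could be blended with their most cosine-similar neighbors; the auto-encoder training and the EM clustering step are omitted.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length term-count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_u * norm_v)


def neighbor_weighted_rows(doc_term, k=1):
    """Hypothetical helper: blend each row with its k most similar rows.

    Assumed weighting (not taken from the paper): the document itself has
    weight 1, each neighbor has weight equal to its cosine similarity, and
    the weights are normalized to sum to 1.
    """
    n = len(doc_term)
    blended = []
    for i in range(n):
        sims = [
            (cosine_similarity(doc_term[i], doc_term[j]), j)
            for j in range(n)
            if j != i
        ]
        # Highest similarity first; ties broken by row index.
        sims.sort(key=lambda pair: (-pair[0], pair[1]))
        top = [(s, j) for s, j in sims[:k] if s > 0.0]
        total = 1.0 + sum(s for s, _ in top)
        row = [
            (doc_term[i][t] + sum(s * doc_term[j][t] for s, j in top)) / total
            for t in range(len(doc_term[i]))
        ]
        blended.append(row)
    return blended


# Toy document-term matrix: rows are documents, columns are term counts.
docs = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 3],
]
targets = neighbor_weighted_rows(docs, k=1)
```

In this toy matrix the first two documents share terms only with each other, so the blended rows in `targets` keep zeros in the columns used by the other pair. Such blended rows could serve as inputs or reconstruction targets for an auto-encoder, though how the paper actually injects neighbor information is not specified in the abstract.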
Pages: 801-813
Page count: 13