Locally Embedding Autoencoders: A Semi-Supervised Manifold Learning Approach of Document Representation

Cited by: 13
Authors
Wei, Chao [1]
Luo, Senlin [1]
Ma, Xincheng [1]
Ren, Hao [1]
Zhang, Ji [1]
Pan, Limin [1]
Affiliations
[1] Beijing Inst Technol, Beijing 10081, Peoples R China
Source
PLOS ONE | 2016 / Vol. 11 / Issue 01
Keywords
NETWORK;
DOI
10.1371/journal.pone.0146672
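Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract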
Topic models and neural networks can discover meaningful low-dimensional latent representations of text corpora; as such, they have become a key technology of document representation. However, such models presume all documents are non-discriminatory, resulting in latent representation dependent upon all other documents and an inability to provide discriminative document representation. To address this problem, we propose a semi-supervised manifold-inspired autoencoder to extract meaningful latent representations of documents, taking the local perspective that the latent representation of nearby documents should be correlative. We first determine the discriminative neighbors set with Euclidean distance in observation spaces. Then, the autoencoder is trained by joint minimization of the Bernoulli cross-entropy error between input and output and the sum of the square error between neighbors of input and output. The results of two widely used corpora show that our method yields at least a 15% improvement in document clustering and a nearly 7% improvement in classification tasks compared to comparative methods. The evidence demonstrates that our method can readily capture more discriminative latent representation of new documents. Moreover, some meaningful combinations of words can be efficiently discovered by activating features that promote the comprehensibility of latent representation.
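As a rough illustration of the training objective described in the abstract, the following NumPy sketch combines a Bernoulli cross-entropy reconstruction term with a squared-error term between each document's reconstruction and its nearest neighbors in the observation space. It is a minimal sketch under stated assumptions, not the authors' implementation: the single hidden layer with tied weights, the weighting factor `lam`, the neighbor count `k`, and the function names are all hypothetical, and the label information used in the paper to select discriminative neighbors is omitted here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nearest_neighbors(X, k):
    """Indices of the k nearest rows (Euclidean distance) for each row of X.

    Assumption: neighbors are chosen purely by distance; the paper's
    semi-supervised (label-aware) selection is not modeled here.
    """
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # exclude each document from its own neighbor set
    return np.argsort(d2, axis=1)[:, :k]

def joint_loss(X, W, b, c, neighbors, lam):
    """Cross-entropy reconstruction loss plus a neighbor squared-error term.

    X: (n, d) inputs scaled to [0, 1]; W: (d, h) tied encoder/decoder weights
    (an assumption); b, c: hidden and output biases; lam: hypothetical weight
    on the neighbor term.
    """
    H = sigmoid(X @ W + b)        # latent representation
    X_hat = sigmoid(H @ W.T + c)  # reconstruction
    eps = 1e-12                   # guards against log(0)
    cross_entropy = -np.sum(
        X * np.log(X_hat + eps) + (1.0 - X) * np.log(1.0 - X_hat + eps)
    )
    # Squared error between each reconstruction and the input's neighbors
    neighbor_term = np.sum((X[neighbors] - X_hat[:, None, :]) ** 2)
    return cross_entropy + lam * neighbor_term
```

In practice the loss would be minimized with a gradient-based optimizer, and `lam` would trade reconstruction fidelity against how strongly nearby documents are pulled toward similar reconstructions.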
Pages: 20
Related papers
50 items in total
  • [41] Learning to Rank with Query-level Semi-supervised Autoencoders
    Xu, Bo
    Lin, Hongfei
    Lin, Yuan
    Xu, Kan
    CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017, : 2395 - 2398
  • [42] Semi-supervised learning based on smooth manifold integration
    Yang, G. (glyang@mail.ustc.edu.cn), 2012, Binary Information Press, P.O. Box 162, Bethel, CT 06801-0162, United States (08):
  • [43] Flexible Manifold Embedding: A Framework for Semi-Supervised and Unsupervised Dimension Reduction
    Nie, Feiping
    Xu, Dong
    Tsang, Ivor Wai-Hung
    Zhang, Changshui
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2010, 19 (07) : 1921 - 1932
  • [44] Manifold Coarse Graining for Online Semi-supervised Learning
    Farajtabar, Mehrdad
    Shaban, Amirreza
    Rabiee, Hamid Reza
    Rohban, Mohammad Hossein
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT I, 2011, 6911 : 391 - 406
  • [45] Robust Semi-Supervised Manifold Learning Algorithm for Classification
    Chen, Mingxia
    Wang, Jing
    Li, Xueqing
    Sun, Xiaolong
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2018, 2018
  • [46] Accelerated manifold embedding for multi-view semi-supervised classification
    Wang, Shiping
    Wang, Zhewen
    Guo, Wenzhong
    INFORMATION SCIENCES, 2021, 562 (562) : 438 - 451
  • [47] Semi-Supervised Clustering with Multiresolution Autoencoders
    Ienco, Dino
    Pensa, Ruggero G.
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,
  • [48] On semi-supervised multiple representation behavior learning
    Lu, Ruqian
    Hou, Shengluan
    JOURNAL OF COMPUTATIONAL SCIENCE, 2020, 46
  • [49] A semi-supervised learning approach for semantic parsing boosted by BERT word embedding
    Bu, Yanbin
    Chen, Ting
    Duan, Hongxiu
    Liu, Mei
    Xue, Yandan
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2024, 46 (03) : 6577 - 6588
  • [50] Graph Ensemble Networks for Semi-supervised Embedding Learning
    Tang, Hui
    Liang, Xun
    Wu, Bo
    Guan, Zhenyu
    Guo, Yuhui
    Zheng, Xiangping
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT I, 2021, 12815 : 408 - 420