Semantics-based topic inter-relationship extraction

Cited by: 12
Authors
Menon, Remya R. K. [1]
Joseph, Deepthy [1]
Kaimal, M. R. [1]
Affiliations
[1] Amrita Univ, Amritapuri Amrita Vishwa Vidyapeetham, Amrita Sch Engn, Dept Comp Sci & Engn, Coimbatore, Tamil Nadu, India
Keywords
LDA; LSA; Singular Value Decomposition (SVD); probabilistic model; vector space model;
DOI
10.3233/JIFS-169237
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Maintaining a large collection of documents is an important problem in many areas of science and industry. Analyses can be performed on a large document collection with ease only if a short or reduced description of it can be obtained. Topic modeling offers a promising solution: it is a method that learns hidden themes from a large set of unorganized documents. Different approaches are available for finding topics, such as Latent Dirichlet Allocation (LDA), neural networks, Latent Semantic Analysis (LSA), probabilistic LSA (pLSA), and probabilistic LDA (pLDA). In topic models, the inferred topics are based only on observed term occurrences. However, the terms may not be semantically related in a manner that is relevant to the topic. Understanding the semantics can yield improved topics for representing the documents. The objective of this paper is to develop a semantically oriented, probabilistic-model-based approach for generating topic representations from a document collection. From the modified topic model, we generate two matrices: a document-topic matrix and a term-topic matrix. The reduced document-term matrix derived from these two matrices has 85% similarity with the original document-term matrix; that is, the documents reconstructed from the two matrices are 85% similar to the original document collection. Also, a classifier applied to the document-topic matrix appended with the class label shows an 80% improvement in F-measure score. The paper also uses the perplexity metric to determine the number of topics for a test set.
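The two evaluation steps named in the abstract, reconstructing a document-term matrix from the two factor matrices and scoring a model by perplexity, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the document-topic matrix holds per-document topic proportions, that the term-topic matrix is stored transposed as topic-term probabilities, and that similarity is measured as mean row-wise cosine similarity (the abstract does not say which measure was used). All function names and the toy numbers are hypothetical.

```python
import math


def reconstruct(doc_topic, topic_term):
    """Multiply a document-topic matrix (D x K) by a topic-term matrix (K x V).

    Assumption: rows of both matrices are probability distributions, so each
    row of the result is a per-document distribution over terms.
    """
    n_topics = len(topic_term)
    n_terms = len(topic_term[0])
    return [
        [sum(row[k] * topic_term[k][w] for k in range(n_topics))
         for w in range(n_terms)]
        for row in doc_topic
    ]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_row_similarity(original, reconstructed):
    """Average cosine similarity between matching rows (assumed measure)."""
    sims = [cosine(o, r) for o, r in zip(original, reconstructed)]
    return sum(sims) / len(sims)


def perplexity(counts, doc_topic, topic_term):
    """exp(-log-likelihood / token count) of term counts under the model."""
    probs = reconstruct(doc_topic, topic_term)
    log_lik = 0.0
    n_tokens = 0
    for count_row, prob_row in zip(counts, probs):
        for c, p in zip(count_row, prob_row):
            if c > 0:
                log_lik += c * math.log(p)
                n_tokens += c
    return math.exp(-log_lik / n_tokens)


# Toy data (hypothetical): 2 documents, 2 topics, 4 terms.
counts = [[2, 1, 0, 0], [0, 0, 1, 3]]
doc_topic = [[0.9, 0.1], [0.1, 0.9]]
topic_term = [[0.6, 0.3, 0.05, 0.05], [0.05, 0.05, 0.3, 0.6]]

similarity = mean_row_similarity(counts, reconstruct(doc_topic, topic_term))
score = perplexity(counts, doc_topic, topic_term)
```

To pick the number of topics in the way the abstract suggests, one would fit a model for each candidate topic count, compute `perplexity` on held-out counts, and prefer the count with the lowest value.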
Pages: 2941-2951
Page count: 11