Semi supervised classification of scientific and technical literature based on semi supervised hierarchical description of improved latent dirichlet allocation (LDA)

Authors
Yongjun Zhang
Jialin Ma
Zijian Wang
Affiliations
[1] Hohai University, College of Computer and Information
[2] Huaiyin Institute of Technology, Faculty of Computer and Software Engineering
Source
Cluster Computing, 2019, Vol. 22
Keywords
Scientific literature; LDA; Domain ontology graph; Word disambiguation; Semi-supervised; Conceptual clustering
DOI
Not available
Abstract
This paper studies Chinese text classification using a domain ontology graph (DOG) built by semi-supervised conceptual clustering, addressing the problem that English word disambiguation methods cannot be applied to Chinese text classification. The structural model of the domain ontology graph is combined with text classification algorithms based on the HowNet dictionary, the KLSeeker ontology, and related resources to classify Chinese text accurately and to demonstrate the effectiveness of the algorithm. The classification model, based on conceptual clustering over the domain ontology graph, is designed to reduce human participation in ontology construction as much as possible. For the application domain of Chinese web text, the algorithm automatically generates a DOG that conceptualizes domain knowledge. A document ontology graph (DocOG) is also defined to represent the content of an individual text document. Ontology-based text classification is then performed by matching the DocOG extracted from a single document against the domain ontology. Finally, a worked example and experiments on a real test data set are presented. The results show that the proposed Chinese text classification method achieves higher classification accuracy, which supports the effectiveness of the design.
Pages: 6881-6889
Page count: 8
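The abstract describes classification as matching a document ontology graph (DocOG) against domain ontology graphs (DOGs) but does not give the matching function. The following is a minimal sketch of that idea, not the authors' method. It assumes each graph is a set of weighted, undirected concept-pair edges and scores a match by weighted edge overlap. The names `edge`, `match_score`, and `classify`, the overlap measure, and the toy data are all hypothetical.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]
Graph = Dict[Edge, float]  # undirected concept pair -> relation weight (assumed representation)


def edge(a: str, b: str) -> Edge:
    """Normalize an undirected edge so (a, b) and (b, a) map to the same key."""
    return (a, b) if a <= b else (b, a)


def match_score(doc_graph: Graph, domain_graph: Graph) -> float:
    """Weighted overlap: shared edge weight divided by total document edge weight.

    This measure is an assumption made for illustration only.
    """
    total = sum(doc_graph.values())
    if total == 0:
        return 0.0
    shared = sum(
        min(weight, domain_graph[e])
        for e, weight in doc_graph.items()
        if e in domain_graph
    )
    return shared / total


def classify(doc_graph: Graph, domain_graphs: Dict[str, Graph]) -> Tuple[str, Dict[str, float]]:
    """Return the best-matching domain label and the score for every domain."""
    scores = {label: match_score(doc_graph, g) for label, g in domain_graphs.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy data (hypothetical): two domain graphs and one document graph.
domain_graphs = {
    "finance": {edge("bank", "loan"): 1.0, edge("stock", "market"): 0.8},
    "sports": {edge("team", "match"): 1.0, edge("goal", "match"): 0.9},
}
doc_graph = {
    edge("loan", "bank"): 0.5,
    edge("market", "stock"): 0.5,
    edge("match", "team"): 0.2,
}

label, scores = classify(doc_graph, domain_graphs)
print(label, scores)
```

In this sketch the document is assigned to the domain whose graph covers the largest share of the document's edge weight. A real system would first build the graphs from text, for example with concept extraction and word disambiguation, which is outside the scope of this example.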