Research of text segmentation based on parallel genetic algorithm

Authors
Zhao, Yu [1]
Cai, Wandong [1]
Fan, Na [1]
Liu, Nian [2]
Affiliations
[1] College of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
[2] Library, Xi'an University of Architecture and Technology, Xi'an 710055, China
Keywords
Chinese information processing; Global optimal solutions; Internal cohesion; Latent semantics; Multi-objective optimization problem; Objective functions; Parallel genetic algorithms; Text segmentation
DOI
Not available
Abstract
To address the data sparseness of short texts, an algorithm that draws on knowledge from an external corpus is proposed to improve the accuracy of text segmentation. It proceeds in two steps: first, Gibbs sampling is used to estimate the LDA model of the corpus; second, the latent semantic structure of the text is inferred from that LDA model. Two objective functions, internal cohesion and external dissimilarity, are then defined to cast text segmentation as a multi-objective optimization problem. A parallel genetic algorithm driven by these objective functions is employed to obtain the globally optimal segmentation. In the experiments, the proposed algorithm achieves higher accuracy than the MDA and LDA-based methods under data sparseness.
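The abstract does not give the objective functions or the genetic operators, so the following is only a rough sketch under stated assumptions, not the authors' implementation. It assumes each sentence already has a topic distribution (in the paper these would come from the LDA model; here they are hand-made toy vectors), encodes a segmentation as a 0/1 boundary vector, and uses hypothetical definitions of cohesion and dissimilarity based on cosine similarity. The two objectives are combined by a plain sum, which is a simplification of the multi-objective formulation, and the "parallel" part is imitated by an island model whose islands are evolved one after another in a single process.

```python
import math
import random

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def segments(boundaries):
    """Turn a 0/1 boundary vector (one gene per gap) into lists of sentence indices."""
    segs, current = [], [0]
    for i, b in enumerate(boundaries, start=1):
        if b:  # a 1 at gap i-1 starts a new segment at sentence i
            segs.append(current)
            current = []
        current.append(i)
    segs.append(current)
    return segs

def centroid(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cohesion(boundaries, topics):
    """Assumed definition: mean similarity of each sentence to its segment centroid."""
    total = 0.0
    for seg in segments(boundaries):
        c = centroid([topics[i] for i in seg])
        total += sum(cosine(topics[i], c) for i in seg)
    return total / len(topics)

def dissimilarity(boundaries, topics):
    """Assumed definition: mean (1 - similarity) between adjacent segment centroids."""
    cents = [centroid([topics[i] for i in seg]) for seg in segments(boundaries)]
    if len(cents) < 2:
        return 0.0
    return sum(1 - cosine(a, b) for a, b in zip(cents, cents[1:])) / (len(cents) - 1)

def fitness(boundaries, topics):
    # Assumption: equal-weight sum instead of a true multi-objective treatment.
    return cohesion(boundaries, topics) + dissimilarity(boundaries, topics)

def evolve(pop, topics, rng, mutation_rate=0.1):
    """One generation: elitism, tournament selection, one-point crossover, bit flips."""
    score = lambda ind: fitness(ind, topics)
    nxt = [max(pop, key=score)[:]]  # keep the best individual unchanged
    while len(nxt) < len(pop):
        p1 = max(rng.sample(pop, 2), key=score)
        p2 = max(rng.sample(pop, 2), key=score)
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        nxt.append([1 - g if rng.random() < mutation_rate else g for g in child])
    return nxt

def island_ga(topics, islands=3, pop_size=20, generations=40, migrate_every=5, seed=0):
    """Island-model GA; assumes at least three sentences so crossover has a cut point."""
    rng = random.Random(seed)
    n_genes = len(topics) - 1
    score = lambda ind: fitness(ind, topics)
    pops = [[[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
            for _ in range(islands)]
    for gen in range(1, generations + 1):
        pops = [evolve(pop, topics, rng) for pop in pops]
        if gen % migrate_every == 0:
            # Ring migration: each island receives a copy of its neighbour's best.
            bests = [max(pop, key=score) for pop in pops]
            for k, pop in enumerate(pops):
                pop[rng.randrange(pop_size)] = bests[k - 1][:]
    return max((ind for pop in pops for ind in pop), key=score)

# Toy input: three sentences on one topic followed by three on another.
A, B = [0.9, 0.1], [0.1, 0.9]
topics = [A, A, A, B, B, B]
best = island_ga(topics)
print(best, round(fitness(best, topics), 3))
```

In a genuinely parallel run each island would live in its own process or on its own machine and exchange individuals only at migration time; the sequential loop here keeps the sketch deterministic and easy to read.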
Pages: 40-44