Optimization of hidden Markov model by a genetic algorithm for web information extraction

Cited by: 0
Authors
Xiao, Jiyi [1]
Zou, Lamei [1]
Li, Chuanqi [1]
Affiliation
[1] Univ S China, Sch Comp Sci & Technol, Hengyang 421001, Peoples R China
Keywords
hidden Markov model; genetic algorithm; Baum-Welch algorithm; Viterbi algorithm; information extraction
DOI
10.2991/iske.2007.48
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a new training method that combines a genetic algorithm (GA) with the Baum-Welch algorithm to obtain a hidden Markov model (HMM) with an optimized number of states and optimized model parameters for web information extraction. The method not only overcomes the slow convergence of the HMM approach but also finds a better number of states in the HMM topology, together with better model parameters. In experiments on 2100 web pages extracted from our corpus, the method found the optimal topology in all cases. The GA-HMM approach achieved an average precision of 84.483%, whereas the HMM trained by the Baum-Welch method achieved an average precision of 71.049%. This indicates that the GA-HMM method outperforms the HMM trained by the Baum-Welch method alone.
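The abstract does not spell out the procedure, so the following is only a minimal sketch of how a GA could be combined with Baum-Welch re-estimation to search over both the number of states and the parameters of a discrete HMM. Everything specific in it is an assumption for illustration and not the authors' method: the function names, the BIC-style fitness penalty, the row-wise crossover, the mutation rates, and the toy observation sequence.

```python
import math
import random

EPS = 1e-6  # smoothing floor so no probability collapses to exactly zero

def normalize(row):
    total = sum(row)
    return [x / total for x in row]

def random_hmm(n_states, n_symbols, rng):
    """Random discrete HMM (pi, A, B) with strictly positive stochastic rows."""
    def rand_row(k):
        return normalize([rng.random() + 0.1 for _ in range(k)])
    return (rand_row(n_states),
            [rand_row(n_states) for _ in range(n_states)],
            [rand_row(n_symbols) for _ in range(n_states)])

def forward_backward(model, obs):
    """Scaled forward-backward pass; returns (alpha, beta, scale, log-likelihood)."""
    pi, A, B = model
    N, T = len(pi), len(obs)
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [pi[j] * B[j][obs[0]] for j in range(N)]
        else:
            row = [sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                   for j in range(N)]
        scale.append(sum(row))
        alpha.append(normalize(row))
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N))
                   / scale[t + 1] for i in range(N)]
    return alpha, beta, scale, sum(math.log(c) for c in scale)

def baum_welch_step(model, obs, n_symbols):
    """One EM re-estimation step on a single observation sequence."""
    pi, A, B = model
    N, T = len(pi), len(obs)
    alpha, beta, scale, _ = forward_backward(model, obs)
    gamma = [[alpha[t][i] * beta[t][i] for i in range(N)] for t in range(T)]
    new_A = [normalize([EPS + sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]]
                                  * beta[t + 1][j] / scale[t + 1]
                                  for t in range(T - 1))
                        for j in range(N)]) for i in range(N)]
    new_B = [normalize([EPS + sum(gamma[t][i] for t in range(T) if obs[t] == k)
                        for k in range(n_symbols)]) for i in range(N)]
    return normalize([EPS + g for g in gamma[0]]), new_A, new_B

def fitness(model, obs, n_symbols):
    """Assumed fitness: log-likelihood minus a BIC-style penalty on model size."""
    n = len(model[0])
    n_params = n * n + n * n_symbols
    return forward_backward(model, obs)[3] - 0.5 * n_params * math.log(len(obs))

def crossover(a, b, rng):
    """Row-wise uniform crossover; parents with different state counts return `a`."""
    if len(a[0]) != len(b[0]):
        return a
    pick = lambda x, y: [rx if rng.random() < 0.5 else ry for rx, ry in zip(x, y)]
    return a[0], pick(a[1], b[1]), pick(a[2], b[2])

def mutate(model, n_symbols, max_states, rng):
    """Either resample the topology with one state more or fewer, or jitter emissions."""
    n = len(model[0])
    if rng.random() < 0.3:
        new_n = min(max_states, max(1, n + rng.choice([-1, 1])))
        return random_hmm(new_n, n_symbols, rng)
    B = [normalize([p * (0.5 + rng.random()) for p in row]) for row in model[2]]
    return model[0], model[1], B

def ga_hmm(obs, n_symbols, max_states=4, pop_size=8, generations=6, bw_steps=2, seed=0):
    rng = random.Random(seed)
    pop = [random_hmm(rng.randint(1, max_states), n_symbols, rng) for _ in range(pop_size)]
    score = lambda m: fitness(m, obs, n_symbols)
    for _gen in range(generations):
        for _step in range(bw_steps):  # local Baum-Welch refinement of every individual
            pop = [baum_welch_step(m, obs, n_symbols) for m in pop]
        parents = sorted(pop, key=score, reverse=True)[:pop_size // 2]  # elitist selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), n_symbols, max_states, rng))
        pop = parents + children
    return max(pop, key=score)

observations = [0, 1, 2, 2] * 15  # toy token-class sequence, not the paper's corpus
best_pi, best_A, best_B = ga_hmm(observations, n_symbols=3)
print(len(best_pi), "states selected")
```

In this sketch the GA explores topologies (the number of states) while Baum-Welch refines the parameters of each candidate locally, which is one common way to hybridize the two. The size penalty in `fitness` is there only so that larger models do not win on likelihood alone; the fitness actually used in the paper may differ.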
Pages: 1
Related papers
50 in total
  • [1] Text Information Extraction based on Genetic Algorithm and Hidden Markov Model
    Li, Rong
    Zheng, Jia-heng
    Pei, Chun-qin
    [J]. PROCEEDINGS OF THE FIRST INTERNATIONAL WORKSHOP ON EDUCATION TECHNOLOGY AND COMPUTER SCIENCE, VOL I, 2009, : 334 - +
  • [2] Web information extraction using generalized hidden Markov model
    Zhong, Ping
    Chen, Jinlin
    Cook, Terry
    [J]. 2006 1ST IEEE WORKSHOP ON HOT TOPICS IN WEB SYSTEMS AND TECHNOLOGIES, 2006, : 142 - +
  • [3] A generalized hidden Markov model approach for web information extraction
    Zhong, Ping
    Chen, Jinlin
    [J]. 2006 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE, (WI 2006 MAIN CONFERENCE PROCEEDINGS), 2006, : 709 - +
  • [4] Web information extraction based on a Generalized Hidden Markov Model
    Yao, Yong
    Wang, Jing
    Liu, Zhijing
    [J]. Journal of Computational Information Systems, 2007, 3 (05): : 1847 - 1854
  • [5] Web object information extraction based on generalized hidden Markov model
    Wang, Jing
    Yao, Yong
    Liu, ZhiJing
    [J]. 2007 INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES, VOLS 1-3, 2007, : 1520 - 1523
  • [6] Extraction of Key Information in Web News Based on Improved Hidden Markov Model
    Liu Z.
    Du Y.
    Shi S.
    [J]. Data Analysis and Knowledge Discovery, 2019, 3 (03) : 120 - 128
  • [7] Information extraction algorithm based on multiple templates using hidden Markov model
    College of Information Technology, Jiangxi University of Finance and Economy, Nanchang 330013, China
    Unknown
    Unknown
    [J]. Jisuanji Gongcheng, 2006, 2 (203-205):
  • [8] Information Extraction System Based on Hidden Markov Model
    Park, Dong-Chul
    Huong, Vu Thi Lan
    Woo, Dong-Min
    Hieu, Duong Ngoc
    Ninh, Sai Thi Hien
    [J]. ADVANCES IN NEURAL NETWORKS - ISNN 2009, PT 1, PROCEEDINGS, 2009, 5551 : 52 - +
  • [9] Web information extraction based on genetic algorithm
    Guo, Yin-Rui
    Chen, Rong
    [J]. Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2011, 24 (03): : 385 - 390
  • [10] Information extraction incorporating paragraph feature and hidden Markov Model
    Na, Liu
    Lu, Mingyu
    Tang, Huanling
    [J]. 2007 IFIP INTERNATIONAL CONFERENCE ON NETWORK AND PARALLEL COMPUTING WORKSHOPS, PROCEEDINGS, 2007, : 953 - 956