Keyphrase extraction from Chinese news web pages based on semantic relations

Cited by: 0
Authors
Xie, Fei [1, 4]
Wu, Xindong [1, 2]
Hu, Xue-Gang [1 ]
Wang, Fei-Yue [3 ]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230009, Peoples R China
[2] Univ Vermont, Dept Comp Sci, Burlington, VT 05405 USA
[3] Chinese Acad Sci, Inst Automat, Beijing 100864, Peoples R China
[4] Hefei Teachers Coll, Dept Comp Sci & Technol, Hefei 230061, Peoples R China
Keywords
keyphrase extraction; semantic relation; word similarity; word co-occurrence; lexical chain
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Keyphrases help readers save time when browsing news web pages. This paper presents a new method for extracting keyphrases from Chinese news web pages based on semantic relations. Semantic relations between phrases are analyzed, and a lexical chain is used to construct a semantic relation graph. Keyphrases are extracted, and a semantic link graph is built on the lexical chains. News web pages with core hints are selected from www.163.com to test the method. The experimental results show that the proposed method substantially outperforms the method based on term frequency, especially when three keyphrases are extracted: precision improves by 26.97 percent and recall by 20.93 percent.
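The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's lexical-chain method: `cooccurrence_rank` is a simplified stand-in that scores phrases by their degree in a sentence-level co-occurrence graph, `tf_baseline` is a plain term-frequency ranking, and `precision_recall` computes the two reported metrics against a gold keyphrase set. All function names and the sample data are hypothetical, and the input is assumed to be already segmented into candidate phrases.

```python
from collections import Counter
from itertools import combinations


def tf_baseline(tokens, k):
    """Rank candidate phrases by raw frequency (term-frequency baseline)."""
    return [w for w, _ in Counter(tokens).most_common(k)]


def cooccurrence_rank(sentences, k):
    """Rank phrases by degree in a sentence-level co-occurrence graph.

    Simplified stand-in for a semantic relation graph: two phrases are
    linked each time they appear in the same sentence.
    """
    degree = Counter()
    for sent in sentences:
        for a, b in combinations(sorted(set(sent)), 2):
            degree[a] += 1
            degree[b] += 1
    # Sort by degree (descending), then alphabetically for determinism.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return [w for w, _ in ranked[:k]]


def precision_recall(extracted, gold):
    """Precision and recall of extracted keyphrases against a gold set."""
    hits = len(set(extracted) & set(gold))
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical, pre-segmented sample document (one list per sentence).
sentences = [
    ["economy", "growth", "policy"],
    ["economy", "policy", "bank"],
    ["bank", "economy", "rates"],
]
gold = ["economy", "policy", "inflation"]  # hypothetical gold keyphrases

tokens = [w for sent in sentences for w in sent]
tf_top3 = tf_baseline(tokens, 3)
graph_top3 = cooccurrence_rank(sentences, 3)
tf_scores = precision_recall(tf_top3, gold)
graph_scores = precision_recall(graph_top3, gold)
```

Evaluating both rankings at the same cutoff (three keyphrases here) mirrors the setting the abstract highlights; a real evaluation would average the scores over many pages.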
Pages: 490 / +
Page count: 2
Related papers
50 in total
  • [31] Automatic Data Extraction from Lists in Web Pages Based on XML
    Xin, Zhou
    Hao, Wang
    ADVANCED TECHNOLOGY IN TEACHING - PROCEEDINGS OF THE 2009 3RD INTERNATIONAL CONFERENCE ON TEACHING AND COMPUTATIONAL SCIENCE (WTCS 2009), VOL 2: EDUCATION, PSYCHOLOGY AND COMPUTER SCIENCE, 2012, 117 : 915 - 921
  • [32] Visual extraction of information from web pages
    Della Penna, Giuseppe
    Magazzeni, Daniele
    Orefice, Sergio
    JOURNAL OF VISUAL LANGUAGES AND COMPUTING, 2010, 21 (01): 23 - 32
  • [33] Data extraction from Deep Web pages
    Yang, Jufeng
    Shi, Guangshun
    Zheng, Yan
    Wang, Qingren
    CIS: 2007 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY, PROCEEDINGS, 2007, : 237 - 241
  • [34] Extraction of Informative Blocks from Web Pages
    Cao, YuJuan
    Niu, ZhenDong
    Dai, LiuLing
    Zhao, YuMing
    ALPIT 2008: SEVENTH INTERNATIONAL CONFERENCE ON ADVANCED LANGUAGE PROCESSING AND WEB INFORMATION TECHNOLOGY, PROCEEDINGS, 2008, : 544 - 549
  • [35] Advertising Keywords Extraction from Web Pages
    Liu, Jianyi
    Wang, Cong
    Liu, Zhengyang
    Yao, Wenbin
    WEB INFORMATION SYSTEMS AND MINING, 2010, 6318 : 336 - 343
  • [36] Extraction of hidden semantics from web pages
    Carchiolo, V
    Longheu, A
    Malgeri, M
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2002, 2002, 2412 : 117 - 122
  • [37] Keyphrase Extraction Based on Optimized Random Walks on Multiple Word Relations
    Chen, Wenyan
    Liu, Zheng
    Shi, Wei
    Yu, Jeffrey Xu
    WEB AND BIG DATA (APWEB-WAIM 2018), PT II, 2018, 10988 : 359 - 367
  • [38] Chinese Web News Source Extraction Algorithm Based On Rules And Region Recognition
    Liu, Zhiming
    Liu, Lu
    Cai, Huali
    NINTH WUHAN INTERNATIONAL CONFERENCE ON E-BUSINESS, VOLS I-III, 2010, : 2029 - 2034
  • [39] Extraction of Semantic Relations from Medical Literature Based on Semantic Predicates and SVM
    Zhao, Xiaoli
    Lin, Shaofu
    Huang, Zhisheng
    HEALTH INFORMATION SCIENCE (HIS 2018), 2018, 11148 : 17 - 24
  • [40] A novel Chinese web news source extraction algorithm
    Liu Z.
    Liu L.
    Journal of Convergence Information Technology, 2011, 6 (08) : 99 - 106