Keyphrase extraction from Chinese news web pages based on semantic relations

Cited by: 0
Authors
Xie, Fei [1, 4]
Wu, Xindong [1, 2]
Hu, Xue-Gang [1]
Wang, Fei-Yue [3]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230009, Peoples R China
[2] Univ Vermont, Dept Comp Sci, Burlington, VT 05405 USA
[3] Chinese Acad Sci, Inst Automat, Beijing 100864, Peoples R China
[4] Hefei Teachers Coll, Dept Comp Sci & Technol, Hefei 230061, Peoples R China
Keywords
keyphrase extraction; semantic relation; word similarity; word co-occurrence; lexical chain
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Keyphrases help readers save time when browsing news web pages. This paper presents a new method for extracting keyphrases from Chinese news web pages based on semantic relations. Semantic relations between phrases are analyzed, and lexical chains are used to construct a semantic relation graph. Keyphrases are extracted, and a semantic link graph is built on the lexical chains. News web pages with core hints are selected from www.163.com to test the method. The experimental results show that the proposed method substantially outperforms the method based on term frequency. The gain is largest when three keyphrases are extracted: precision improves by 26.97 percent and recall by 20.93 percent.
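The comparison reported in the abstract can be illustrated with a minimal sketch of a term-frequency baseline scored by precision and recall. This is not the authors' implementation: the abstract does not say how candidates were segmented or how precision and recall were computed, so the function names, the token list, and the reference keyphrases below are all hypothetical and exist only to show the arithmetic.

```python
from collections import Counter


def top_k_by_frequency(tokens, k):
    """Rank candidate phrases by raw term frequency and keep the top k."""
    counts = Counter(tokens)
    # most_common sorts by count, highest first
    return [phrase for phrase, _ in counts.most_common(k)]


def precision_recall(extracted, gold):
    """Precision and recall of extracted keyphrases against a reference set."""
    extracted_set, gold_set = set(extracted), set(gold)
    hits = len(extracted_set & gold_set)
    precision = hits / len(extracted_set) if extracted_set else 0.0
    recall = hits / len(gold_set) if gold_set else 0.0
    return precision, recall


# Hypothetical, already-segmented candidates from one news page (illustrative only).
tokens = ["economy", "growth", "economy", "policy", "bank",
          "economy", "growth", "bank", "market"]
# Hypothetical reference keyphrases for the same page.
gold = ["economy", "policy", "market", "growth"]

extracted = top_k_by_frequency(tokens, k=3)
precision, recall = precision_recall(extracted, gold)
print(extracted, precision, recall)
```

With three keyphrases extracted, two of them match the four reference keyphrases, giving a precision of 2/3 and a recall of 1/2. A method based on semantic relations would replace `top_k_by_frequency` with its own ranking and be scored the same way.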
Pages: 490 / +
Page count: 2