Keyphrase extraction from Chinese news web pages based on semantic relations

Cited by: 0
Authors
Xie, Fei [1, 4]
Wu, Xindong [1, 2]
Hu, Xue-Gang [1]
Wang, Fei-Yue [3]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230009, Peoples R China
[2] Univ Vermont, Dept Comp Sci, Burlington, VT 05405 USA
[3] Chinese Acad Sci, Inst Automat, Beijing 100864, Peoples R China
[4] Hefei Teachers Coll, Dept Comp Sci & Technol, Hefei 230061, Peoples R China
Keywords
keyphrase extraction; semantic relation; word similarity; word co-occurrence; lexical chain
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Keyphrases save readers time when browsing news web pages. This paper presents a new method for extracting keyphrases from Chinese news web pages based on semantic relations. Semantic relations between phrases are analyzed, and a lexical chain is used to construct a semantic relation graph. Keyphrases are extracted and a semantic link graph is built on the lexical chains. News web pages with core hints are selected from www.163.com to test the method. The experimental results show that the proposed method substantially outperforms the method based on term frequency, especially when the number of keyphrases extracted is 3: precision improves by 26.97 percent and recall improves by 20.93 percent.
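The abstract compares the proposed method against a term-frequency baseline using precision and recall at a fixed number of extracted keyphrases. The following is a minimal sketch of how such a comparison could be scored, not the authors' implementation. It assumes that candidate phrases have already been segmented from the page text, and the function names and toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def top_k_by_frequency(candidates, k):
    """Term-frequency baseline: return the k most frequent candidate phrases.

    Phrases with equal counts keep their first-seen order.
    """
    counts = Counter(candidates)
    return [phrase for phrase, _ in counts.most_common(k)]


def precision_recall(extracted, reference):
    """Score extracted keyphrases against a reference set by exact match."""
    extracted_set, reference_set = set(extracted), set(reference)
    hits = len(extracted_set & reference_set)
    precision = hits / len(extracted_set) if extracted_set else 0.0
    recall = hits / len(reference_set) if reference_set else 0.0
    return precision, recall


# Toy data (hypothetical, not taken from the paper).
candidates = ["economy", "market", "economy", "bank", "market", "economy", "policy"]
reference = ["economy", "policy", "bank", "inflation"]

extracted = top_k_by_frequency(candidates, k=3)
precision, recall = precision_recall(extracted, reference)
print(extracted, precision, recall)
```

Exact string matching is the simplest scoring choice; an evaluation could instead credit near-synonyms, but the abstract does not say which matching rule was used.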
Pages: 490 / +
Page count: 2