Keyphrase extraction from Chinese news web pages based on semantic relations

Cited by: 0

Authors:

Xie, Fei [1,4]

Wu, Xindong [1,2]

Hu, Xue-Gang [1]

Wang, Fei-Yue [3]

Affiliations:

[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230009, Peoples R China

[2] Univ Vermont, Dept Comp Sci, Burlington, VT 05405 USA

[3] Chinese Acad Sci, Inst Automat, Beijing 100864, Peoples R China

[4] Hefei Teachers Coll, Dept Comp Sci & Technol, Hefei 230061, Peoples R China

Source:

INTELLIGENCE AND SECURITY INFORMATICS, PROCEEDINGS | 2008 / Vol. 5075

Keywords:

keyphrase extraction; semantic relation; word similarity; word co-occurrence; lexical chain

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Keyphrases save readers time when browsing news web pages. This paper presents a new method for extracting keyphrases from Chinese news web pages based on semantic relations. Semantic relations between phrases are analyzed, and a lexical chain is used to construct a semantic relation graph. Keyphrases are extracted and a semantic link graph is built on the lexical chains. News web pages with core hints are selected from www.163.com to test the method. The experimental results show that the proposed method substantially outperforms the method based on term frequency, especially when the number of extracted keyphrases is 3: precision improves by 26.97 percent and recall by 20.93 percent.
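The abstract compares against a term-frequency baseline and reports precision and recall at a fixed number of extracted keyphrases. The following is a minimal sketch of that kind of evaluation only, not the paper's semantic-relation method. It assumes the page has already been segmented into candidate phrases (Chinese word segmentation is not shown), and the function names `top_k_by_frequency` and `precision_recall`, as well as the sample data, are hypothetical.

```python
from collections import Counter


def top_k_by_frequency(phrases, k):
    """Baseline sketch: return the k most frequent candidate phrases.

    Ties keep the order in which phrases first appear (Counter behaviour
    in Python 3.7+).
    """
    counts = Counter(phrases)
    return [phrase for phrase, _ in counts.most_common(k)]


def precision_recall(extracted, reference):
    """Compare extracted keyphrases with a reference set by exact match."""
    extracted_set, reference_set = set(extracted), set(reference)
    hits = len(extracted_set & reference_set)
    precision = hits / len(extracted_set) if extracted_set else 0.0
    recall = hits / len(reference_set) if reference_set else 0.0
    return precision, recall


# Hypothetical pre-segmented candidate phrases from one news page.
candidates = ["economy", "market", "economy", "policy",
              "market", "economy", "bank"]
# Hypothetical reference keyphrases (e.g. taken from the page's hints).
reference = ["economy", "bank", "market", "inflation"]

extracted = top_k_by_frequency(candidates, k=3)
precision, recall = precision_recall(extracted, reference)
print(extracted, round(precision, 4), round(recall, 4))
```

Exact string matching is the simplest scoring choice; whether the paper scores matches this way is not stated in the abstract.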

Citation

Pages: 490 / +

Page count: 2

