Advertising Keywords Extraction from Web Pages

Cited by: 0
Authors
Liu, Jianyi [1]
Wang, Cong [1]
Liu, Zhengyang [1]
Yao, Wenbin [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Comp Sci, Beijing 100876, Peoples R China
Source
Keywords
Keyword extraction; information extraction; advertising; PageRank
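DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract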
A large and growing number of web pages display contextual advertising based on keywords automatically extracted from the page text, and this has become a rapidly growing business in recent years. We describe a system that learns how to extract keywords from web pages for advertisement targeting. First, a text network is built for a single web page; then PageRank is applied to the network to determine the importance of each word; finally, the top-ranked words are selected as the keywords of the page. The algorithm is tested on a corpus of blog pages, and the experimental results show that it is practical and effective.
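The three steps named in the abstract (text network, PageRank, top-ranked words) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the network is built, so the co-occurrence window, the stopword list, the damping factor of 0.85, and all function names here are assumptions.

```python
import re
from collections import defaultdict

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "of", "and", "to", "in", "is", "for",
             "on", "with", "that", "it", "from"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords and short tokens."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOPWORDS and len(w) > 2]


def build_text_network(words, window=3):
    """Undirected co-occurrence graph (assumed construction):
    two distinct words are linked if they occur within `window` tokens."""
    graph = defaultdict(set)
    for i, w in enumerate(words):
        for v in words[i + 1:i + window]:
            if v != w:
                graph[w].add(v)
                graph[v].add(w)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank on an undirected graph.
    Every node has at least one neighbour, so there are no dangling nodes."""
    nodes = list(graph)
    if not nodes:
        return {}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            incoming = sum(rank[nb] / len(graph[nb]) for nb in graph[node])
            new_rank[node] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


def extract_keywords(text, top_k=5):
    """Return the top_k highest-ranked words (ties broken alphabetically)."""
    scores = pagerank(build_text_network(tokenize(text)))
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]
```

Calling `extract_keywords(page_text, top_k=5)` on the visible text of a page returns candidate keywords. A real system would presumably also need to strip HTML markup and navigation text first, which this sketch leaves out.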
Pages: 336-343
Page count: 8
Related papers
50 in total
  • [31] Extraction of ontologies from web pages: Conceptual modelling and tourism application
    Riadi-GDL Laboratory, ENSI Campus, Universitaire de la Manouba, Tunisia
    Unknown
    Unknown
    J. Internet Technol., 2007, 4 (411-421):
  • [32] Data extraction from semi-structured web pages by clustering
    Vuong, Le Phong Bao
    Gao, Xiaoying
    Zhang, Mengjie
    2006 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE, (WI 2006 MAIN CONFERENCE PROCEEDINGS), 2006, : 374 - +
  • [33] Pattern Matching for Extraction of Core Contents from News Web Pages
    Sirsat, Sandeep
    Chavan, Vinay
    2016 SECOND INTERNATIONAL CONFERENCE ON WEB RESEARCH (ICWR), 2016, : 13 - 18
  • [34] Leveraging spatial join for robust tuple extraction from web pages
    Han, Wook-Shin
    Kwak, Wooseong
    Yu, Hwanjo
    Lee, Jeong-Hoon
    Kim, Min-Soo
    INFORMATION SCIENCES, 2014, 261 : 132 - 148
  • [35] Bootstrapping Information Extraction from Semi-structured Web Pages
    Carlson, Andrew
    Schafer, Charles
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PART I, PROCEEDINGS, 2008, 5211 : 195 - +
  • [36] Content Extraction from Web Pages Based on Chinese Punctuation Number
    Song, Mingqiu
    Wu, Xintao
    2007 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-15, 2007, : 5573 - 5575
  • [37] An Approach to Image Extraction and Accurate Skin Detection from Web Pages
    Girgis, Moheb R.
    Mahmoud, Tarek M.
    Abd-El-Hafeez, Tarek
    PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 21, 2007, 21 : 367 - 375
  • [38] A Geo-Tagging Framework for Address Extraction from Web Pages
    Efremova, Julia
    Endres, Ian
    Vidas, Isaac
    Melnik, Ofer
    ADVANCES IN DATA MINING: APPLICATIONS AND THEORETICAL ASPECTS (ICDM 2018), 2018, 10933 : 288 - 295
  • [39] Simultaneous product attribute name and value extraction from web pages
    Wu, Bo
    Cheng, Xueqi
    Wang, Yu
    Guo, Yan
    Song, Linhai
    2009 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCES ON WEB INTELLIGENCE (WI) AND INTELLIGENT AGENT TECHNOLOGIES (IAT), VOL 3, 2009, : 295 - 298
  • [40] Automatic data extraction from data-rich web pages
    Hu, DD
    Meng, XF
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PROCEEDINGS, 2005, 3453 : 828 - 839