H-Rank: A keywords extraction method from web pages using POS tags

Cited by: 0
Authors
Shah, Himat [1]
Khan, Muhammad U. S. [1]
Franti, Pasi [1]
Affiliations
[1] Univ Eastern Finland, Sch Comp, Joensuu, Finland
Keywords
Agglomerative clustering; POS tags; Web pages
DOI
10.1109/indin41052.2019.8972331
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We present a new keyword extraction method that combines the semantic similarity among the frequent words on a web page with the distribution of POS tags. We apply hierarchical clustering to group semantically similar words that cover more of the web page's content. Our method shows better performance than CL-Rank and other existing methodologies.
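The abstract gives no implementation details, so the following is only a minimal sketch of the general idea (filter frequent words by POS tag, then group them by agglomerative clustering), not the authors' H-Rank algorithm. Everything in it is an assumption made for illustration: the function names, the use of Penn Treebank noun tags (`NN*`) as the POS filter, average linkage, the stopping threshold, and the hand-written `TOY_SIM` table that stands in for a real semantic similarity measure.

```python
from collections import Counter
from itertools import combinations


def frequent_nouns(tagged_tokens, top_n=5):
    """Return the top_n most frequent lowercased words tagged as nouns.

    Assumes Penn Treebank style tags, where noun tags start with 'NN'.
    """
    counts = Counter(
        word.lower() for word, tag in tagged_tokens if tag.startswith("NN")
    )
    return [word for word, _ in counts.most_common(top_n)]


def average_link(c1, c2, sim):
    """Average pairwise similarity between two clusters (average linkage)."""
    return sum(sim(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


def agglomerate(words, sim, threshold):
    """Merge the two most similar clusters until none reaches the threshold."""
    clusters = [[w] for w in words]
    while len(clusters) > 1:
        # Find the pair of clusters with the highest average similarity.
        (i, j), best = max(
            (
                ((i, j), average_link(clusters[i], clusters[j], sim))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Hypothetical similarity scores; a real system would use a semantic measure.
TOY_SIM = {
    frozenset(("hotel", "room")): 0.8,
    frozenset(("hotel", "booking")): 0.7,
    frozenset(("room", "booking")): 0.6,
    frozenset(("weather", "forecast")): 0.9,
}


def toy_similarity(a, b):
    """Look up the toy score; unrelated word pairs default to 0.1."""
    return 1.0 if a == b else TOY_SIM.get(frozenset((a, b)), 0.1)


if __name__ == "__main__":
    words = ["hotel", "room", "booking", "weather", "forecast"]
    for cluster in agglomerate(words, toy_similarity, threshold=0.5):
        print(cluster)
```

With the toy scores above, the five words end up in two groups, one about accommodation and one about weather. How H-Rank actually ranks clusters and selects keywords from them is not stated in the abstract and is left out here.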
Pages: 264-269
Number of pages: 6