Web Page Classification Based on Graph Neural Network

Cited by: 1
Authors
Guo, Tao [1]
Cui, Baojiang [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
DOI
10.1007/978-3-030-79728-7_19
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A web page is a semi-structured document that carries substantial attribute content in addition to its text. Traditional web page classification techniques are mostly based on text classification methods and ignore this additional attribute information. We propose WEB-GNN, an approach to web page classification. This work makes two major contributions. First, we propose W2G, a web page graph representation method that reconstructs text nodes into a graph based on the visual association relationships between texts and the DOM-tree hierarchy, efficiently integrating web page content and structure. Second, we propose a web page classification method based on a graph convolutional neural network. It takes the web page graph representation as input, integrates text features and structural features through graph convolution layers, and generates a high-level web page feature representation. Experimental results on the Web-black dataset suggest that the proposed method significantly outperforms text-only methods.
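The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' W2G or WEB-GNN implementation, whose details are not given in this record. It assumes a hypothetical linking rule (two text nodes are connected when their parent elements are at most `max_dist` steps apart in the DOM tree), hypothetical node features (text length and DOM depth), and arbitrary fixed weights. Visual-association edges are omitted entirely, and the layer is the standard graph convolution `ReLU(D^-1/2 (A + I) D^-1/2 X W)`, swapped in for illustration.

```python
import math
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}  # elements with no end tag


class TextNodeCollector(HTMLParser):
    """Collect non-empty text nodes together with their DOM ancestor path."""

    def __init__(self):
        super().__init__()
        self.stack = []    # ids of the currently open elements
        self.next_id = 0
        self.nodes = []    # list of (text, ancestor path as a tuple of ids)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(self.next_id)
            self.next_id += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.nodes.append((text, tuple(self.stack)))


def build_graph(html, max_dist=2):
    """Hypothetical rule: link text nodes whose parents are close in the DOM tree."""
    parser = TextNodeCollector()
    parser.feed(html)
    nodes, edges = parser.nodes, set()
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            pi, pj = nodes[i][1], nodes[j][1]
            common = 0
            for a, b in zip(pi, pj):
                if a != b:
                    break
                common += 1
            # Tree distance between the two parent elements.
            if len(pi) + len(pj) - 2 * common <= max_dist:
                edges.add((i, j))
    return nodes, edges


def gcn_layer(x, edges, w):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W), in plain Python."""
    n = len(x)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # self-loops
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    a_hat = [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    agg = [[sum(a_hat[i][k] * x[k][f] for k in range(n)) for f in range(len(x[0]))]
           for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * w[f][o] for f in range(len(w))))
             for o in range(len(w[0]))] for i in range(n)]


HTML = "<html><body><h1>Title</h1><div><p>First</p><p>Second</p></div></body></html>"
nodes, edges = build_graph(HTML)
# Hypothetical features: text length and DOM depth of each text node.
x = [[float(len(text)), float(len(path))] for text, path in nodes]
w = [[0.1, -0.2], [0.3, 0.4]]  # arbitrary weights; a real model would learn these
h = gcn_layer(x, edges, w)
# Mean-pool node representations into one page-level feature vector.
page_vec = [sum(col) / len(h) for col in zip(*h)]
```

In this toy page, only the two sibling paragraphs end up linked, so the convolution averages their features while the heading keeps its own. A classifier over `page_vec` would be the remaining step, and real node features would presumably come from text embeddings rather than lengths.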
Pages: 188-198
Page count: 11