TEXT CONTENT ANALYSIS FOR ILLICIT WEB PAGES BY USING NEURAL NETWORKS

Authors
Sam, Lee Zhi [1]
Maarof, Mohd Aizaini [1]
Selamat, Ali [1]
Shamsuddin, Siti Mariyam [1]
Affiliations
[1] Univ Teknol Malaysia, Fac Comp Sci & Informat Syst FSKSM, Skudai 81310, Johor, Malaysia
Source
JURNAL TEKNOLOGI | 2009 / Vol. 50
Keywords
Artificial neural network; term weighting scheme; textual content analysis; web pages classification
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Illicit web content such as pornography, violence, and gambling has greatly polluted the minds of web users, especially children and teenagers. Because popular web filtering techniques such as Uniform Resource Locator (URL) blocking and Platform for Internet Content Selection (PICS) checking are ineffective against today's dynamic web content, content-based analysis techniques with an effective model are highly desirable. In this paper, we propose a textual content analysis model that uses an entropy term weighting scheme to classify pornography and sex education web pages. We compare the entropy scheme with two other common term weighting schemes, TFIDF and Glasgow. These techniques were tested with an artificial neural network on a small class dataset. We found that the proposed model achieved better performance in terms of accuracy, convergence speed, and stability than the other techniques.
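The abstract names the term weighting schemes but this record does not give their formulas. As a rough illustration only, the sketch below contrasts plain TF-IDF with the commonly cited log-entropy form of entropy weighting. The function names, the exact formulas, and the toy counts are assumptions and are not taken from the paper; the Glasgow scheme is omitted.

```python
import math

def tfidf_weights(docs):
    """docs: list of {term: raw count}. Returns TF-IDF weights per document.
    Assumed form: tf * log(N / df)."""
    n_docs = len(docs)
    df = {}
    for doc in docs:
        for term in doc:
            df[term] = df.get(term, 0) + 1
    return [{t: c * math.log(n_docs / df[t]) for t, c in doc.items()}
            for doc in docs]

def entropy_weights(docs):
    """Assumed log-entropy form:
    log(1 + tf) * (1 + sum_j p_j * log(p_j) / log(N)),
    where p_j = tf in document j / total count of the term.
    Requires at least two documents, since log(1) == 0."""
    n_docs = len(docs)
    total = {}
    for doc in docs:
        for term, c in doc.items():
            total[term] = total.get(term, 0) + c
    # Accumulate sum_j p_j * log(p_j) per term (always <= 0).
    ent = {t: 0.0 for t in total}
    for doc in docs:
        for term, c in doc.items():
            p = c / total[term]
            ent[term] += p * math.log(p)
    global_w = {t: 1.0 + ent[t] / math.log(n_docs) for t in total}
    return [{t: math.log(1 + c) * global_w[t] for t, c in doc.items()}
            for doc in docs]

# Hypothetical toy corpus: one term shared evenly, two terms unique.
docs = [{"sex": 2, "health": 1}, {"sex": 2, "casino": 3}]
print(tfidf_weights(docs))
print(entropy_weights(docs))
```

Under these assumed formulas, a term spread evenly over all documents gets a weight near zero in both schemes, while a term concentrated in one document keeps a positive weight; the entropy scheme additionally dampens raw counts through the logarithm.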
Pages: 19
Related papers
50 in total
  • [1] Features Extraction for Illicit Web Pages Identifications Using Independent Component Analysis
    Sam, Lee Zhi
    bin Maarof, Mohd Aizaini
    Selamat, Ali
    Shamsuddin, Siti Mariyam
    ICIAS 2007: INTERNATIONAL CONFERENCE ON INTELLIGENT & ADVANCED SYSTEMS, VOLS 1-3, PROCEEDINGS, 2007, : 139 - 144
  • [2] AGGRESSIVNESS AND AUTOAGRESSIVNESS IN THE WEB TEXT: CONTENT ANALYSIS OF PAGES IN THE SOCIAL NETWORK "VKONTAKTE"
    Malahaeva, Svetlana K.
    Potapov, Roman S.
    THEORETICAL AND PRACTICAL ISSUES OF JOURNALISM, 2018, 7 (04): : 724 - 740
  • [3] Improving the web text content by extracting significant pages into a Web Site
    Ríos, SA
    Velásquez, JD
    Vera, ES
    Yasuda, H
    Aoki, T
    5th International Conference on Intelligent Systems Design and Applications, Proceedings, 2005, : 32 - 36
  • [4] Graph neural networks for ranking web pages
    Scarselli, F
    Yong, SL
    Gori, M
    Hagenbuchner, M
    Tsoi, AC
    Maggini, M
    2005 IEEE/WIC/ACM International Conference on Web Intelligence, Proceedings, 2005, : 666 - 672
  • [5] Categorizing Web pages on the subject of neural networks
    Vlajic, N
    Card, HC
    JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 1998, 21 (02) : 91 - 105
  • [6] Categorizing Web pages on the subject of neural networks
    Internet Innovation Centre, Dept. of Elec. and Comp. Engineering, University of Manitoba, Winnipeg, Man. R3T 5V6, Canada
    Unknown
    Unknown
    J Network Comput Appl, 2 (91-105):
  • [7] Determining the titles of Web pages using anchor text and link analysis
    Jeong, Ok-Ran
    Oh, Jehwan
    Kim, Dong-Jin
    Lyu, Heetae
    Kim, Won
    EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (09) : 4322 - 4329
  • [8] Separate But Equal? A Comparison of Content on Library Web Pages and Their Text Versions
    Hazard, Brenda L.
    JOURNAL OF WEB LIBRARIANSHIP, 2008, 2 (2-3) : 417 - 428
  • [9] Aesthetic evaluation of web pages using texture/color features and artificial neural networks
    Mirdehghani, Maryam
    Amirhassan Monadjemi, S.
    International Review on Computers and Software, 2009, 4 (01) : 34 - 41
  • [10] Neural networks for Web content filtering
    Lee, PY
    Hui, SC
    Fong, ACM
    IEEE INTELLIGENT SYSTEMS, 2002, 17 (05) : 48 - 57