LEARNING TO CLASSIFY TROPICAL DISEASE WEB PAGES FROM LARGE INDONESIAN WEB DOCUMENTS

Authors
Abidin, Taufik Fuadi [1 ]
Ferdhiana, Ridha [2 ]
Kamil, Hajjul [3 ]
Affiliations
[1] Syiah Kuala Univ, Coll Sci, Dept Informat, Banda Aceh, Indonesia
[2] Syiah Kuala Univ, Coll Sci, Dept Math, Banda Aceh, Indonesia
[3] Syiah Kuala Univ, Coll Med, Dept Nursing, Banda Aceh, Indonesia
Keywords
Web Classifier; Support Vector Machine; K-Nearest Neighbors; Naive Bayesian
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Many cases of tropical diseases such as malaria, leprosy, and dengue fever are now reported online, and these reports can be very useful for tracking the spread of the diseases. Studies on classifying tropical disease web pages from a large set of Indonesian web pages have not yet been reported. In this paper, we built classifiers using Support Vector Machine (SVM), Naive Bayesian, and K-Nearest Neighbors. We generated dictionaries of n-gram terms for both the positive (tropical disease) and negative (non-tropical disease) classes and used the dictionaries to extract feature attributes of the pages. The experimental results show that SVM with a polynomial kernel is the best classifier model when compared to the other models and methods. The F-measure and accuracy of this model are 95.52% and 99.59%, respectively.
Pages: 347 / +
Page count: 2
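
The dictionary-based feature extraction and the two reported metrics described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the tokenization, the n-gram sizes, how dictionary terms were selected, or the feature layout. Whitespace tokenization, word 1- and 2-grams, "terms seen in only one class" as the dictionary rule, the two-count feature vector, and all function names and toy pages below are assumptions made for illustration.

```python
from collections import Counter


def ngrams(text, n_max=2):
    """Word n-grams (n = 1..n_max) of a lowercased, whitespace-split text.
    ASSUMPTION: the abstract does not state the tokenizer or n-gram sizes."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def build_dictionaries(pos_pages, neg_pages):
    """One n-gram dictionary per class.
    ASSUMPTION: a term belongs to a class dictionary if it occurs only in
    that class; the paper's actual selection rule is not given."""
    pos = Counter(g for page in pos_pages for g in ngrams(page))
    neg = Counter(g for page in neg_pages for g in ngrams(page))
    pos_dict = {t for t in pos if t not in neg}
    neg_dict = {t for t in neg if t not in pos}
    return pos_dict, neg_dict


def extract_features(page, pos_dict, neg_dict):
    """Hypothetical 2-dimensional feature vector: number of the page's
    n-grams found in the positive and in the negative dictionary."""
    grams = ngrams(page)
    return [sum(g in pos_dict for g in grams),
            sum(g in neg_dict for g in grams)]


def f_measure_and_accuracy(y_true, y_pred):
    """F-measure (F1) of the positive class (label 1) and overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return f1, accuracy


# Toy Indonesian pages, invented for this example only.
pos_pages = ["kasus demam berdarah meningkat", "pasien malaria dirawat"]
neg_pages = ["harga beras meningkat", "jadwal sepak bola"]

pos_dict, neg_dict = build_dictionaries(pos_pages, neg_pages)
features = extract_features("kasus malaria meningkat", pos_dict, neg_dict)
f1, accuracy = f_measure_and_accuracy([1, 1, 0, 0, 0], [1, 0, 0, 0, 0])
print(features, round(f1, 4), accuracy)
```

Feature vectors of this kind could then be passed to any of the three classifier families named in the abstract. The metrics helper also shows why F-measure and accuracy can differ, as they do in the reported results: accuracy counts correctly rejected negative pages, while the F-measure depends only on how the positive class is handled.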