Extracting classification knowledge of Internet documents with mining term associations: A semantic approach

Cited by: 0
Authors
Natl Cheng Kung Univ, Tainan, Taiwan [1]
Institution
Source
SIGIR Forum | / issue 241-249
Keywords
Algorithms - Computational linguistics - Data mining - Feature extraction - Hierarchical systems - Inference engines - Internet - Polynomials - Search engines
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
In this paper, we present a system that extracts and generalizes terms from Internet documents to represent the classification knowledge of a given class hierarchy. We propose a measure, called support, that evaluates the importance of a term with respect to a class in the hierarchy. Given a threshold, terms with support above the threshold are selected as keywords of the class, and terms with lower support are filtered out. To further improve the recall of this approach, association rule mining is applied to discover associations between terms. An inference model is built from these association relations and the previously computed supports of the terms in the class. To increase the recall of the keyword selection process, we then present a polynomial-time inference algorithm that promotes a term strongly associated with a known keyword to a keyword. Experimental results on Internet documents collected from the Yam search engine show that the proposed methods help refine the classification knowledge and increase the recall of keyword selection.
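The abstract does not give the support formula or the inference rule, so the following is only a minimal sketch of the general idea under stated assumptions. Here support is assumed to be the fraction of a class's documents that contain a term, association strength is assumed to be the confidence of the rule "candidate implies known keyword", and the names `term_support`, `rule_confidence`, `select_keywords`, and all threshold values are hypothetical. It is not the authors' algorithm.

```python
from collections import Counter
from itertools import permutations


def term_support(class_docs):
    """Assumed support: fraction of the class's documents containing the term."""
    counts = Counter(term for doc in class_docs for term in set(doc))
    return {term: n / len(class_docs) for term, n in counts.items()}


def rule_confidence(docs):
    """Confidence of rule a -> b: share of documents with a that also have b."""
    single, pair = Counter(), Counter()
    for doc in docs:
        terms = set(doc)
        single.update(terms)
        pair.update(permutations(sorted(terms), 2))  # all ordered pairs (a, b)
    return {(a, b): n / single[a] for (a, b), n in pair.items()}


def select_keywords(class_docs, all_docs, min_support=0.75,
                    candidate_floor=0.4, min_confidence=0.8):
    """Threshold on support, then promote strongly associated terms.

    All three thresholds are illustrative assumptions, not values from the paper.
    """
    support = term_support(class_docs)
    keywords = {t for t, s in support.items() if s >= min_support}
    confidence = rule_confidence(all_docs)
    frontier = list(keywords)
    while frontier:  # each term is promoted at most once, so this terminates
        known = frontier.pop()
        for (cand, target), conf in confidence.items():
            if (target == known and cand not in keywords
                    and support.get(cand, 0.0) >= candidate_floor
                    and conf >= min_confidence):
                keywords.add(cand)
                frontier.append(cand)
    return keywords


# Toy data, invented for illustration only.
sports_docs = [
    ["basketball", "nba", "score"],
    ["basketball", "score", "team"],
    ["basketball", "playoffs", "nba"],
    ["basketball", "team", "dunk"],
]
other_docs = [
    ["stock", "score", "market"],
    ["team", "market", "stock"],
]

result = select_keywords(sports_docs, sports_docs + other_docs)
print(sorted(result))  # ['basketball', 'nba']
```

In this toy run only "basketball" clears the support threshold; "nba" falls below it but is promoted because every document containing "nba" also contains "basketball", which illustrates how association-based promotion can raise recall.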