Extracting classification knowledge of Internet documents with mining term associations: A semantic approach

Cited by: 0
Institution
Natl Cheng Kung Univ, Tainan, Taiwan [1]
Source
SIGIR Forum | / 241-249
Keywords
Algorithms - Computational linguistics - Data mining - Feature extraction - Hierarchical systems - Inference engines - Internet - Polynomials - Search engines
DOI
Not available
Abstract
In this paper, we present a system that extracts and generalizes terms from Internet documents to represent the classification knowledge of a given class hierarchy. We propose a measure, called support, that evaluates the importance of a term with respect to a class in the hierarchy. Given a threshold, terms with high support are selected as keywords of a class, and terms with low support are filtered out. To further improve the recall of this approach, the Mining Association Rules technique is applied to mine associations between terms. An inference model is built from these association relations and the previously computed supports of the terms in the class. To increase the recall of the keyword selection process, we then present a polynomial-time inference algorithm that promotes a term strongly associated with a known keyword to a keyword. Experimental results on Internet documents collected from the Yam search engine show that the proposed methods help refine the classification knowledge and increase the recall of keyword selection.
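The pipeline outlined in the abstract (support thresholding, then promotion of terms associated with known keywords) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not define support, so it is taken here to be the fraction of a class's documents that contain the term, and the promotion rule uses ordinary association-rule confidence. The names term_support, promote and select_keywords, the thresholds, and the sample documents are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def term_support(docs):
    """ASSUMED definition: fraction of the class's documents containing each term."""
    counts = Counter(term for doc in docs for term in set(doc))
    return {term: n / len(docs) for term, n in counts.items()}


def promote(docs, keywords, min_confidence=0.8, min_cooccurrence=2):
    """Promote terms strongly associated with a known keyword (hypothetical rule).

    A term is promoted when confidence(term -> keyword), i.e. the share of
    documents containing the term that also contain the keyword, reaches
    min_confidence and the two co-occur in at least min_cooccurrence documents.
    """
    single, pair = Counter(), Counter()
    for doc in docs:
        terms = set(doc)
        single.update(terms)
        pair.update(permutations(sorted(terms), 2))  # ordered (term, keyword) pairs

    promoted = set(keywords)
    changed = True
    # Each pass either adds a term or stops, so the loop runs at most once per
    # distinct term: polynomial in the vocabulary size.
    while changed:
        changed = False
        for (term, keyword), n in pair.items():
            if term in promoted or keyword not in promoted:
                continue
            if n >= min_cooccurrence and n / single[term] >= min_confidence:
                promoted.add(term)
                changed = True
    return promoted


def select_keywords(docs, min_support=0.7, min_confidence=0.8):
    """Threshold on support first, then widen the keyword set by association."""
    support = term_support(docs)
    seeds = {term for term, s in support.items() if s >= min_support}
    return promote(docs, seeds, min_confidence=min_confidence)


# Made-up documents for a single class, each reduced to a list of terms.
docs = [
    ["python", "code", "compiler"],
    ["python", "code"],
    ["python", "interpreter"],
    ["java", "code", "compiler"],
]
print(sorted(select_keywords(docs)))  # ['code', 'compiler', 'python']
```

In this toy run, "compiler" falls below the support threshold but always co-occurs with the keyword "code", so it is promoted; this is the kind of recall gain the abstract describes. The minimum co-occurrence count is an added guard so that terms seen only once are not promoted on a single coincidence.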
Related papers
48 in total
  • [21] Using Semantic Role Knowledge for Relevance Ranking of Key Phrases in Documents: An Unsupervised Approach
    Gantayat, Neelamadhav
    Mohapatra, Prateeti
    PROCEEDINGS OF THE 5TH JOINT INTERNATIONAL CONFERENCE ON DATA SCIENCE & MANAGEMENT OF DATA, CODS COMAD 2022, 2022, : 120 - 124
  • [22] Semantic similarity measurement based on knowledge mining: an artificial neural net approach
    Li, Wenwen
    Raskin, Robert
    Goodchild, Michael F.
    INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE, 2012, 26 (08) : 1415 - 1435
  • [23] Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach
    Wei, Yunchao
    Feng, Jiashi
    Liang, Xiaodan
    Cheng, Ming-Ming
    Zhao, Yao
    Yan, Shuicheng
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 6488 - 6496
  • [24] Versioned linking of semantic enrichment of legal documents Emerald: an implementation of knowledge-based services in a semantic web approach
    Szoke, Akos
    Forhecz, Andras
    Korosi, Gabor
    Strausz, Gyorgy
    ARTIFICIAL INTELLIGENCE AND LAW, 2013, 21 (04) : 485 - 519
  • [25] A DATA MINING APPROACH TO ASSIST DESIGN KNOWLEDGE RETRIEVAL BASED ON KEYWORD ASSOCIATIONS
    Shi, F.
    Han, J.
    Childs, P. R. N.
    DS 84: PROCEEDINGS OF THE DESIGN 2016 14TH INTERNATIONAL DESIGN CONFERENCE, VOLS 1-4, 2016, : 1125 - 1134
  • [26] A hybrid data mining approach for knowledge extraction and classification in medical databases
    Hassan, Syed Zahid
    Verma, Brijesh
    PROCEEDINGS OF THE 7TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS, 2007, : 503 - 508
  • [27] A semantic-based approach for mining undiscovered public knowledge from biomedical literature
    Hu, XH
    Li, GR
    Yoo, I
    Zhang, XD
    Xu, XH
    2005 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, VOLS 1 AND 2, 2005, : 22 - 27
  • [28] Extracting Features from Text Flows based on Semantic Similarity for Text Classification: an Approach Inspired by Audio Analysis
    Vasconcelos, Larissa Lucena
    Campelo, Claudio E. C.
    Journal of the Brazilian Computer Society, 2024, 30 (01) : 297 - 314
  • [29] Linked knowledge sources for topic classification of microposts: A semantic graph-based approach
    Varga, Andrea
    Basave, Amparo Elizabeth Cano
    Rowe, Matthew
    Ciravegna, Fabio
    He, Yulan
    JOURNAL OF WEB SEMANTICS, 2014, 26 : 36 - 57
  • [30] A semantic approach for document classification using deep neural networks and multimedia knowledge graph
    Rinaldi, Antonio M.
    Russo, Cristiano
    Tommasino, Cristian
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 169