Concept-Based Document Classification Using Wikipedia and Value Function

Cited by: 8
Authors
Malo, Pekka [1 ]
Sinha, Ankur [1 ]
Wallenius, Jyrki [1 ]
Korhonen, Pekka [1 ]
Affiliation
[1] Aalto Univ, Sch Econ, Dept Business Technol, FI-00076 Aalto, Finland
Keywords
MULTIOBJECTIVE EVOLUTIONARY ALGORITHMS; INFORMATION-RETRIEVAL; PROGRESSIVE ALGORITHM; BOOLEAN QUERIES; ONTOLOGY; SEARCH; SYSTEM;
DOI
10.1002/asi.21596
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this article, we propose a new concept-based method for document classification. The conceptual knowledge associated with the words is drawn from Wikipedia. The purpose is to utilize the abundant semantic relatedness information available in Wikipedia in an efficient value function-based query learning algorithm. The procedure learns the value function by solving a simple linear programming problem formulated using the training documents. The learning involves a step-wise iterative process that helps in generating a value function with an appropriate set of concepts (dimensions) chosen from a collection of concepts. Once the value function is formulated, it is utilized to make a decision between relevance and irrelevance. The value assigned to a particular document from the value function can be further used to rank the documents according to their relevance. Reuters newswire documents have been used to evaluate the efficacy of the procedure. An extensive comparison with other frameworks has been performed. The results are promising.
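The relevance decision and ranking steps described in the abstract can be illustrated with a minimal sketch. It assumes a linear value function over concept scores, which the abstract does not confirm. The concept names, weights, threshold, and function names below are all hypothetical. In the article the value function is learned by linear programming from training documents, and that step is not shown here.

```python
from typing import Dict, List, Tuple

ConceptVector = Dict[str, float]


def value(doc: ConceptVector, weights: ConceptVector) -> float:
    """Assumed linear value function: weighted sum of a document's concept scores."""
    return sum(weights.get(concept, 0.0) * score for concept, score in doc.items())


def classify_and_rank(
    docs: Dict[str, ConceptVector],
    weights: ConceptVector,
    threshold: float,
) -> List[Tuple[str, float, bool]]:
    """Score every document, sort by value (highest first), and flag relevance.

    A document is marked relevant when its value reaches the threshold.
    """
    scored = [(name, value(vec, weights)) for name, vec in docs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [(name, v, v >= threshold) for name, v in scored]


if __name__ == "__main__":
    # Hypothetical weights; in the article these would come from the learned value function.
    weights = {"Finance": 0.6, "Stock market": 0.4, "Sport": -0.5}
    # Hypothetical concept scores per document (e.g. relatedness to Wikipedia concepts).
    docs = {
        "d1": {"Finance": 0.5, "Stock market": 0.5},
        "d2": {"Sport": 0.8, "Finance": 0.25},
        "d3": {"Finance": 0.5},
    }
    for name, v, relevant in classify_and_rank(docs, weights, threshold=0.25):
        print(f"{name}: value={v:.2f} relevant={relevant}")
```

The same score serves both purposes named in the abstract: comparing it to a threshold gives the relevant/irrelevant decision, and sorting by it gives the ranking.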
Pages: 2496-2511
Page count: 16
Related papers
50 in total
  • [31] A Concept-based Integer Linear Programming Approach for Single-Document Summarization
    Oliveira, Hilario
    Lima, Rinaldo
    Lins, Rafael Dueire
    Freitas, Fred
    Riss, Marcelo
    Simske, Steven J.
    [J]. PROCEEDINGS OF 2016 5TH BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS 2016), 2016, : 403 - 408
  • [32] Wikipedia-based hybrid document representation for textual news classification
    Marcos Antonio Mouriño-García
    Roberto Pérez-Rodríguez
    Luis Anido-Rifón
    Manuel Vilares-Ferro
    [J]. Soft Computing, 2018, 22 : 6047 - 6065
  • [33] Wikipedia-based hybrid document representation for textual news classification
    Antonio Mourino-Garcia, Marcos
    Perez-Rodriguez, Roberto
    Anido-Rifon, Luis
    Vilares-Ferro, Manuel
    [J]. SOFT COMPUTING, 2018, 22 (18) : 6047 - 6065
  • [34] Wikipedia-Based Hybrid Document Representation for Textual News Classification
    Mourino Garcia, Marcos Antonio
    Perez Rodriguez, Roberto
    Anido Rifon, Luis
    Vilares Ferro, Manuel
    [J]. 2016 3RD INTERNATIONAL CONFERENCE ON SOFT COMPUTING & MACHINE INTELLIGENCE (ISCMI 2016), 2016, : 148 - 153
  • [35] Concept-based pages recommendation by using cluster algorithm
    Chi, Chen-Chung
    Kuo, Chin-Hwa
    Lu, Ming-Yuan
    Tsao, Nai-Lung
    [J]. 8TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED LEARNING TECHNOLOGIES, PROCEEDINGS, 2008, : 298 - 300
  • [36] Concept-based clustering of textual documents using SOM
    Amine, Abdehmalek
    Elberrichi, Zakaria
    Bellatreche, Ladjel
    Simonet, Michel
    Malki, Mimoun
    [J]. 2008 IEEE/ACS INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS, VOLS 1-3, 2008, : 156 - +
  • [37] Clinical Concept-Based Radiology Reports Classification Pipeline for Lung Carcinoma
    Mithun, Sneha
    Jha, Ashish Kumar
    Sherkhane, Umesh B.
    Jaiswar, Vinay
    Purandare, Nilendu C.
    Dekker, Andre
    Puts, Sander
    Bermejo, Inigo
    Rangarajan, V.
    Zegers, Catharina M. L.
    Wee, Leonard
    [J]. JOURNAL OF DIGITAL IMAGING, 2023, 36 (03) : 812 - 826
  • [38] A Concept-Based Image Acquisition System with User-Driven Classification
    Sotiropoulos, D. N.
    Lampropoulos, A. S.
    Tsihrintzis, G. A.
    [J]. KNOWLEDGE-BASED SOFTWARE ENGINEERING, 2012, 240 : 53 - 60
  • [39] Concept-based text mining technique for semantic classification of manufacturing suppliers
    [J]. Ameri, F. (ameri@txstate.edu), 2017, ASTM International (01):
  • [40] Using a concept-based user context for search personalization
    Daoud, Mariam
    Tamine-Lechani, Lynda
    Boughanem, Mohand
    [J]. WORLD CONGRESS ON ENGINEERING 2008, VOLS I-II, 2008, : 293 - 298