On Text-based Mining with Active Learning and Background Knowledge Using SVM

Cited by: 0
Authors
Catarina Silva
Bernardete Ribeiro
Affiliations
[1] CISUC – Departamento de Engenharia Informática, Universidade de Coimbra
[2] Escola Superior de Tecnologia e Gestão, Instituto Politécnico de Leiria
Source
Soft Computing | 2007 / Vol. 11
Keywords
Text mining; Partially labeled data; Support vector machines
DOI
Not available
Abstract
Text mining, intelligent text analysis, text data mining and knowledge discovery in text are commonly used names for the process of extracting relevant and non-trivial information from text. Two crucial issues arise when trying to solve this problem: document representation and the scarcity of labeled data. This paper addresses these problems by introducing information from unlabeled documents into the training set, using the support vector machine (SVM) separating margin as the differentiating factor. Besides studying the influence of several pre-processing methods and drawing conclusions on their relative significance, we also evaluate the benefits of introducing background knowledge in an SVM text classifier. We further evaluate the possibility of active learning and propose a method for successfully combining background knowledge and active learning. Experimental results show that the proposed techniques, whether used alone or combined, yield a considerable improvement in classification performance, even when only small labeled training sets are available.
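The abstract does not give the selection rule itself, so the following is only a minimal sketch of one way the SVM margin could act as the differentiating factor for unlabeled documents. It assumes the signed distances to the hyperplane are already available (for example from scikit-learn's `SVC.decision_function`). The function name `split_by_margin`, the threshold `confident_margin`, and the budget `n_queries` are hypothetical and are not taken from the paper. Documents far from the hyperplane are pseudo-labeled with the classifier's own prediction (background knowledge), and the documents closest to it are set aside for a human to label (active learning).

```python
from typing import List, Sequence, Tuple


def split_by_margin(
    decision_values: Sequence[float],
    confident_margin: float = 1.0,  # hypothetical threshold, not from the paper
    n_queries: int = 2,             # hypothetical labeling budget per round
) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Partition unlabeled documents by signed distance to the SVM hyperplane.

    Returns (pseudo_labeled, to_query):
      pseudo_labeled: (index, label) pairs where |f(x)| >= confident_margin,
                      labeled +1 or -1 according to the sign of f(x).
      to_query:       indices of the n_queries remaining documents that lie
                      closest to the hyperplane (most uncertain first).
    """
    pseudo_labeled: List[Tuple[int, int]] = []
    uncertain: List[int] = []
    for i, f in enumerate(decision_values):
        if abs(f) >= confident_margin:
            # Confidently classified: reuse the classifier's own label.
            pseudo_labeled.append((i, 1 if f > 0 else -1))
        else:
            uncertain.append(i)
    # Most uncertain documents (smallest |f(x)|) are queried first.
    uncertain.sort(key=lambda i: abs(decision_values[i]))
    return pseudo_labeled, uncertain[:n_queries]


# Made-up decision values for six unlabeled documents.
scores = [1.8, -0.1, 0.4, -2.3, 0.05, -0.9]
pseudo, queries = split_by_margin(scores)
print(pseudo)   # documents added to the training set with predicted labels
print(queries)  # documents to send to a human annotator
```

In a full loop, the pseudo-labeled and human-labeled documents would be added to the training set and the SVM retrained before the next round. How many rounds to run and how to set the threshold are design choices this sketch leaves open.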
Pages: 519-530
Page count: 11