On Text-based Mining with Active Learning and Background Knowledge Using SVM

Cited by: 0
Authors
Catarina Silva
Bernardete Ribeiro
Affiliations
[1] CISUC – Departamento de Engenharia Informática, Universidade de Coimbra
[2] Escola Superior de Tecnologia e Gestão, Instituto Politécnico de Leiria
Source
Soft Computing | 2007 / Vol. 11, No. 6
Keywords
Text mining; Partially labeled data; Support vector machines
DOI
Not available
Abstract
Text mining, intelligent text analysis, text data mining and knowledge discovery in text are commonly used names for the process of extracting relevant and non-trivial information from text. Some crucial issues arise when trying to solve this problem, such as document representation and the scarcity of labeled data. This paper addresses these problems by introducing information from unlabeled documents into the training set, using the support vector machine (SVM) separating margin as the differentiating factor. Besides studying the influence of several pre-processing methods and drawing conclusions on their relative significance, we also evaluate the benefits of introducing background knowledge in an SVM text classifier. We further evaluate the possibility of active learning and propose a method for successfully combining background knowledge and active learning. Experimental results show that the proposed techniques, whether used alone or combined, considerably improve classification performance, even when only small labeled training sets are available.
Pages: 519-530
Page count: 11
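
Illustrative sketch (not the authors' implementation): the abstract says the SVM separating margin decides how unlabeled documents are used. One plausible reading, assumed here, is that unlabeled documents scored on or outside the margin are added to the training set with their predicted label (background knowledge), while documents scored close to the hyperplane are sent to a human annotator (active learning). The function names, the thresholds `confident` and `uncertain`, and the simple subgradient trainer standing in for a real SVM solver are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def decision(w: List[float], b: float, x: Vector) -> float:
    """Signed score w.x + b of a linear classifier; |score| grows with distance from the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(X: List[Vector], y: List[int], lam: float = 0.01,
                     lr: float = 0.1, epochs: int = 200) -> Tuple[List[float], float]:
    """Deterministic subgradient descent on the L2-regularized hinge loss.

    This is only a stand-in for a proper SVM solver (assumption of this sketch).
    Labels in y must be +1 or -1.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # The hinge loss is active when the example lies inside the margin.
            violated = yi * decision(w, b, xi) < 1.0
            for j in range(len(w)):
                grad = lam * w[j] - (yi * xi[j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * yi
    return w, b


def split_by_margin(w: List[float], b: float, unlabeled: List[Vector],
                    confident: float = 1.0, uncertain: float = 0.5):
    """Partition unlabeled vectors by |score| (both thresholds are hypothetical).

    Returns (self_labeled, to_query):
      self_labeled: (vector, predicted label) pairs with |score| >= confident,
                    candidates for background knowledge.
      to_query:     vectors with |score| <= uncertain, candidates for a
                    human label (active learning).
    Vectors that fall between the two thresholds are left untouched.
    """
    self_labeled = []
    to_query = []
    for x in unlabeled:
        score = decision(w, b, x)
        if abs(score) >= confident:
            self_labeled.append((x, 1 if score > 0 else -1))
        elif abs(score) <= uncertain:
            to_query.append(x)
    return self_labeled, to_query


if __name__ == "__main__":
    # Toy two-dimensional "documents"; real inputs would be term-weight vectors.
    X = [[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]]
    y = [1, 1, -1, -1]
    w, b = train_linear_svm(X, y)
    pool = [[4.0, 0.0], [-4.0, 1.0], [0.1, 0.0]]
    self_labeled, to_query = split_by_margin(w, b, pool)
    print("add with predicted label:", self_labeled)
    print("ask annotator:", to_query)
```

In a full loop under these assumptions, the self-labeled pairs and the newly annotated documents would be appended to the training set and the classifier retrained; how many rounds to run and where to place the thresholds are choices the abstract does not specify.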