Semi-supervised text classification using positive and unlabeled data

Cited by: 0
Authors
Yu, Shuang [1]
Zhou, Xueyuan [1]
Li, Chunping [1]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
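Keywords
text classification; positive and unlabeled data; graph-based method
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract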
Text classification using positive and unlabeled data (PU learning) is the problem of building a text classifier from positive documents (P) of one class and a set of unlabeled documents (U) that mixes positive documents with negative documents from many other classes. Some existing methods solve the PU-learning problem by building a classifier in a two-step process. These methods generally do not perform well when P is too small. In this paper, we propose an improved method for the PU-learning problem with a small P. The method combines graph-based semi-supervised learning with the two-step method. Experiments indicate that the improved method performs well when P is small.
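The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a graph-based step could be combined with a two-step PU scheme, not the authors' method. It assumes cosine similarity over bag-of-words counts as the graph, a simple clamped score propagation for step 1 (picking the lowest-scoring unlabeled documents as reliable negatives), and a nearest-centroid classifier for step 2. All function names, the `negative_fraction` parameter, and the toy documents are hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words term counts for a whitespace-tokenised document."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def propagate(vectors, positive_idx, iterations=20):
    """Spread positive scores over a cosine-similarity graph.

    Positive documents are clamped to 1.0; every other node repeatedly
    takes the similarity-weighted average of its neighbours' scores.
    """
    n = len(vectors)
    w = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    scores = [1.0 if i in positive_idx else 0.0 for i in range(n)]
    for _ in range(iterations):
        new = []
        for i in range(n):
            if i in positive_idx:
                new.append(1.0)
                continue
            total = sum(w[i])
            weighted = sum(w[i][j] * scores[j] for j in range(n))
            new.append(weighted / total if total else 0.0)
        scores = new
    return scores


def centroid(vecs):
    """Sum of count vectors (cosine ignores the missing 1/n scaling)."""
    total = Counter()
    for v in vecs:
        total.update(v)
    return total


def pu_classify(positives, unlabeled, negative_fraction=0.5):
    """Return one boolean per unlabeled document: True means positive."""
    docs = positives + unlabeled
    vectors = [bow(d) for d in docs]
    pos_idx = set(range(len(positives)))
    scores = propagate(vectors, pos_idx)
    # Step 1 (assumed): lowest propagated scores become reliable negatives.
    unl_idx = sorted(range(len(positives), len(docs)),
                     key=lambda i: scores[i])
    k = max(1, int(len(unl_idx) * negative_fraction))
    reliable_neg = unl_idx[:k]
    # Step 2 (assumed): nearest-centroid decision on P vs reliable negatives.
    pos_c = centroid(vectors[i] for i in pos_idx)
    neg_c = centroid(vectors[i] for i in reliable_neg)
    return [cosine(v, pos_c) > cosine(v, neg_c)
            for v in vectors[len(positives):]]


positives = ["graph learning text classifier"]
unlabeled = [
    "text classifier with graph kernel",
    "football match score goal",
    "goal keeper football league",
    "learning a text classifier",
]
labels = pu_classify(positives, unlabeled)
```

With a single positive document, the propagation step is what lets the sketch separate the unlabeled set: documents with no path to P in the similarity graph keep a score of zero and serve as negatives for the second step. A real system would likely use TF-IDF weights, a sparse k-nearest-neighbour graph, and a stronger step-2 classifier.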
Pages: 249-254
Number of pages: 6