Learning from Positive and Unlabeled Data with Arbitrary Positive Shift

Cited by: 0
Authors
Hammoudeh, Zayd [1]
Lowd, Daniel [1]
Affiliations
[1] Univ Oregon, Dept Comp & Informat Sci, Eugene, OR 97403 USA
Keywords
COVARIATE SHIFT;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Positive-unlabeled (PU) learning trains a binary classifier using only positive and unlabeled data. A common simplifying assumption is that the positive data is representative of the target positive class. This assumption rarely holds in practice due to temporal drift, domain shift, and/or adversarial manipulation. This paper shows that PU learning is possible even with arbitrarily non-representative positive data, given unlabeled data from the source and target distributions. Our key insight is that only the negative class's distribution need be fixed. We integrate this into two statistically consistent methods to address arbitrary positive bias: one approach combines negative-unlabeled learning with unlabeled-unlabeled learning, while the other uses a novel, recursive risk estimator. Experimental results demonstrate our methods' effectiveness across numerous real-world datasets and forms of positive bias, including disjoint positive class-conditional supports. Additionally, we propose a general, simplified approach to address PU risk estimation overfitting.
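For readers unfamiliar with PU risk estimation, the following is a minimal sketch of the standard non-negative PU risk estimate from the broader PU literature. It is not the method of this paper: it relies on the very assumption the abstract questions, namely that the labeled positives are representative of the target positive class. The function names, the sigmoid loss, and the `prior` argument (the positive class prior, assumed known) are illustrative choices, not taken from the record above.

```python
import math


def sigmoid_loss(score, label):
    """Sigmoid loss l(z, y) = 1 / (1 + exp(y * z)) for y in {+1, -1}."""
    return 1.0 / (1.0 + math.exp(label * score))


def mean(values):
    return sum(values) / len(values)


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk estimate computed from classifier scores.

    pos_scores: scores g(x) on labeled positive examples
    unl_scores: scores g(x) on unlabeled examples
    prior: assumed positive class prior, a value in (0, 1)
    """
    risk_pos_as_pos = mean([sigmoid_loss(s, +1) for s in pos_scores])
    risk_pos_as_neg = mean([sigmoid_loss(s, -1) for s in pos_scores])
    risk_unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])
    # The unlabeled set mixes positives (weight prior) and negatives, so
    # subtracting the positive share estimates the negative-class risk term.
    neg_risk = risk_unl_as_neg - prior * risk_pos_as_neg
    # Clamp at zero: a negative estimate is a symptom of overfitting.
    return prior * risk_pos_as_pos + max(0.0, neg_risk)
```

The clamp in the last line is one common guard against the overfitting that the abstract's final sentence refers to; how the paper's own simplified approach handles it is not stated in this record.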
Pages: 12
Related papers
50 in total
  • [31] Global and local learning from positive and unlabeled examples
    Ke, Ting
    Jing, Ling
    Lv, Hui
    Zhang, Lidong
    Hu, Yaping
    [J]. APPLIED INTELLIGENCE, 2018, 48 (08) : 2373 - 2392
  • [32] Phonocardiogram Classification by Learning From Positive and Unlabeled Examples
    Nehary, Ebrahim A.
    Rajan, Sreeraman
    [J]. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 14
  • [33] Classification from Positive, Unlabeled and Biased Negative Data
    Hsieh, Yu-Guan
    Niu, Gang
    Sugiyama, Masashi
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [34] Class Prior Estimation from Positive and Unlabeled Data
    Du Plessis, Marthinus Christoffel
    Sugiyama, Masashi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2014, E97D (05) : 1358 - 1362
  • [35] CLASSIFICATION FROM ONLY POSITIVE AND UNLABELED FUNCTIONAL DATA
    Terada, Yoshikazu
    Ogasawara, Issei
    Nakata, Ken
    [J]. ANNALS OF APPLIED STATISTICS, 2020, 14 (04) : 1724 - 1742
  • [36] Conditional generative positive and unlabeled learning
    Papic, Ales
    Kononenko, Igor
    Bosnic, Zoran
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2023, 224
  • [37] Efficient Training for Positive Unlabeled Learning
    Sansone, Emanuele
    De Natale, Francesco G. B.
    Zhou, Zhi-Hua
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2019, 41 (11) : 2584 - 2598
  • [38] Bayesian Classifiers for Positive Unlabeled Learning
    He, Jiazhen
    Zhang, Yang
    Li, Xue
    Wang, Yong
    [J]. WEB-AGE INFORMATION MANAGEMENT, 2011, 6897 : 81 - +
  • [39] On Positive and Unlabeled Learning for Text Classification
    Nagy, Istvan T.
    Farkas, Richard
    Csirik, Janos
    [J]. TEXT, SPEECH AND DIALOGUE, TSD 2011, 2011, 6836 : 219 - 226
  • [40] Positive and unlabeled examples help learning
    De Comité, F
    Denis, F
    Gilleron, R
    Letouzey, F
    [J]. ALGORITHMIC LEARNING THEORY, PROCEEDINGS, 1999, 1720 : 219 - 230