Active Learning With Sampling by Uncertainty and Density for Data Annotations

Cited by: 87
Authors
Zhu, Jingbo [1, 2]
Wang, Huizhen [1, 2]
Tsou, Benjamin K. [3]
Ma, Matthew [4]
Affiliations
[1] Northeastern Univ, Minist Educ, Key Lab Med Image Comp, Shenyang 110004, Peoples R China
[2] Northeastern Univ, Nat Language Proc Lab, Shenyang 110004, Peoples R China
[3] City Univ Hong Kong, Language Informat Sci Res Ctr, Hong Kong, Hong Kong, Peoples R China
[4] Sci Works, Princeton Jct, NJ 08550 USA
Funding
U.S. National Science Foundation;
Keywords
Active learning; density-based re-ranking; sampling by uncertainty and density; text classification; uncertainty sampling; word sense disambiguation (WSD);
DOI
10.1109/TASL.2009.2033421
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
To solve the knowledge bottleneck problem, active learning has been widely used for its ability to automatically select the most informative unlabeled examples for human annotation. One of the key enabling techniques of active learning is uncertainty sampling, which uses one classifier to identify unlabeled examples with the least confidence. Uncertainty sampling often presents problems when outliers are selected. To solve the outlier problem, this paper presents two techniques, sampling by uncertainty and density (SUD) and density-based re-ranking. Both techniques prefer not only the most informative example in terms of uncertainty criterion, but also the most representative example in terms of density criterion. Experimental results of active learning for word sense disambiguation and text classification tasks using six real-world evaluation data sets demonstrate the effectiveness of the proposed methods.
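The idea of preferring examples that are both uncertain and representative can be illustrated with a minimal sketch. The abstract does not give the scoring formulas, so everything specific here is an assumption made for illustration: entropy as the uncertainty measure, average cosine similarity to the k most similar pool examples as the density measure, and their product as the combined score. The function names (`entropy`, `knn_density`, `sud_select`) and the toy data are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = less confident)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_density(i, vectors, k):
    """Assumed density: mean cosine similarity of example i to its k most similar neighbors."""
    sims = sorted(
        (cosine(vectors[i], v) for j, v in enumerate(vectors) if j != i),
        reverse=True,
    )
    top = sims[:k]
    return sum(top) / len(top) if top else 0.0


def sud_select(vectors, probs, k=2):
    """Return (best index, all scores) under the assumed score = uncertainty * density."""
    scores = [entropy(p) * knn_density(i, vectors, k) for i, p in enumerate(probs)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy unlabeled pool: indices 0-2 form a cluster, index 3 is an outlier.
vectors = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0], [0.0, 1.0]]
# Hypothetical classifier posteriors for each pool example.
probs = [[0.6, 0.4], [0.55, 0.45], [0.9, 0.1], [0.5, 0.5]]

# Plain uncertainty sampling looks only at entropy.
uncertainty_pick = max(range(len(probs)), key=lambda i: entropy(probs[i]))
# The combined score also rewards examples in dense regions.
sud_pick, scores = sud_select(vectors, probs, k=2)

print("uncertainty sampling picks:", uncertainty_pick)
print("uncertainty * density picks:", sud_pick)
```

On this toy pool, plain uncertainty sampling selects the outlier (index 3) because its posterior is the flattest, while the combined score selects index 1, an uncertain example inside the cluster. A re-ranking variant could instead shortlist the most uncertain examples first and then order the shortlist by density; that too would be an assumption about details the abstract leaves open.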
Pages: 1323 - 1331
Number of pages: 9