Reducing the user labeling effort in effective high recall tasks by fine-tuning active learning

Cited by: 3
Authors
Dal Bianco, Guilherme [1 ]
Duarte, Denio [1 ]
Goncalves, Marcos Andre [2 ]
Affiliations
[1] Univ Fed Fronteira Sul, Campus Chapeco, Chapeco, Brazil
[2] Univ Fed Minas Gerais, Dept Ciencia Comp, Belo Horizonte, Brazil
Keywords
Information retrieval; HIRE; Active learning; SSAR; Labeling process; Supervised classifier; Selection
DOI
10.1007/s10844-022-00772-y
CLC number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
High recall Information REtrieval (HIRE) aims at identifying (almost) all, and only, the relevant documents for a given query. HIRE is paramount in applications such as systematic literature reviews, medicine, and legal jurisprudence. To meet the HIRE goals, active learning methods have proven valuable for selecting informative, non-redundant documents, reducing the user's manual labeling effort. We propose a new active learning framework for the HIRE task, REVEAL-HIRE, which selects a very reduced set of documents to be labeled, significantly mitigating the user's effort. The proposed approach selects the most representative documents by exploiting a novel active learning strategy specifically designed for HIRE, called REVEAL (RelEVant rulE-based Active Learning). REVEAL aims at selecting the maximum number of relevant documents for a given query based on discriminative rule-based patterns and a penalization factor. The method is applied to the top-ranked documents to choose the most informative ones to be labeled, a hard task due to data skewness: most documents are irrelevant for a given query. The enhanced active learning process is repeated incrementally until a stopping point is reached, with REVEAL identifying the point at which relevant documents should stop being sampled. Experimental results on several standard benchmark datasets (e.g., 20-Newsgroups, TREC Total Recall, and CLEF eHealth) demonstrate that REVEAL-HIRE can reduce the user labeling effort by up to 3 times (a 320% reduction) compared with state-of-the-art baselines while keeping effectiveness at the highest levels.
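The incremental labeling process the abstract describes (rank, label the top documents, repeat until a stopping point) can be sketched as a minimal loop. Everything below is an illustrative assumption: the overlap-based scoring function, the batch size, the seed terms, and the "no new relevant document in a batch" stopping rule are toy stand-ins, not the paper's REVEAL components.

```python
# Illustrative sketch (assumed names, not the authors' implementation):
# an incremental active-learning loop for a high-recall task. Each round
# ranks the unlabeled documents, asks the "user" (oracle) to label the top
# batch, and stops once a whole batch yields no new relevant document,
# a crude stand-in for REVEAL's rule-based stopping point.

def toy_score(text, relevant_terms):
    """Score a document by term overlap with the labeled-relevant vocabulary."""
    return len(set(text.split()) & relevant_terms)

def active_learning_loop(corpus, oracle, seed_terms, batch_size=2):
    """corpus: {doc_id: text}; oracle: doc_id -> bool (the human labeler)."""
    relevant_terms = set(seed_terms)
    labeled, found = set(), []
    while len(labeled) < len(corpus):
        unlabeled = [d for d in corpus if d not in labeled]
        # Rank by the toy score and send the top batch for labeling.
        unlabeled.sort(key=lambda d: toy_score(corpus[d], relevant_terms),
                       reverse=True)
        new_relevant = 0
        for doc_id in unlabeled[:batch_size]:
            labeled.add(doc_id)
            if oracle(doc_id):               # user judges relevance
                found.append(doc_id)
                relevant_terms |= set(corpus[doc_id].split())
                new_relevant += 1
        if new_relevant == 0:                # stopping point reached
            break
    return found, len(labeled)

corpus = {
    "d1": "recall retrieval systematic review",
    "d2": "active learning recall retrieval",
    "d3": "cooking recipes pasta",
    "d4": "gardening tips spring",
    "d5": "learning rules for retrieval",
    "d6": "travel photos mountains",
    "d7": "stock market news",
}
truth = {"d1": True, "d2": True, "d3": False, "d4": False,
         "d5": True, "d6": False, "d7": False}
found, effort = active_learning_loop(corpus, truth.get, {"recall", "retrieval"})
print(found, effort)  # all 3 relevant docs found after labeling only 6 of 7
```

The point of the sketch is the trade-off the abstract measures: the stopping rule ends the loop before every document is labeled, so the user's effort (here, 6 labels instead of 7) shrinks while recall stays complete.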
Pages: 453 - 472 (20 pages)
Related papers
26 records in total
  • [1] Reducing the user labeling effort in effective high recall tasks by fine-tuning active learning
    Dal Bianco, Guilherme
    Duarte, Denio
    Gonçalves, Marcos André
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2023, 61 : 453 - 472
  • [2] Active Learning Methodology in LLMs Fine-tuning
    Ceravolo, Paolo
    Mohammadi, Fatemeh
    Tamborini, Marta Annamaria
    2024 IEEE INTERNATIONAL CONFERENCE ON CYBER SECURITY AND RESILIENCE, CSR, 2024, : 743 - 749
  • [3] Active Learning for Effectively Fine-Tuning Transfer Learning to Downstream Task
    Abul Bashar, Md
    Nayak, Richi
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2021, 12 (02)
  • [4] How Fine-Tuning Allows for Effective Meta-Learning
    Chua, Kurtland
    Lei, Qi
    Lee, Jason D.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [5] Medical Image Grading Learning Based on Active and Incremental Fine-Tuning
    Su, Zhuo
    Hu, Jiwei
    Liu, Quan
    Deng, Jiamei
    ELEVENTH INTERNATIONAL CONFERENCE ON DIGITAL IMAGE PROCESSING (ICDIP 2019), 2019, 11179
  • [6] SciDeBERTa: Learning DeBERTa for Science Technology Documents and Fine-Tuning Information Extraction Tasks
    Jeong, Yuna
    Kim, Eunhui
    IEEE ACCESS, 2022, 10 : 60805 - 60813
  • [7] High Accuracy Arrhythmia Classification using Transfer Learning with Fine-Tuning
    Aphale, Sayli
    Jha, Anshul
    John, Eugene
    2022 IEEE 13TH ANNUAL UBIQUITOUS COMPUTING, ELECTRONICS & MOBILE COMMUNICATION CONFERENCE (UEMCON), 2022, : 480 - 487
  • [8] LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks
    Dinh, Tuan
    Zeng, Yuchen
    Zhang, Ruisu
    Lin, Ziqian
    Gira, Michael
    Rajput, Shashank
    Sohn, Jy-Yong
    Papailiopoulos, Dimitris
    Lee, Kangwook
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [9] An Effective Ensemble Convolutional Learning Model with Fine-Tuning for Medicinal Plant Leaf Identification
    Hajam, Mohd Asif
    Arif, Tasleem
    Khanday, Akib Mohi Ud Din
    Neshat, Mehdi
    INFORMATION, 2023, 14 (11)
  • [10] Pre-training Fine-tuning data Enhancement method based on active learning
    Cao, Deqi
    Ding, Zhaoyun
    Wang, Fei
    Ma, Haoyang
    2022 IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM, 2022, : 1447 - 1454