Reducing the user labeling effort in effective high recall tasks by fine-tuning active learning

Cited by: 3
Authors
Dal Bianco, Guilherme [1]
Duarte, Denio [1]
Goncalves, Marcos Andre [2]
Affiliations
[1] Univ Fed Fronteira Sul, Campus Chapeco, Chapeco, Brazil
[2] Univ Fed Minas Gerais, Dept Ciencia Comp, Belo Horizonte, Brazil
Keywords
Information retrieval; HIRE; Active learning; SSAR; Labeling process; Supervised classifier; SELECTION
DOI
10.1007/s10844-022-00772-y
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
High recall Information REtrieval (HIRE) aims at identifying (almost) all, and only, the relevant documents for a given query. HIRE is paramount in applications such as systematic literature review, medicine, and legal jurisprudence, among others. To address the HIRE goals, active learning methods have proven valuable in determining informative and non-redundant documents to reduce user effort for manual labeling. We propose a new active learning framework for the HIRE task, REVEAL-HIRE, which selects a very reduced set of documents to be labeled, significantly mitigating the user's effort. The proposed approach selects the most representative documents by exploiting a novel active learning strategy designed specifically for HIRE, called REVEAL (RelEVant rulE-based Active Learning). REVEAL aims at selecting the maximum number of relevant documents for a given query based on discriminative rule-based patterns and a penalization factor. The method is applied to the top-ranked documents to choose the most informative ones to be labeled, a hard task due to data skewness: most documents are irrelevant for a given query. The enhanced active learning process is repeated incrementally until a stopping point is reached, using REVEAL to identify the point in the process at which relevant documents should stop being sampled. Experimental results on several standard benchmark datasets (e.g., 20-Newsgroups, TREC Total Recall, and CLEF eHealth) demonstrate that REVEAL-HIRE can reduce the user labeling effort by up to 3 times (a 320% reduction) in comparison with state-of-the-art baselines while keeping the effectiveness at the highest levels.
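The incremental process summarized in the abstract (rank the unlabeled documents, ask the user to label a few top-ranked ones, update the model, stop at some point) can be illustrated with a generic sketch. This is not REVEAL or REVEAL-HIRE: the term-count scorer, the `patience` stopping rule, the toy documents, and every name below are assumptions made only for illustration, standing in for the rule-based patterns and penalization factor that the record does not detail.

```python
from collections import Counter


def score(doc, relevant_terms):
    # Hypothetical relevance score: sum of counts of the document's tokens
    # among the query terms and the documents already labeled relevant.
    return sum(relevant_terms[tok] for tok in doc.split())


def high_recall_loop(docs, query, oracle, batch_size=2, patience=1):
    """Generic pool-based active learning loop (an assumed stand-in, not REVEAL).

    oracle(i) plays the role of the user: it returns True if docs[i] is relevant.
    The loop stops after `patience` consecutive batches with no relevant document.
    """
    relevant_terms = Counter(query.split())
    unlabeled = set(range(len(docs)))
    found, labeled_count, misses = [], 0, 0
    while unlabeled and misses < patience:
        # Rank remaining documents by score (ties broken by index), take the top batch.
        ranked = sorted(unlabeled, key=lambda i: (-score(docs[i], relevant_terms), i))
        hits = 0
        for i in ranked[:batch_size]:
            unlabeled.discard(i)
            labeled_count += 1
            if oracle(i):
                found.append(i)
                relevant_terms.update(docs[i].split())  # "retrain" on the new label
                hits += 1
        misses = 0 if hits else misses + 1
    return found, labeled_count


# Toy collection (invented for this sketch); documents 0, 1 and 4 are relevant.
docs = [
    "active learning recall",
    "recall retrieval ranking",
    "cooking pasta recipe",
    "football match score",
    "learning to rank retrieval",
    "garden flowers",
    "train timetable",
    "weather forecast",
]
relevant = {0, 1, 4}
found, labeled_count = high_recall_loop(docs, "recall", lambda i: i in relevant)
```

On this toy collection the loop finds all three relevant documents after the simulated user labels six of the eight, and it never asks about the last two.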
Pages: 453-472
Page count: 20
Related papers
26 in total
  • [11] Reducing healthcare disparities using multiple multiethnic data distributions with fine-tuning of transfer learning
    Toseef, Muhammad
    Li, Xiangtao
    Wong, Ka-Chun
    BRIEFINGS IN BIOINFORMATICS, 2022, 23 (03)
  • [12] Fine-Tuning Transformer-Based Representations in Active Learning for Labelling Crisis Dataset of Tweets
    Paul N.R.
    Balabantaray R.C.
    Sahoo D.
    SN Computer Science, 4 (5)
  • [13] Learning Task-Specific Initialization for Effective Federated Continual Fine-Tuning of Foundation Model Adapters
    Peng, Danni
    Wang, Yuan
    Fu, Huazhu
    Wee, Qingsong
    Liu, Yong
    Goh, Rick Siow Mong
    2024 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI 2024, 2024, : 811 - 816
  • [14] Active Learning for Identifying Disaster-Related Tweets: A Comparison with Keyword Filtering and Generic Fine-Tuning
    Hanny, David
    Schmidt, Sebastian
    Resch, Bernd
    INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 2, INTELLISYS 2024, 2024, 1066 : 126 - 142
  • [15] Warm Start Active Learning with Proxy Labels and Selection via Semi-supervised Fine-Tuning
    Nath, Vishwesh
    Yang, Dong
    Roth, Holger R.
    Xu, Daguang
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT VIII, 2022, 13438 : 297 - 308
  • [16] Annotating Data for Fine-Tuning a Neural Ranker? Current Active Learning Strategies are not Better than Random Selection
    Althammer, Sophia
    Zuccon, Guido
    Hofstaetter, Sebastian
    Verberne, Suzan
    Hanbury, Allan
    ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL IN THE ASIA PACIFIC REGION, SIGIR-AP 2023, 2023, : 139 - 149
  • [17] Designing a Fine-Tuning Tool for Machine Learning with High-Speed and Low-Power Processing
    Sato, Tomoaki
    Chivapreecha, Sorawat
    Higuchi, Kohji
    Moungnoul, Phichet
    2018 18TH INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES (ISCIT), 2018, : 204 - 207
  • [18] Internet of Medical Things: An Effective and Fully Automatic IoT Approach Using Deep Learning and Fine-Tuning to Lung CT Segmentation
    de Freitas Souza, Luis Fabricio
    Lima Silva, Iagson Carlos
    Marques, Adriell Gomes
    Silva, Francisco Hercules dos S.
    Nunes, Virginia Xavier
    Hassan, Mohammad Mehedi
    de Albuquerque, Victor Hugo C.
    Reboucas Filho, Pedro P.
    SENSORS, 2020, 20 (23) : 1 - 24
  • [19] ACTIVE LEARNING GUIDED FINE-TUNING FOR ENHANCING SELF-SUPERVISED BASED MULTI-LABEL CLASSIFICATION OF REMOTE SENSING IMAGES
    Moellenbrok, Lars
    Demir, Beguem
    IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2023, : 4986 - 4989
  • [20] Secure Federated Learning Across Heterogeneous Cloud and High-Performance Computing Resources: A Case Study on Federated Fine-Tuning of LLaMA 2
    Li, Zilinghan
    He, Shilan
    Chaturvedi, Pranshu
    Kindratenko, Volodymyr
    Huerta, Eliu A.
    Kim, Kibaek
    Madduri, Ravi
    COMPUTING IN SCIENCE & ENGINEERING, 2024, 26 (03) : 52 - 58