Reducing the user labeling effort in effective high recall tasks by fine-tuning active learning

Cited by: 3

Authors:

Dal Bianco, Guilherme [1]

Duarte, Denio [1]

Goncalves, Marcos Andre [2]

Affiliations:

[1] Univ Fed Fronteira Sul, Campus Chapeco, Chapeco, Brazil

[2] Univ Fed Minas Gerais, Dept Ciencia Comp, Belo Horizonte, Brazil

Source:

JOURNAL OF INTELLIGENT INFORMATION SYSTEMS | 2023 / Volume 61 / Issue 02

Keywords:

Information retrieval; HIRE; Active learning; SSAR; Labeling process; Supervised classifier; SELECTION

DOI:

10.1007/s10844-022-00772-y

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

High recall Information REtrieval (HIRE) aims at identifying only and (almost) all relevant documents for a given query. HIRE is paramount in applications such as systematic literature reviews, medicine, and legal jurisprudence, among others. To address the HIRE goals, active learning methods have proven valuable in determining informative and non-redundant documents, reducing the user effort spent on manual labeling. We propose a new active learning framework for the HIRE task, REVEAL-HIRE, which selects a much smaller set of documents to be labeled and so significantly reduces the user's effort. The proposed approach selects the most representative documents by exploiting a novel active learning strategy designed specifically for HIRE, called REVEAL (RelEVant rulE-based Active Learning). REVEAL aims at selecting the maximum number of relevant documents for a given query based on discriminative rule-based patterns and a penalization factor. The method is applied to the top-ranked documents to choose the most informative ones to be labeled, a hard task due to data skewness, since most documents are irrelevant for a given query. The enhanced active learning process is repeated incrementally until a stopping point is reached, with REVEAL used to identify the point at which relevant documents should stop being sampled. Experimental results on several standard benchmark datasets (e.g., 20-Newsgroups, TREC Total Recall, and CLEF eHealth) demonstrate that REVEAL-HIRE can reduce the user labeling effort by up to 3 times (320% reduction) in comparison with state-of-the-art baselines while keeping effectiveness at the highest levels.
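
The incremental process the abstract describes (rank, label the top documents, update, stop) can be illustrated with a minimal Python sketch. This is not the REVEAL-HIRE method: the term-vote scorer stands in for the rule-based patterns and penalization factor, and the stopping rule (halt after `patience` consecutive batches with no relevant document) is an assumption made for illustration. All names (`score`, `active_learning_loop`, `oracle`, `batch_size`, `patience`) are hypothetical.

```python
from collections import Counter


def score(doc, relevant_terms, irrelevant_terms):
    """Toy relevance score: votes from terms seen in relevant documents
    minus votes from terms seen in irrelevant ones (assumed scorer)."""
    return sum(relevant_terms[t] - irrelevant_terms[t] for t in set(doc.split()))


def active_learning_loop(docs, oracle, seed_query, batch_size=2, patience=2):
    """Ask the user (oracle) to label the top-ranked unlabeled documents in
    batches; stop after `patience` consecutive batches without a relevant one."""
    relevant_terms = Counter(seed_query.split())
    irrelevant_terms = Counter()
    unlabeled = set(range(len(docs)))
    found, labeled_count, empty_batches = [], 0, 0

    while unlabeled and empty_batches < patience:
        # Re-rank the remaining pool with the current term statistics;
        # ties are broken by document index to keep the run deterministic.
        ranked = sorted(
            unlabeled,
            key=lambda i: (-score(docs[i], relevant_terms, irrelevant_terms), i),
        )
        hits = 0
        for i in ranked[:batch_size]:
            unlabeled.remove(i)
            labeled_count += 1  # one unit of user labeling effort
            terms = set(docs[i].split())
            if oracle(i):
                found.append(i)
                relevant_terms.update(terms)
                hits += 1
            else:
                irrelevant_terms.update(terms)
        empty_batches = 0 if hits else empty_batches + 1

    return found, labeled_count


if __name__ == "__main__":
    docs = [
        "active learning recall retrieval",
        "recall retrieval review",
        "football match score",
        "cooking pasta recipe",
        "learning retrieval ranking",
        "weather forecast rain",
        "stock market prices",
        "movie review cinema",
    ]
    relevant = {0, 1, 4}  # ground truth known only to the simulated user
    found, effort = active_learning_loop(
        docs, lambda i: i in relevant, "retrieval", batch_size=2, patience=1
    )
    print(found, effort)
```

The returned `labeled_count` plays the role of the user labeling effort that the abstract seeks to minimize, while `found` measures recall against the set of relevant documents.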

Pages: 453-472

Number of pages: 20
