Multicore based least confidence query sampling strategy to speed up active learning approach for named entity recognition

Cited by: 0

Authors:

Ankit Agrawal

Sarsij Tripathi

Manu Vardhan

Affiliations:

[1] National Institute of Technology Raipur

[2] Motilal Nehru National Institute of Technology Allahabad

Source:

Computing | 2023 / Vol. 105

Keywords:

Least confidence; Active learning; Named entity recognition; Speed up; 68U15; 68U01; 68W10; 68T50

DOI:

Not available

Abstract:

In the present era, large amounts of new data are readily available from different sources to collect and store. One of the main problems is labeling these new data correctly for various machine learning applications. Active learning is a special case of machine learning that is widely used to address this problem by significantly reducing the need for labeled data. It aims to select the most appropriate samples from the unlabeled data, which are then labeled by the oracle and passed to the active learner to train it incrementally. Several query sampling strategies exist for selecting the appropriate samples. One of the main problems with active learning is that it is very time-consuming. In this research work, a new multi-core-based algorithm is therefore proposed to speed up the active learning approach by utilizing the complete computational resources present in the system. The experiments were performed on the problem of named entity recognition, which deals with labeling sequences of words in unstructured text by classifying them into pre-existing categories. The proposed algorithm is evaluated in terms of both performance and execution time over three named entity recognition corpora from distinct biomedical domains. The evaluation results show a considerable improvement in execution time for the proposed active learning algorithm compared with the existing active learning approach.
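The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the general idea: score each unlabeled sentence by least confidence and spread the scoring over several processes. It is not the authors' implementation. The function names, the `workers` parameter, and the input layout (one list of per-token label distributions per sentence) are assumptions made for this example, as is the simplification that the probability of the best labeling is the product of the per-token maxima.

```python
from concurrent.futures import ProcessPoolExecutor
from math import prod


def least_confidence(sentence):
    """Return 1 - (approximate probability of the most likely labeling)."""
    # sentence: list of per-token label distributions (lists of probabilities).
    # Simplifying assumption: tokens are treated as independent, so the
    # best-sequence probability is the product of the per-token maxima.
    return 1.0 - prod(max(token_probs) for token_probs in sentence)


def score_chunk(chunk):
    """Score one chunk of (index, sentence) pairs; may run in a worker process."""
    return [(idx, least_confidence(sentence)) for idx, sentence in chunk]


def select_queries(pool_probs, batch_size, workers=1):
    """Return the indices of the `batch_size` least confident sentences."""
    indexed = list(enumerate(pool_probs))
    n_chunks = max(1, min(workers, len(indexed)))
    # Strided split so each worker receives a similar share of the pool.
    chunks = [indexed[i::n_chunks] for i in range(n_chunks)]
    if workers > 1:
        # Hypothetical parallel path: one chunk per worker process.
        with ProcessPoolExecutor(max_workers=workers) as executor:
            scored_chunks = list(executor.map(score_chunk, chunks))
    else:
        scored_chunks = [score_chunk(chunk) for chunk in chunks]
    scored = [pair for chunk in scored_chunks for pair in chunk]
    # Highest uncertainty first; ties are broken by original index.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [idx for idx, _ in scored[:batch_size]]
```

With `workers=1` the scoring runs serially, which is convenient for checking results. On platforms that start worker processes by spawning a fresh interpreter, a call with `workers > 1` should sit under an `if __name__ == "__main__":` guard, and `os.cpu_count()` is one way to choose the worker count.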

Pages: 979-997

Page count: 18
