Using Pre-trained Language Model to Enhance Active Learning for Sentence Matching

Cited by: 0
Authors
Bai, Guirong [1 ,2 ]
He, Shizhu [1 ,2 ]
Liu, Kang [1 ,2 ]
Zhao, Jun [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Natl Lab Pattern Recognit, Inst Automat, 95 Zhongguancun East Rd, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, 95 Zhongguancun East Rd, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Sentence matching; active learning; pre-trained language model;
DOI
10.1145/3480937
CLC number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Active learning is an effective method to substantially alleviate the expensive annotation cost of data-driven models. Recently, pre-trained language models have been shown to be powerful for learning language representations. In this article, we demonstrate that a pre-trained language model can also use its learned textual characteristics to enrich the criteria of active learning. Specifically, we use the pre-trained language model to provide extra textual criteria for measuring instances, including noise, coverage, and diversity. With these extra textual criteria, we can select more informative instances for annotation and obtain better results. We conduct experiments on both English and Chinese sentence matching datasets. The experimental results show that the proposed active learning approach is enhanced by the pre-trained language model and achieves better performance.
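The abstract does not give implementation details, so the following is only a minimal sketch of how PLM-derived criteria can be mixed into an active-learning selection step. It assumes sentence-pair embeddings (e.g., BERT [CLS] vectors) are already computed; the function names (select_batch, diversity), the weight alpha, and the coverage penalty are illustrative assumptions, not the authors' method, and the noise criterion from the paper is omitted.

    import numpy as np

    def uncertainty(probs):
        # Predictive entropy of the matcher's label distribution (higher = less certain).
        return -np.sum(probs * np.log(probs + 1e-12), axis=1)

    def diversity(cand_emb, labeled_emb):
        # Distance from each candidate to its nearest labeled instance (higher = more novel).
        if labeled_emb is None or len(labeled_emb) == 0:
            return np.ones(len(cand_emb))
        d = np.linalg.norm(cand_emb[:, None, :] - labeled_emb[None, :, :], axis=-1)
        return d.min(axis=1)

    def select_batch(cand_emb, probs, labeled_emb, k, alpha=0.5):
        # Greedily pick k candidates, mixing uncertainty with PLM-embedding diversity;
        # after each pick, down-weight near-duplicates so the batch covers the space.
        score = alpha * uncertainty(probs) + (1.0 - alpha) * diversity(cand_emb, labeled_emb)
        picked = []
        for _ in range(k):
            i = int(np.argmax(score))
            picked.append(i)
            score[i] = -np.inf
            score -= 0.1 * np.exp(-np.linalg.norm(cand_emb - cand_emb[i], axis=1))
        return picked

    # Toy usage with random stand-ins for PLM sentence-pair embeddings and model outputs.
    rng = np.random.default_rng(0)
    cand_emb = rng.normal(size=(100, 16))   # e.g. BERT [CLS] vectors for unlabeled pairs
    probs = rng.dirichlet(np.ones(2), 100)  # matcher's softmax outputs (match / no match)
    labeled = rng.normal(size=(10, 16))     # embeddings of already-annotated pairs
    print(select_batch(cand_emb, probs, labeled, k=5))

In this sketch, the pre-trained language model contributes only through the embedding space used for the diversity and coverage terms, while uncertainty still comes from the task model; the relative weighting is a tunable assumption.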
Pages: 19
Related papers
50 records in total
  • [1] Cross-sentence Pre-trained model for Interactive QA matching
    Wu, Jinmeng
    Hao, Yanbin
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 5417 - 5424
  • [2] SiBert: Enhanced Chinese Pre-trained Language Model with Sentence Insertion
    Chen, Jiahao
    Cao, Chenjie
    Jiang, Xiuyan
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 2405 - 2412
  • [3] Interpretability of Entity Matching Based on Pre-trained Language Model
    Liang Z.
    Wang H.-Z.
    Dai J.-J.
    Shao X.-Y.
    Ding X.-O.
    Mu T.-Y.
    Ruan Jian Xue Bao/Journal of Software, 2023, 34 (03): 1087 - 1108
  • [4] Talent Supply and Demand Matching Based on Prompt Learning and the Pre-Trained Language Model
    Li, Kunping
    Liu, Jianhua
    Zhuang, Cunbo
    Applied Sciences (Switzerland), 2025, 15 (05)
  • [5] On the Sentence Embeddings from Pre-trained Language Models
    Li, Bohan
    Zhou, Hao
    He, Junxian
    Wang, Mingxuan
    Yang, Yiming
    Li, Lei
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 9119 - 9130
  • [6] Active Learning on Pre-trained Language Model with Task-Independent Triplet Loss
    Seo, Seungmin
    Kim, Donghyun
    Ahn, Youbin
    Lee, Kyong-Ho
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 11276 - 11284
  • [7] Vision Enhanced Generative Pre-trained Language Model for Multimodal Sentence Summarization
    Jing, Liqiang
    Li, Yiren
    Xu, Junhao
    Yu, Yongcan
    Shen, Pei
    Song, Xuemeng
    MACHINE INTELLIGENCE RESEARCH, 2023, 20 (02) : 289 - 298
  • [8] Schema matching based on energy domain pre-trained language model
    Pan Z.
    Yang M.
    Monti A.
    Energy Informatics, 2023, 6 (Suppl 1)
  • [9] Learning and Evaluating a Differentially Private Pre-trained Language Model
    Hoory, Shlomo
    Feder, Amir
    Tendler, Avichai
    Cohen, Alon
    Erell, Sofia
    Laish, Itay
    Nakhost, Hootan
    Stemmer, Uri
    Benjamini, Ayelet
    Hassidim, Avinatan
    Matias, Yossi
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1178 - 1189
  • [10] Hyperbolic Pre-Trained Language Model
    Chen, Weize
    Han, Xu
    Lin, Yankai
    He, Kaichen
    Xie, Ruobing
    Zhou, Jie
    Liu, Zhiyuan
    Sun, Maosong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 3101 - 3112