CORAL: Collaborative Automatic Labeling System based on Large Language Models

Cited by: 0

Authors:

Zhu, Zhen [1]

Wang, Yibo [1]

Yang, Shouqing [1]

Long, Lin [1]

Wu, Runze [2]

Tang, Xiu [1]

Zhao, Junbo [1]

Wang, Haobo [1]

Affiliations:

[1] Zhejiang University, Hangzhou, People's Republic of China

[2] NetEase Fuxi AI Lab, Hangzhou, People's Republic of China

Source:

Proceedings of the VLDB Endowment | 2024 / Vol. 17 / Issue 12

DOI:

10.14778/3685800.3685885

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

In the era of big data, data annotation is integral to numerous applications. However, it is widely acknowledged as a laborious and time-consuming process that significantly impedes the scalability and efficiency of data-driven applications. To reduce the human cost, we demonstrate CORAL, a collaborative automatic labeling system driven by large language models (LLMs), which achieves high-quality annotation with minimal human effort. First, CORAL employs an LLM to automatically annotate vast datasets, generating coarse-grained labels. Subsequently, a weakly-supervised learning module trains small language models (SLMs) using noisy label learning techniques to distill accurate labels from the LLM's annotations. It also supports statistical analysis of model outcomes to identify potentially erroneous labels, reducing the human cost of error detection. Furthermore, CORAL supports iterative refinement by LLMs and SLMs using manually corrected labels, thereby ensuring continual enhancement in annotation quality and model performance. A visual interface enables monitoring of the annotation process and analysis of results.
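The error-detection step described in the abstract can be illustrated with a small sketch. The abstract does not specify which statistic CORAL uses, so the rule below is an assumption chosen for illustration: a sample is flagged for human review when the SLM assigns low probability to the label the LLM gave it. The function name `flag_suspect_labels`, the `threshold` parameter, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def flag_suspect_labels(
    llm_labels: List[str],
    slm_probs: List[Dict[str, float]],
    threshold: float = 0.5,
) -> List[int]:
    """Return indices of samples whose LLM label the SLM finds unlikely.

    Assumption (not from the paper): a label is "suspect" when the SLM's
    predicted probability for that label falls below `threshold`.
    Indices are returned with the least plausible labels first.
    """
    if len(llm_labels) != len(slm_probs):
        raise ValueError("llm_labels and slm_probs must have the same length")

    scored = []
    for index, (label, probs) in enumerate(zip(llm_labels, slm_probs)):
        # Probability the SLM assigns to the LLM's label (0.0 if absent).
        agreement = probs.get(label, 0.0)
        if agreement < threshold:
            scored.append((agreement, index))

    scored.sort()  # lowest agreement first, so reviewers see the worst cases first
    return [index for _, index in scored]


# Toy example: four samples labeled by an LLM, with SLM class probabilities.
llm_labels = ["positive", "negative", "positive", "negative"]
slm_probs = [
    {"positive": 0.92, "negative": 0.08},
    {"positive": 0.70, "negative": 0.30},
    {"positive": 0.55, "negative": 0.45},
    {"positive": 0.90, "negative": 0.10},
]

suspects = flag_suspect_labels(llm_labels, slm_probs)
print(suspects)  # [3, 1]
```

Ranking the flagged samples by disagreement is one way to direct limited human effort at the labels most likely to be wrong; the corrected labels could then feed the iterative refinement loop that the abstract mentions.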

Pages: 4401-4404

Page count: 4
