CORAL: Collaborative Automatic Labeling System based on Large Language Models

Cited by: 0
Authors
Zhu, Zhen [1]
Wang, Yibo [1]
Yang, Shouqing [1]
Long, Lin [1]
Wu, Runze [2]
Tang, Xiu [1]
Zhao, Junbo [1]
Wang, Haobo [1]
Affiliations
[1] Zhejiang Univ, Hangzhou, Peoples R China
[2] NetEase Fuxi AI Lab, Hangzhou, Peoples R China
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2024 / Vol. 17 / No. 12
DOI
10.14778/3685800.3685885
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In the era of big data, data annotation is integral to numerous applications. However, it is widely acknowledged as a laborious and time-consuming process that significantly impedes the scalability and efficiency of data-driven applications. To reduce the human cost, we demonstrate CORAL, a collaborative automatic labeling system driven by large language models (LLMs), which achieves high-quality annotation with minimal human effort. First, CORAL employs an LLM to automatically annotate vast datasets, generating coarse-grained labels. Subsequently, a weakly supervised learning module trains small language models (SLMs) using noisy label learning techniques to distill accurate labels from the LLM's annotations. The module also supports statistical analysis of model outcomes to identify potentially erroneous labels, reducing the human cost of error detection. Furthermore, CORAL supports iterative refinement by LLMs and SLMs using manually corrected labels, thereby ensuring continual improvement in annotation quality and model performance. A visual interface enables monitoring of the annotation process and analysis of the results.
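The abstract says only that statistical analysis of model outcomes is used to identify potentially erroneous labels; it does not specify the method. As a minimal illustrative sketch, assuming one simple confidence-based rule (not CORAL's actual implementation), a label assigned by the LLM could be flagged for human review when the trained SLM gives it a low probability. The function name `flag_suspect_labels`, the `threshold` parameter, and the toy data are all hypothetical.

```python
from typing import List, Sequence


def flag_suspect_labels(
    llm_labels: Sequence[int],
    slm_probs: Sequence[Sequence[float]],
    threshold: float = 0.5,  # assumed cutoff, chosen only for illustration
) -> List[int]:
    """Return indices of samples whose LLM label the SLM finds unlikely."""
    if len(llm_labels) != len(slm_probs):
        raise ValueError("llm_labels and slm_probs must have the same length")
    suspects = []
    for i, (label, probs) in enumerate(zip(llm_labels, slm_probs)):
        # Probability the SLM assigns to the class the LLM chose.
        if probs[label] < threshold:
            suspects.append(i)
    return suspects


# Toy data: four samples, two classes (hypothetical values).
llm_labels = [0, 1, 1, 0]
slm_probs = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.7, 0.3],
    [0.4, 0.6],
]
suspects = flag_suspect_labels(llm_labels, slm_probs)
print(suspects)  # [2, 3]
```

Under this assumed rule, only the flagged samples would be routed to a human annotator, which is one way the error-detection cost described in the abstract could be reduced.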
Pages: 4401-4404
Page count: 4