Zero-Shot Learners for Natural Language Understanding via a Unified Multiple-Choice Perspective

Cited by: 1
Authors
Wang, Junjie [1 ]
Yang, Ping [2 ]
Gan, Ruyi [2 ]
Zhang, Yuxiang [1 ]
Zhang, Jiaxing [2 ]
Sakai, Tetsuya [1 ]
Affiliations
[1] Waseda Univ, Shinjuku Ku, Tokyo 1698555, Japan
[2] Int Digital Econ Acad IDEA, Futian 518045, Shenzhen, Peoples R China
Keywords
Multitasking; multi-task learning; natural language understanding; zero-shot learning; knowledge
DOI
10.1109/ACCESS.2023.3343123
Chinese Library Classification
TP [automation technology; computer technology]
Discipline Code
0812
Abstract
Zero-shot learning is an approach in which models generalize to unseen tasks without direct training on them. We introduce the Unified Multiple-Choice (UniMC) framework, which is format-independent and applicable to tasks such as text classification and sentiment analysis. Furthermore, we design a two-stage tuning method: we first train on multiple-choice formats to develop format-agnostic capabilities, and then make direct predictions on unseen tasks for zero-shot learning. Our methodology avoids issues found in large-scale models such as FLAN, improving generalization while reducing parameter count. In experiments, UniMC achieves state-of-the-art (SOTA) performance on both out-of-domain and in-domain benchmarks with only 235M parameters, far fewer than previous methods. Moreover, the UniMC-Chinese model surpasses human performance on benchmarks such as EPRSTMT and CHID-FC, underscoring its generalization capacity across languages. Additionally, ablation experiments demonstrate the effectiveness of our design. The code and model weights are available at https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/main/fengshen/examples/unimc.
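The format unification the abstract describes can be illustrated with a minimal sketch. This is not the authors' code: the function name, template wording, and option lettering are hypothetical, showing only how a labeled classification instance might be recast as a single multiple-choice prompt.

```python
# Illustrative sketch (hypothetical, not the UniMC implementation):
# recast a text-classification example as a multiple-choice prompt,
# so tasks with different label sets share one input format.

def to_multiple_choice(text: str, question: str, options: list[str]) -> str:
    """Render a classification instance as one multiple-choice prompt."""
    letters = "ABCDEFGH"
    option_lines = [f"({letters[i]}) {opt}" for i, opt in enumerate(options)]
    return f"{question}\n" + "\n".join(option_lines) + f"\nPassage: {text}"

prompt = to_multiple_choice(
    "The film was a delight from start to finish.",
    "What is the sentiment of the passage?",
    ["positive", "negative"],
)
print(prompt)
```

Under this kind of unification, a sentiment task, a topic-classification task, and a natural-language-inference task all reduce to choosing one option, which is what lets a single model be tuned once and then applied to unseen tasks.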
Pages: 142829-142845 (17 pages)
Related Papers (50 in total; showing items 41-50)
  • [41] Kaufmann, B.; Busby, D.; Das, C. K.; Tillu, N.; Menon, M.; Tewari, A. K. T.; Gorin, M. A. Validation of a zero-shot learning natural language processing tool to facilitate data abstraction for urologic research. EUROPEAN UROLOGY, 2024, 85: S949-S950.
  • [42] Liu, Rui; Lin, Zheng; Fu, Peng; Liu, Yuanxin; Wang, Weiping. Connecting Targets via Latent Topics and Contrastive Learning: A Unified Framework for Robust Zero-Shot and Few-Shot Stance Detection. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 7812-7816.
  • [43] Baldwin, Peter; Mee, Janet; Yaneva, Victoria; Paniagua, Miguel; D'Angelo, Jean; Swygert, Kimberly; Clauser, Brian E. A Natural-Language-Processing-Based Procedure for Generating Distractors for Multiple-Choice Questions. EVALUATION & THE HEALTH PROFESSIONS, 2022, 45(4): 327-340.
  • [44] Deng, Yinlin; Xia, Chunqiu Steven; Peng, Haoran; Yang, Chenyuan; Zhang, Lingming. Large Language Models Are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models. PROCEEDINGS OF THE 32ND ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON SOFTWARE TESTING AND ANALYSIS (ISSTA 2023), 2023: 423-435.
  • [45] van der Goot, Rob; Sharaf, Ibrahim; Imankulova, Aizhan; Ustun, Ahmet; Stepanovic, Marija; Ramponi, Alan; Khairunnisa, Siti Oryza; Komachi, Mamoru; Plank, Barbara. From Masked Language Modeling to Translation: Non-English Auxiliary Tasks Improve Zero-shot Spoken Language Understanding. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021: 2479-2497.
  • [46] Xue, Weiying; Liu, Qi; Wang, Yuxiao; Wei, Zhenao; Xing, Xiaofen; Xu, Xiangmin. Towards zero-shot human-object interaction detection via vision-language integration. NEURAL NETWORKS, 2025, 187.
  • [47] Lan, Yunshi; Li, Xiang; Liu, Xin; Li, Yang; Qin, Wei; Qian, Weining. Improving Zero-shot Visual Question Answering via Large Language Models with Reasoning Question Prompts. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM 2023), 2023: 4389-4400.
  • [48] Li, Lin; Xiao, Jun; Chen, Guikun; Shao, Jian; Zhuang, Yueting; Chen, Long. Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023.
  • [49] Wang, Z.; Chen, Y.; Xie, L.; Tian, Q.; Wang, Y. LM-VC: Zero-Shot Voice Conversion via Speech Generation Based on Language Models. IEEE SIGNAL PROCESSING LETTERS, 2023, 30: 1157-1161.
  • [50] Kumar, Niraj; Baghel, Bhiman Kumar. Intent Focused Semantic Parsing and Zero-Shot Learning for Out-of-Domain Detection in Spoken Language Understanding. IEEE ACCESS, 2021, 9: 165786-165794.