From Characters to Words: Hierarchical Pre-trained Language Model for Open-vocabulary Language Understanding

Cited by: 0
Authors:
Sun, Li [1]
Luisier, Florian [2]
Batmanghelich, Kayhan [1]
Florencio, Dinei [2]
Zhang, Cha [2]
Affiliations:
[1] Boston Univ, Boston, MA 02215 USA
[2] Microsoft, Redmond, WA USA
Keywords:
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract:
Current state-of-the-art models for natural language understanding require a preprocessing step to convert raw text into discrete tokens. This process, known as tokenization, relies on a pre-built vocabulary of words or sub-word morphemes. Such a fixed vocabulary limits the model's robustness to spelling errors and its capacity to adapt to new domains. In this work, we introduce a novel open-vocabulary language model that adopts a hierarchical two-level approach: one at the word level and another at the sequence level. Concretely, we design an intra-word module that uses a shallow Transformer architecture to learn word representations from their characters, and a deep inter-word Transformer module that contextualizes each word representation by attending to the entire word sequence. Our model thus operates directly on character sequences with explicit awareness of word boundaries, but without a biased sub-word or word-level vocabulary. Experiments on various downstream tasks show that our method outperforms strong baselines. We also demonstrate that our hierarchical model is robust to textual corruption and domain shift.
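The abstract describes a two-level architecture: a shallow intra-word Transformer that composes each word's representation from its characters, and a deep inter-word Transformer that contextualizes those word vectors across the whole sequence. Below is a minimal PyTorch sketch of that idea; the layer counts, hidden size, byte-level character vocabulary, and mean-pooling over characters are illustrative assumptions, not the authors' published configuration.

```python
# Minimal sketch of a hierarchical character-to-word encoder (assumptions:
# 256-symbol byte vocabulary, 2 intra-word / 12 inter-word layers, mean
# pooling; the paper's actual hyperparameters may differ).
import torch
import torch.nn as nn

class HierarchicalCharWordEncoder(nn.Module):
    def __init__(self, n_chars=256, d_model=256, max_word_len=20,
                 intra_layers=2, inter_layers=12, n_heads=8):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, d_model, padding_idx=0)
        self.char_pos = nn.Embedding(max_word_len, d_model)
        intra = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        inter = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        # Shallow intra-word encoder: builds one vector per word from chars.
        self.intra = nn.TransformerEncoder(intra, intra_layers)
        # Deep inter-word encoder: contextualizes words across the sequence.
        self.inter = nn.TransformerEncoder(inter, inter_layers)

    def forward(self, char_ids):
        # char_ids: (batch, n_words, max_word_len); 0 marks character padding.
        b, w, c = char_ids.shape
        flat = char_ids.view(b * w, c)
        pos = torch.arange(c, device=char_ids.device)
        x = self.char_emb(flat) + self.char_pos(pos)
        pad = flat.eq(0)  # True where a character position is padding
        h = self.intra(x, src_key_padding_mask=pad)
        # Mean-pool over non-pad characters to get one embedding per word
        # (an assumption; a learned word-marker token would also work).
        keep = (~pad).unsqueeze(-1).float()
        words = (h * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)
        words = words.view(b, w, -1)
        return self.inter(words)  # (batch, n_words, d_model)

model = HierarchicalCharWordEncoder()
chars = torch.randint(1, 256, (2, 16, 20))  # 2 sentences, 16 words each
print(model(chars).shape)  # torch.Size([2, 16, 256])
```

Because the character vocabulary is closed (e.g., bytes) while words are composed on the fly, a model of this shape has no fixed word list to go out of vocabulary, which is what makes the open-vocabulary and misspelling-robustness claims plausible.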
Pages: 3605-3620
Page count: 16