Attention-based Bidirectional Long Short-Term Memory Networks for Relation Classification Using Knowledge Distillation from BERT

Cited by: 15
Authors
Wang, Zihan [1]
Yang, Bo [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
relation classification; natural language processing; knowledge distillation;
DOI
10.1109/DASC-PICom-CBDCom-CyberSciTech49142.2020.00100
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Relation classification is an important task in natural language processing. Today the best-performing models often use huge, transformer-based neural architectures such as BERT and XLNet, with hundreds of millions of network parameters. These large neural networks have led to the belief that the shallow neural networks of the previous generation are obsolete for relation classification. However, because of their large size and low inference speed, these models may be impractical in online real-time systems or resource-restricted systems. To address this issue, we try to accelerate these well-performing language models by compressing them. Specifically, we distill knowledge for relation classification from a huge, transformer-based language model, BERT, into an Attention-Based Bidirectional Long Short-Term Memory Network. We evaluate our model on the SemEval-2010 relation classification task. According to the experimental results, the performance of our model exceeds that of other LSTM-based methods and almost catches up with that of BERT. In terms of efficiency, our model has 157 times fewer network parameters than BERT and, as a result, uses about 229 times less inference time.
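The abstract does not state the exact training objective, so the following is only a minimal sketch of one common way to distill a teacher classifier into a smaller student: a temperature-softened KL term against the teacher's logits, mixed with ordinary cross-entropy on the gold label. The function names and the `temperature` and `alpha` parameters are illustrative assumptions, not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, gold_index,
                      temperature=2.0, alpha=0.5):
    """Hypothetical distillation objective for a single example.

    Assumption: a generic soft-target formulation, not necessarily
    the loss used in the paper.
    """
    # Soft term: KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 so its gradient magnitude stays comparable across T.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(pt * (math.log(pt) - math.log(ps))
               for pt, ps in zip(p_teacher, p_student) if pt > 0.0)
    soft *= temperature ** 2

    # Hard term: cross-entropy of the student against the gold relation label.
    hard = -math.log(softmax(student_logits)[gold_index])

    return alpha * soft + (1.0 - alpha) * hard
```

In this sketch the teacher logits would come from a fine-tuned BERT classifier and the student logits from the attention-based BiLSTM; `alpha` trades off imitation of the teacher against fitting the labeled data.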
Pages: 562-568
Number of pages: 7
Related papers
50 in total
  • [1] Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification
    Zhou, Peng
    Shi, Wei
    Tian, Jun
    Qi, Zhenyu
    Li, Bingchen
    Hao, Hongwei
    Xu, Bo
    [J]. PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2016), VOL 2, 2016, : 207 - 212
  • [2] Hyperspectral Image Classification Using Attention-Based Bidirectional Long Short-Term Memory Network
    Mei, Shaohui
    Li, Xingang
    Liu, Xiao
    Cai, Huimin
    Du, Qian
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [3] Modeling citation worthiness by using attention-based bidirectional long short-term memory networks and interpretable models
    Tong Zeng
    Daniel E. Acuna
    [J]. Scientometrics, 2020, 124 : 399 - 428
  • [4] Modeling citation worthiness by using attention-based bidirectional long short-term memory networks and interpretable models
    Zeng, Tong
    Acuna, Daniel E.
    [J]. SCIENTOMETRICS, 2020, 124 (01) : 399 - 428
  • [5] Relation extraction in Chinese using attention-based bidirectional long short-term networks
    Zhang, Yanzi
    [J]. PEERJ COMPUTER SCIENCE, 2023, 9
  • [6] Classification of causes of speech recognition errors using attention-based bidirectional long short-term memory and modulation spectrum
    Santoso, Jennifer
    Yamada, Takeshi
    Makino, Shoji
    [J]. 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 302 - 306
  • [7] Classification of causes of speech recognition errors using attention-based bidirectional long short-term memory and modulation spectrum
    Santoso, Jennifer
    Yamada, Takeshi
    Makino, Shoji
    [J]. 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019, 2019, : 302 - 306
  • [8] Biomedical word sense disambiguation with bidirectional long short-term memory and attention-based neural networks
    Canlin Zhang
    Daniel Biś
    Xiuwen Liu
    Zhe He
    [J]. BMC Bioinformatics, 20
  • [9] Biomedical word sense disambiguation with bidirectional long short-term memory and attention-based neural networks
    Zhang, Canlin
    Bis, Daniel
    Liu, Xiuwen
    He, Zhe
    [J]. BMC BIOINFORMATICS, 2019, 20 (Suppl 16)
  • [10] Attention-based bidirectional long short-term memory networks for extracting temporal relationships from clinical discharge summaries
    Alfattni, Ghada
    Peek, Niels
    Nenadic, Goran
    [J]. JOURNAL OF BIOMEDICAL INFORMATICS, 2021, 123