Tree Transformer: Integrating Tree Structures into Self-Attention

Cited by: 0
Authors: Wang, Yau-Shian [1]; Lee, Hung-Yi [1]; Chen, Yun-Nung [1]
Affiliations: [1] National Taiwan University, Taipei, Taiwan
Keywords: (none listed)
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
Pre-training a Transformer on large-scale raw text and fine-tuning it on the desired task have achieved state-of-the-art results on diverse NLP tasks. However, it is unclear what the learned attention captures: the attention computed by attention heads does not appear to match human intuitions about hierarchical structure. This paper proposes Tree Transformer, which adds an extra constraint to the attention heads of the bidirectional Transformer encoder in order to encourage the attention heads to follow tree structures. The tree structures can be automatically induced from raw text by our proposed "Constituent Attention" module, which is implemented simply as self-attention between two adjacent words. With a training procedure identical to BERT's, the experiments demonstrate the effectiveness of Tree Transformer in inducing tree structures, improving language modeling, and learning more explainable attention scores.
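The mechanism the abstract describes can be made concrete with a minimal PyTorch sketch. It assumes the probabilities that adjacent words belong to the same constituent have already been computed (in the paper, via self-attention between neighboring words); the function names constituent_prior and tree_attention, the tensor shapes, and the renormalization step are illustrative assumptions, not the authors' reference implementation.

import torch
import torch.nn.functional as F

def constituent_prior(link_probs):
    # link_probs: (batch, seq_len - 1); link_probs[:, i] is the probability
    # that adjacent words i and i+1 belong to the same constituent, obtained
    # from self-attention between neighboring words (assumed precomputed here).
    log_a = torch.log(link_probs.clamp_min(1e-9))
    cum = F.pad(log_a.cumsum(dim=-1), (1, 0))  # (batch, seq_len), cum[:, 0] = 0
    # Constituent prior: C[b, i, j] = prod of link probabilities between
    # positions i and j, i.e. exp(-|cum[b, i] - cum[b, j]|), so the prior
    # decays toward zero across constituent boundaries.
    return torch.exp(-(cum.unsqueeze(-1) - cum.unsqueeze(-2)).abs())

def tree_attention(q, k, v, link_probs):
    # Standard scaled dot-product attention whose probabilities are
    # multiplied elementwise by the constituent prior and renormalized,
    # so each word mostly attends within its own constituent.
    d = q.size(-1)
    probs = F.softmax(q @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
    probs = probs * constituent_prior(link_probs)
    probs = probs / probs.sum(dim=-1, keepdim=True).clamp_min(1e-9)
    return probs @ v

# Toy usage with random tensors:
q = k = v = torch.randn(2, 5, 8)
links = torch.sigmoid(torch.randn(2, 4))
out = tree_attention(q, k, v, links)  # (2, 5, 8)

In the paper, the link probabilities are additionally constrained across layers so that they are non-decreasing with depth, which makes the induced constituents grow from lower to higher layers and yields a tree; the sketch above only shows a single layer.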
Pages: 1061-1070 (10 pages)