Layer-Wise Representation Fusion for Compositional Generalization

Cited: 0
Authors
Zheng, Yafang [1 ,2 ]
Lin, Lei [1 ,2 ,3 ]
Li, Shuangtao [1 ,2 ]
Yuan, Yuxuan [1 ,2 ]
Lai, Zhaohong [1 ,2 ]
Liu, Shan [1 ,2 ]
Fu, Biao [1 ,2 ]
Chen, Yidong [1 ,2 ]
Shi, Xiaodong [1 ,2 ]
Affiliations
[1] Xiamen Univ, Sch Informat, Dept Artificial Intelligence, Xiamen, Peoples R China
[2] Xiamen Univ, Key Lab Digital Protect & Intelligent Proc Intang, Minist Culture & Tourism, Xiamen, Peoples R China
[3] Kuaishou Technol, Beijing, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Existing neural models have been shown to struggle with compositional generalization (CG), i.e., the ability to systematically generalize to unseen compositions of seen components. A key reason for this failure is that the syntactic and semantic representations of sequences in the uppermost layers of both the encoder and decoder are entangled. However, previous work has concentrated on separating the learning of syntax and semantics rather than investigating the causes of this representation entanglement (RE) problem in order to solve it. We explain why RE arises by analyzing how representations evolve from the bottom to the top of the Transformer layers. We find that the "shallow" residual connections within each layer fail to fuse previous layers' information effectively, leading to information forgetting between layers and, in turn, to RE. Inspired by this, we propose LRF, a novel Layer-wise Representation Fusion framework for CG, which learns to fuse previous layers' information back into the encoding and decoding process by introducing a fuse-attention module at each encoder and decoder layer. LRF achieves promising results on two realistic benchmarks, empirically demonstrating the effectiveness of our proposal. Code is available at https://github.com/thinkaboutzero/LRF.
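The abstract's central mechanism, the fuse-attention module, can be pictured concretely. Below is a minimal PyTorch sketch under stated assumptions, not the authors' released implementation (see the repository above for that): each position in the current layer's output queries that position's representations from all earlier layers, and the attended summary is merged back through a residual connection. The class name FuseAttention, the hyperparameters, and the residual merge are illustrative assumptions.

import torch
import torch.nn as nn
from typing import List

class FuseAttention(nn.Module):
    """Hypothetical sketch of a fuse-attention module: each position
    attends over its own representations from all previous layers."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor, prev_layers: List[torch.Tensor]) -> torch.Tensor:
        # x: (batch, seq_len, d_model), the current layer's output.
        # prev_layers: outputs of earlier layers, each (batch, seq_len, d_model).
        b, t, d = x.shape
        # Per-position layer history as attention memory: (batch*seq_len, n_prev, d_model).
        mem = torch.stack(prev_layers, dim=2).reshape(b * t, len(prev_layers), d)
        q = x.reshape(b * t, 1, d)          # each position queries its own history
        fused, _ = self.attn(q, mem, mem)   # fuse previous layers' information
        return x + fused.reshape(b, t, d)   # residual merge with the current state

# Example usage: fuse three earlier layers into the current one.
# fuse = FuseAttention(d_model=512)
# x = torch.randn(2, 10, 512)
# history = [torch.randn(2, 10, 512) for _ in range(3)]
# out = fuse(x, history)  # (2, 10, 512)

In this reading, the per-position attention over the layer history replaces the "shallow" residual connection the abstract criticizes, letting each layer re-weight all earlier representations rather than only adding the immediately preceding one.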
Pages: 19706-19714
Number of Pages: 9