Cross-Lingual Transfer Learning for Statistical Type Inference

Cited: 0
Authors
Li, Zhiming [1 ]
Xie, Xiaofei [2 ]
Li, Haoliang [3 ]
Xu, Zhengzi [1 ]
Li, Yi [1 ]
Liu, Yang [1 ]
Affiliations
[1] Nanyang Technol Univ, Singapore, Singapore
[2] Singapore Management Univ, Singapore, Singapore
[3] City Univ Hong Kong, Hong Kong, Peoples R China
Funding
National Research Foundation, Singapore
Keywords
Deep Learning; Transfer Learning; Type Inference;
DOI
10.1145/3533767.3534411
Chinese Library Classification
TP31 [Computer Software]
Discipline Codes
081202; 0835
Abstract
Existing statistical type inference systems rely entirely on supervised learning approaches, which require laborious manual effort to collect and label large amounts of data. Most Turing-complete imperative languages share similar control- and data-flow structures, which makes it possible to transfer knowledge learned from one language to another. In this paper, we propose a cross-lingual transfer learning framework, Plato, for statistical type inference, which allows us to leverage prior knowledge learned from the labeled dataset of one language and transfer it to others, e.g., Python to JavaScript, Java to JavaScript, etc. Plato is powered by a novel kernelized attention mechanism that constrains the attention scope of the backbone Transformer model, forcing the model to base its predictions on features commonly shared among languages. In addition, we propose a syntax enhancement that augments learning on the feature overlap among language domains. Furthermore, Plato can also improve the performance of conventional supervised type inference by introducing cross-lingual augmentation, which enables the model to learn more general features across multiple languages. We evaluated Plato under two settings: 1) in the cross-domain scenario, where the target-language data is unlabeled or only partially labeled, Plato outperforms state-of-the-art domain transfer techniques by a large margin, e.g., improving the Python-to-TypeScript baseline by +14.6%@EM and +18.6%@weighted-F1; and 2) in the conventional monolingual supervised scenario, Plato improves the Python baseline by +4.10%@EM and +1.90%@weighted-F1 with the introduction of cross-lingual augmentation.
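To make the kernelized-attention idea concrete, here is a minimal sketch of attention whose scores are modulated by a kernel over auxiliary per-token features. This is an illustrative stand-in for the mechanism the abstract describes, not Plato's actual implementation: the Gaussian kernel, the `kernelized_attention` function, and its arguments are all hypothetical, chosen only to show how a kernel can suppress attention between tokens whose (e.g., syntactic) features are not shared across languages.

```python
import numpy as np

def kernelized_attention(Q, K, V, token_features, sigma=1.0):
    """Scaled dot-product attention whose score matrix is multiplied by a
    Gaussian kernel over pairwise distances of auxiliary token features,
    narrowing the attention scope to tokens with similar (shared) features."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                      # (n, n) raw attention scores
    # Gaussian kernel on pairwise feature distance: tokens with
    # near-identical features keep full weight, dissimilar ones are damped.
    diff = token_features[:, None, :] - token_features[None, :, :]
    kernel = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * sigma ** 2))
    # Stable softmax, then elementwise kernel modulation and renormalization.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True)) * kernel
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                                 # (n, d) attended values
```

Because the kernel multiplies (rather than replaces) the learned attention weights, the backbone Transformer still learns freely within the constrained scope; only cross-feature attention is suppressed.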
Pages: 239-250
Page count: 12