50 entries in total
- [1] Dynamic Knowledge Distillation for Pre-trained Language Models [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 379 - 389
- [2] Knowledge-Grounded Dialogue Generation with Pre-trained Language Models [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 3377 - 3390
- [3] ReAugKD: Retrieval-Augmented Knowledge Distillation For Pre-trained Language Models [J]. 61ST CONFERENCE OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023, : 1128 - 1136
- [4] Knowledge Inheritance for Pre-trained Language Models [J]. NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 3921 - 3937
- [5] PANLP at MEDIQA 2019: Pre-trained Language Models, Transfer Learning and Knowledge Distillation [J]. SIGBIOMED WORKSHOP ON BIOMEDICAL NATURAL LANGUAGE PROCESSING (BIONLP 2019), 2019, : 380 - 388
- [6] MERGEDISTILL: Merging Pre-trained Language Models using Distillation [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 2874 - 2887
- [7] Probing Pre-Trained Language Models for Disease Knowledge [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 3023 - 3033
- [9] "You are grounded!": Latent Name Artifacts in Pre-trained Language Models [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 6850 - 6861
- [10] KroneckerBERT: Significant Compression of Pre-trained Language Models Through Kronecker Decomposition and Knowledge Distillation [J]. NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 2116 - 2127