- [2] LEWIS M, LIU Y, GOYAL N, et al. BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension[J]. CoRR, 2019.
- [3] BENGIO Y, DUCHARME R, VINCENT P, et al. A neural probabilistic language model[J]. Journal of Machine Learning Research, 2003.
- [4] RADFORD A, WU J, CHILD R, et al. Language models are unsupervised multitask learners[EB/OL]. https://cdn.openai.com/better-language-models/languagemodelsareunsupervisedmultitasklearners.pdf, 2022.
- [5] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[EB/OL]. https://aclanthology.org/N19-1423.pdf, 2022.
- [6] YANG Z L, DAI Z H, YANG Y M, et al. XLNet: generalized autoregressive pretraining for language understanding[EB/OL]. https://arxiv.org/abs/1906.08237, 2022.