STYLEDGPT: Stylized Response Generation with Pre-trained Language Models

Cited by: 0
Authors
Yang, Ze [1 ]
Wu, Wei [2 ]
Xu, Can [3 ]
Liang, Xinnian [1 ]
Bai, Jiaqi [1 ]
Wang, Liran [1 ]
Wang, Wei [4 ]
Li, Zhoujun [1 ]
Affiliations
[1] Beihang Univ, State Key Lab Software Dev Environm, Beijing, Peoples R China
[2] Meituan, Beijing, Peoples R China
[3] Microsoft, Beijing, Peoples R China
[4] China Resources Grp, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
N/A
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generating responses that follow a desired style has great potential to extend the applications of open-domain dialogue systems, yet progress is hindered by the lack of parallel data for training. In this work, we explore this challenging task with pre-trained language models, which have brought breakthroughs to various natural language tasks. To this end, we introduce a KL loss and a style classifier into the fine-tuning step in order to steer response generation towards the target style at both the word level and the sentence level. Comprehensive empirical studies on two public datasets indicate that our model significantly outperforms state-of-the-art methods in terms of both style consistency and contextual coherence.
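The abstract describes the objective only at a high level; the PyTorch sketch below shows one way a word-level KL term against a style language model and a sentence-level style-classifier term could be added to the usual maximum-likelihood fine-tuning loss. The function name, tensor shapes, loss weights, and the soft-embedding trick for the classifier are illustrative assumptions, not the authors' released implementation.

# Minimal sketch, assuming the dialogue model and a frozen style LM share a vocabulary.
import torch
import torch.nn.functional as F

def stylized_finetune_loss(resp_logits, style_lm_logits, target_ids,
                           style_clf, token_embedding,
                           kl_weight=0.1, style_weight=0.1):
    """Combine MLE, a word-level KL term, and a sentence-level style loss.

    resp_logits:      (B, T, V) next-token logits from the dialogue model being fine-tuned
    style_lm_logits:  (B, T, V) logits from a frozen LM trained on stylized text
    target_ids:       (B, T)    gold response token ids
    style_clf:        module mapping a (B, H) sentence vector to a style logit
    token_embedding:  (V, H)    embedding table used to build a differentiable sentence vector
    """
    B, T, V = resp_logits.shape

    # 1) Standard maximum-likelihood loss on the gold response.
    mle = F.cross_entropy(resp_logits.reshape(-1, V), target_ids.reshape(-1))

    # 2) Word-level signal: KL divergence between the response model's next-token
    #    distribution and the style language model's distribution.
    log_p_resp = F.log_softmax(resp_logits, dim=-1)
    p_style = F.softmax(style_lm_logits, dim=-1)
    word_kl = F.kl_div(log_p_resp, p_style, reduction="batchmean")

    # 3) Sentence-level signal: feed the expected token embeddings (softmax-weighted,
    #    so the operation stays differentiable) to a style classifier and push its
    #    prediction toward the target style (label 1).
    soft_embeds = F.softmax(resp_logits, dim=-1) @ token_embedding      # (B, T, H)
    style_logit = style_clf(soft_embeds.mean(dim=1)).squeeze(-1)        # (B,)
    style_loss = F.binary_cross_entropy_with_logits(
        style_logit, torch.ones(B, device=resp_logits.device))

    return mle + kl_weight * word_kl + style_weight * style_loss

# Toy usage with random tensors (shapes only, no real pre-trained model involved):
B, T, V, H = 2, 8, 100, 16
loss = stylized_finetune_loss(
    resp_logits=torch.randn(B, T, V, requires_grad=True),
    style_lm_logits=torch.randn(B, T, V),
    target_ids=torch.randint(0, V, (B, T)),
    style_clf=torch.nn.Linear(H, 1),
    token_embedding=torch.randn(V, H),
)
loss.backward()

The two auxiliary weights (kl_weight, style_weight) are hypothetical knobs for trading style intensity against contextual coherence; the paper's actual weighting scheme is not reproduced here.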
Pages: 1548-1559
Page count: 12
Related Papers
50 items in total
  • [41] Memorisation versus Generalisation in Pre-trained Language Models
    Tanzer, Michael
    Ruder, Sebastian
    Rei, Marek
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 7564 - 7578
  • [42] Evaluating the Summarization Comprehension of Pre-Trained Language Models
    Chernyshev, D. I.
    Dobrov, B. V.
    [J]. LOBACHEVSKII JOURNAL OF MATHEMATICS, 2023, 44 (08) : 3028 - 3039
  • [43] Pre-trained models for natural language processing: A survey
    QIU XiPeng
    SUN TianXiang
    XU YiGe
    SHAO YunFan
    DAI Ning
    HUANG XuanJing
    [J]. Science China Technological Sciences, 2020, 63 (10) : 1872 - 1897
  • [44] Understanding Online Attitudes with Pre-Trained Language Models
    Power, William
    Obradovic, Zoran
    [J]. PROCEEDINGS OF THE 2023 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING, ASONAM 2023, 2023, : 745 - 752
  • [45] Compressing Pre-trained Language Models by Matrix Decomposition
    Ben Noach, Matan
    Goldberg, Yoav
    [J]. 1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (AACL-IJCNLP 2020), 2020, : 884 - 889
  • [46] On the Sentence Embeddings from Pre-trained Language Models
    Li, Bohan
    Zhou, Hao
    He, Junxian
    Wang, Mingxuan
    Yang, Yiming
    Li, Lei
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 9119 - 9130
  • [47] Pre-trained language models for keyphrase prediction: A review
    Umair, Muhammad
    Sultana, Tangina
    Lee, Young-Koo
    [J]. ICT EXPRESS, 2024, 10 (04): : 871 - 890
  • [49] Evaluating and Inducing Personality in Pre-trained Language Models
    Jiang, Guangyuan
    Xu, Manjie
    Zhu, Song-Chun
    Han, Wenjuan
    Zhang, Chi
    Zhu, Yixin
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,