QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search

Times cited: 0
Authors
Xie, Jian [1 ]
Liang, Yidan [2 ]
Liu, Jingping [3 ]
Xiao, Yanghua [1 ]
Wu, Baohua [2 ]
Ni, Shenghua [2 ]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Data Sci, Shanghai, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
[3] East China Univ Sci & Technol, Sch Informat Sci & Engn, Shanghai, Peoples R China
Keywords
Continual Pre-training; Query Understanding; Travel Domain Search;
DOI
10.1145/3580305.3599891
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
In light of the success of pre-trained language models (PLMs), continual pre-training of generic PLMs has become the standard paradigm for domain adaptation. In this paper, we propose QUERT, a continual pre-trained language model for QUERy understanding in Travel domain search. QUERT is jointly trained on four pre-training tasks tailored to the characteristics of queries in travel domain search: Geography-aware Mask Prediction, Geohash Code Prediction, User Click Behavior Learning, and Phrase and Token Order Prediction. Performance improvements on downstream tasks and ablation experiments demonstrate the effectiveness of the proposed pre-training tasks. Specifically, the average performance on downstream tasks increases by 2.02% and 30.93% in the supervised and unsupervised settings, respectively. To assess the benefit of QUERT to the online business, we deploy it and perform A/B testing on the Fliggy APP. The results show that using QUERT as the encoder increases the Unique Click-Through Rate and Page Click-Through Rate by 0.89% and 1.03%, respectively. Resources are available at https://github.com/hsaest/QUERT.
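The abstract names the four pre-training tasks but gives no implementation detail. As illustration only, below is a minimal sketch of how one of them, Geography-aware Mask Prediction, might be realized: geographic tokens in a travel query are masked with a higher probability than ordinary tokens before masked-language-model continual pre-training. The function name, probabilities, and the toy geo_vocab gazetteer are assumptions made here for clarity, not the authors' actual code (see the linked repository for that).

```python
import random

MASK_TOKEN = "[MASK]"

def geography_aware_mask(tokens, geo_vocab, mask_prob=0.15, geo_boost=0.5):
    """Select MLM targets for continual pre-training, preferring geographic terms.

    Illustrative sketch only: tokens found in `geo_vocab` are masked with
    probability mask_prob + geo_boost; all other tokens with mask_prob.
    """
    masked, labels = [], []
    for tok in tokens:
        p = mask_prob + (geo_boost if tok in geo_vocab else 0.0)
        if random.random() < p:
            masked.append(MASK_TOKEN)
            labels.append(tok)      # the model must recover the original token
        else:
            masked.append(tok)
            labels.append(None)     # position ignored in the MLM loss
    return masked, labels

# Example: a travel-domain query containing a destination entity.
geo_vocab = {"hangzhou", "west", "lake"}          # toy gazetteer, illustrative only
query = "hotels near west lake hangzhou".split()
print(geography_aware_mask(query, geo_vocab))
```

In such a scheme, biasing the mask distribution toward location terms forces the encoder to model geographic context, which is the intuition behind a geography-aware masking objective; the other three tasks described in the abstract would add their own objectives on top of this.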
Pages: 5282-5291
Number of pages: 10
Related papers (50 in total)
  • [41] A Domain-adaptive Pre-training Approach for Language Bias Detection in News
    Krieger, Jan-David
    Spinde, Timo
    Ruas, Terry
    Kulshrestha, Juhi
    Gipp, Bela
    2022 ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL), 2022,
  • [42] Pre-training for Spoken Language Understanding with Joint Textual and Phonetic Representation Learning
    Chen, Qian
    Wang, Wen
    Zhang, Qinglin
    INTERSPEECH 2021, 2021, : 1244 - 1248
  • [43] Removing Backdoors in Pre-trained Models by Regularized Continual Pre-training
    Zhu, Biru
    Cui, Ganqu
    Chen, Yangyi
    Qin, Yujia
    Yuan, Lifan
    Fu, Chong
    Deng, Yangdong
    Liu, Zhiyuan
    Sun, Maosong
    Gu, Ming
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2023, 11 : 1608 - 1623
  • [44] Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
    Zhuge, Mingchen
    Gao, Dehong
    Fan, Deng-Ping
    Jin, Linbo
    Chen, Ben
    Zhou, Haoming
    Qiu, Minghui
    Shao, Ling
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 12642 - 12652
  • [45] MarkupLM: Pre-training of Text and Markup Language for Visually Rich Document Understanding
    Li, Junlong
    Xu, Yiheng
    Cui, Lei
    Wei, Furu
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 6078 - 6087
  • [46] Continual Mixed-Language Pre-Training for Extremely Low-Resource Neural Machine Translation
    Liu, Zihan
    Winata, Genta Indra
    Fung, Pascale
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 2706 - 2718
  • [47] A New Pre-training Method for Training Deep Learning Models with Application to Spoken Language Understanding
    Celikyilmaz, Asli
    Sarikaya, Ruhi
    Hakkani-Tur, Dilek
    Liu, Xiaohu
    Ramesh, Nikhil
    Tur, Gokhan
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3255 - 3259
  • [48] Task-adaptive Pre-training and Self-training are Complementary for Natural Language Understanding
    Li, Shiyang
    Yavuz, Semih
    Chen, Wenhu
    Yan, Xifeng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1006 - 1015
  • [49] MGeo: Multi-Modal Geographic Language Model Pre-Training
    Ding, Ruixue
    Chen, Boli
    Xie, Pengjun
    Huang, Fei
    Li, Xin
    Zhang, Qiang
    Xu, Yao
    PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023, : 185 - 194
  • [50] Knowledge distilled pre-training model for vision-language-navigation
    Huang, Bo
    Zhang, Shuai
    Huang, Jitao
    Yu, Yijun
    Shi, Zhicai
    Xiong, Yujie
    APPLIED INTELLIGENCE, 2023, 53 : 5607 - 5619