QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search

Times Cited: 0
Authors
Xie, Jian [1 ]
Liang, Yidan [2 ]
Liu, Jingping [3 ]
Xiao, Yanghua [1 ]
Wu, Baohua [2 ]
Ni, Shenghua [2 ]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Data Sci, Shanghai, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
[3] East China Univ Sci & Technol, Sch Informat Sci & Engn, Shanghai, Peoples R China
Keywords
Continual Pre-training; Query Understanding; Travel Domain Search;
DOI
10.1145/3580305.3599891
Chinese Library Classification (CLC) Number
TP [Automation Technology; Computer Technology]
Discipline Classification Code
0812
Abstract
In light of the success of pre-trained language models (PLMs), continual pre-training of generic PLMs has become the standard paradigm for domain adaptation. In this paper, we propose QUERT, a Continual Pre-trained Language Model for QUERy Understanding in Travel Domain Search. QUERT is jointly trained on four pre-training tasks tailored to the characteristics of queries in travel domain search: Geography-aware Mask Prediction, Geohash Code Prediction, User Click Behavior Learning, and Phrase and Token Order Prediction. Performance improvements on downstream tasks and ablation experiments demonstrate the effectiveness of the proposed pre-training tasks. Specifically, the average performance on downstream tasks increases by 2.02% and 30.93% in the supervised and unsupervised settings, respectively. To verify the benefit of QUERT to the online business, we deploy QUERT and perform A/B testing on the Fliggy APP. The online feedback shows that, with QUERT applied as the encoder, the Unique Click-Through Rate and Page Click-Through Rate increase by 0.89% and 1.03%, respectively. Resources are available at https://github.com/hsaest/QUERT.
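The abstract names Geohash Code Prediction as one of the four tailored pre-training tasks. As an illustration only (not the authors' released code), the Python sketch below shows how a point of interest's latitude and longitude can be encoded into a standard geohash string, the kind of discrete geographic code such a task could use as a prediction target; the function name, precision, and example coordinates are illustrative assumptions.

# Illustrative sketch, not from the QUERT paper: standard geohash encoding.
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet

def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode (lat, lon) into a geohash of `precision` base-32 characters."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    bits, code = [], []
    use_lon = True  # geohash interleaves bits, starting with longitude
    while len(code) < precision:
        rng, val = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2.0
        if val >= mid:
            bits.append(1)
            rng[0] = mid
        else:
            bits.append(0)
            rng[1] = mid
        use_lon = not use_lon
        if len(bits) == 5:  # every 5 bits map to one base-32 character
            code.append(_BASE32[int("".join(map(str, bits)), 2)])
            bits = []
    return "".join(code)

if __name__ == "__main__":
    # Approximate coordinates of West Lake, Hangzhou -> prints "wtmkn1"
    print(geohash_encode(30.2432, 120.1500))

In such a setup, nearby points of interest share geohash prefixes, so predicting the code from query text encourages the encoder to capture geographic proximity.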
Pages: 5282-5291
Number of Pages: 10