Test-time Adaptation for Machine Translation Evaluation by Uncertainty Minimization

Cited by: 0
Authors
Zhan, Runzhe [1]
Liu, Xuebo [2]
Wong, Derek F. [2]
Zhang, Cuilian [1]
Chao, Lidia S. [1]
Zhang, Min
Affiliations
[1] Univ Macau, Dept Comp & Informat Sci, NLP2CT Lab, Taipa, Macau, Peoples R China
[2] Harbin Inst Technol, Inst Comp & Intelligence, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Neural metrics have recently received considerable attention from the research community for the automatic evaluation of machine translation. Unlike text-based metrics, whose evaluation mechanisms are interpretable and consistent across data sources, neural metrics remain of uncertain reliability on out-of-distribution data because of the disparity between training data and real-world data. This paper aims to address the inference bias of neural metrics through uncertainty minimization at test time, without requiring additional data. The proposed method comprises three steps: uncertainty estimation, test-time adaptation, and inference. Specifically, the model uses the prediction uncertainty on the current data as a signal to update a small fraction of its parameters at test time, and subsequently refines its prediction through this optimization. To validate the approach, we apply the proposed method to three representative models and conduct experiments on the WMT21 benchmarks. Results from both in-domain and out-of-distribution evaluations consistently show improved correlation performance across the different models. Furthermore, we provide evidence that the proposed method effectively reduces model uncertainty. The code is publicly available at https://github.com/NLP2CT/TaU.
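The three steps named in the abstract (uncertainty estimation, test-time adaptation, inference) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how uncertainty is estimated or which parameters are updated, so the sketch assumes that uncertainty is the variance of scores across a fixed set of dropout masks and that the "small fraction of parameters" is a vector of per-feature gains. All names (`sample_masks`, `stochastic_scores`, `adapt`, `predict`) and the toy linear scorer are hypothetical.

```python
import random


def sample_masks(n_masks, n_features, keep_prob, seed=0):
    """Draw Bernoulli dropout masks once, so uncertainty is comparable across steps."""
    rng = random.Random(seed)
    return [[1.0 if rng.random() < keep_prob else 0.0 for _ in range(n_features)]
            for _ in range(n_masks)]


def stochastic_scores(x, weights, gains, bias, masks, keep_prob):
    """One score per dropout mask. `weights` stay frozen; `gains` are adaptable."""
    scores = []
    for mask in masks:
        s = sum(m * g * w * xi for m, g, w, xi in zip(mask, gains, weights, x))
        scores.append(s / keep_prob + bias)
    return scores


def uncertainty(scores):
    """Step 1 (assumed form): population variance of the stochastic scores."""
    mean = sum(scores) / len(scores)
    return sum((s - mean) ** 2 for s in scores) / len(scores)


def adapt(x, weights, gains, bias, masks, keep_prob, lr=0.5, steps=3):
    """Step 2: a few gradient steps on the variance, updating only the gains."""
    gains = list(gains)  # copy, so the caller's parameters are left untouched
    k = len(masks)
    for _ in range(steps):
        scores = stochastic_scores(x, weights, gains, bias, masks, keep_prob)
        mean = sum(scores) / k
        for i in range(len(gains)):
            # d(score_m)/d(gain_i) for every mask m
            feats = [mask[i] * weights[i] * x[i] / keep_prob for mask in masks]
            feat_mean = sum(feats) / k
            grad = 2.0 / k * sum((s - mean) * (f - feat_mean)
                                 for s, f in zip(scores, feats))
            gains[i] -= lr * grad
    return gains


def predict(x, weights, gains, bias, masks, keep_prob):
    """Step 3: final score as the mean over the stochastic passes."""
    scores = stochastic_scores(x, weights, gains, bias, masks, keep_prob)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    x = [0.5, -1.0, 0.8, 0.3]         # toy features for one test segment
    weights = [0.6, -0.4, 0.9, 0.2]   # frozen "pretrained" weights
    gains, bias, keep_prob = [1.0] * 4, 0.1, 0.8
    masks = sample_masks(16, len(x), keep_prob)
    adapted = adapt(x, weights, gains, bias, masks, keep_prob)
    print(uncertainty(stochastic_scores(x, weights, gains, bias, masks, keep_prob)))
    print(uncertainty(stochastic_scores(x, weights, adapted, bias, masks, keep_prob)))
    print(predict(x, weights, adapted, bias, masks, keep_prob))
```

In this toy setting the variance is a convex quadratic in the gains, so a small learning rate lowers it at every step. Driving it all the way to zero would collapse the scorer, which is why the sketch takes only a few steps; how the actual method controls this is not stated in the abstract.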
Pages: 807-820
Page count: 14