Test-time Adaptation for Machine Translation Evaluation by Uncertainty Minimization

Cited by: 0
Authors
Zhan, Runzhe [1]
Liu, Xuebo [2]
Wong, Derek F. [2]
Zhang, Cuilian [1]
Chao, Lidia S. [1]
Zhang, Min
Affiliations
[1] Univ Macau, Dept Comp & Informat Sci, NLP2CT Lab, Taipa, Macau, Peoples R China
[2] Harbin Inst Technol, Inst Comp & Intelligence, Shenzhen, Peoples R China
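Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes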
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural metrics have recently received considerable attention from the research community for the automatic evaluation of machine translation. Unlike text-based metrics, which have interpretable and consistent evaluation mechanisms across data sources, neural metrics remain of uncertain reliability on out-of-distribution data because of the disparity between training data and real-world data. This paper aims to address the inference bias of neural metrics through uncertainty minimization during test time, without requiring additional data. Our proposed method comprises three steps: uncertainty estimation, test-time adaptation, and inference. Specifically, the model employs the prediction uncertainty of the current data as a signal to update a small fraction of parameters during test time and subsequently refines the prediction through optimization. To validate our approach, we apply the proposed method to three representative models and conduct experiments on the WMT21 benchmarks. The results obtained from both in-domain and out-of-distribution evaluations consistently demonstrate improvements in correlation performance across different models. Furthermore, we provide evidence that the proposed method effectively reduces model uncertainty. The code is publicly available at https://github.com/NLP2CT/TaU.
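The three steps named in the abstract (uncertainty estimation, test-time adaptation, inference) can be illustrated with a toy sketch. This is not the authors' implementation, which lives in the linked repository. Everything below is an assumption made for illustration: the linear "metric", the use of Monte Carlo dropout variance as the uncertainty signal (the abstract does not say how uncertainty is estimated), the per-feature `scale` vector standing in for the "small fraction of parameters", and all numeric values.

```python
import random
import statistics

def sample_masks(n_masks, n_features, drop_prob, rng):
    """Draw inverted-dropout masks: 0 for dropped units, 1/(1-p) for kept ones."""
    keep = 1.0 - drop_prob
    return [[(1.0 / keep) if rng.random() < keep else 0.0
             for _ in range(n_features)]
            for _ in range(n_masks)]

def mc_scores(features, weights, scale, masks):
    """Score one input once per dropout mask (hypothetical linear toy metric)."""
    return [sum(s * f * w * m for s, f, w, m in zip(scale, features, weights, mask))
            for mask in masks]

def uncertainty(scores):
    """Step 1: prediction uncertainty, taken here as the variance of the scores."""
    return statistics.pvariance(scores)

def adapt(features, weights, scale, masks, lr=0.1, steps=5):
    """Step 2: a few gradient steps on `scale` only; `weights` stay frozen."""
    scale = list(scale)
    for _ in range(steps):
        scores = mc_scores(features, weights, scale, masks)
        mean = statistics.fmean(scores)
        for i in range(len(scale)):
            # Analytic d(variance)/d(scale_i) for the linear toy model above.
            grad = 2.0 / len(masks) * sum(
                (score - mean) * features[i] * weights[i] * mask[i]
                for score, mask in zip(scores, masks))
            scale[i] -= lr * grad
    return scale

def predict(features, weights, scale):
    """Step 3: deterministic inference (no dropout) with the adapted scale."""
    return sum(s * f * w for s, f, w in zip(scale, features, weights))

rng = random.Random(0)
features = [0.8, -0.3, 0.5, 1.2]   # made-up encoding of one test segment
weights = [0.6, 0.9, -0.4, 0.7]    # frozen "pretrained" weights (assumed values)
scale = [1.0, 1.0, 1.0, 1.0]       # the small adaptable parameter subset
# The same masks are reused across steps so the toy objective is deterministic.
masks = sample_masks(30, len(features), 0.2, rng)

before = uncertainty(mc_scores(features, weights, scale, masks))
adapted = adapt(features, weights, scale, masks)
after = uncertainty(mc_scores(features, weights, adapted, masks))
score = predict(features, weights, adapted)
print(f"uncertainty: {before:.4f} -> {after:.4f}, score: {score:.4f}")
```

The sketch takes only a few small steps on purpose: in this toy model, minimizing the variance without limit would shrink `scale` toward zero, so the step count and learning rate act as the only safeguard against that collapse.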
Pages: 807-820
Number of pages: 14