Aligning Human and Computational Coherence Evaluations

Authors
Lim, Jia Peng [1]
Lauw, Hady W. [1]
Affiliations
[1] Singapore Management Univ, Sch Comp & Informat Syst, PreferredAI Res Grp, Singapore, Singapore
Funding
National Research Foundation, Singapore
Keywords
VOCABULARY
DOI
10.1162/coli_a_00518
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automated coherence metrics constitute an efficient and popular way to evaluate topic models. Previous work presents a mixed picture of their presumed correlation with human judgment. This work proposes a novel sampling approach to mining topic representations at a large scale while seeking to mitigate bias from sampling, enabling the investigation of widely used automated coherence metrics via large corpora. Additionally, this article proposes a novel user study design, an amalgamation of different proxy tasks, to derive a finer insight into the human decision-making processes. This design subsumes the purpose of simple rating and outlier-detection user studies. Similar to the sampling approach, the user study conducted is extensive, comprising 40 study participants split into eight different study groups tasked with evaluating their respective set of 100 topic representations. Usually, when substantiating the use of these metrics, human responses are treated as the gold standard. This article further investigates the reliability of human judgment by flipping the comparison and conducting a novel extended analysis of human response at the group and individual level against a generic corpus. The investigation results show a moderate to good correlation between these metrics and human judgment, especially for generic corpora, and derive further insights into the human perception of coherence. Analyzing inter-metric correlations across corpora shows moderate to good correlation among these metrics. As these metrics depend on corpus statistics, this article further investigates the topical differences between corpora, revealing nuances in applications of these metrics.
Pages: 893-952
Number of pages: 60