Is my stance the same as your stance? A cross validation study of stance detection datasets

Authors
Ng, Lynnette Hui Xian [1]
Carley, Kathleen M. [1]
Affiliations
[1] Carnegie Mellon Univ, CASOS, Inst Software Res, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
Funding
Andrew Mellon Foundation (United States)
Keywords
Stance detection; Natural language processing; Cross validation; Machine learning; Twitter;
DOI
10.1016/j.ipm.2022.103070
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Stance detection identifies a person's evaluation of a subject, and is a crucial component for many downstream applications. In application, stance detection requires training a machine learning model on an annotated dataset and applying the model to another dataset to predict the stances of text snippets. This cross-dataset model generalization poses three central questions, which we investigate using stance classification models on 7 publicly available English Twitter datasets ranging from 297 to 48,284 instances. (1) Are stance classification models generalizable across datasets? We construct a single-dataset model to train/test dataset-against-dataset, finding that models do not generalize well (avg F1=0.33). (2) Can we improve the generalizability by aggregating datasets? We find that a multi-dataset model built on the aggregation of datasets has improved performance (avg F1=0.69). (3) Given a model built on multiple datasets, how much additional data is required to fine-tune it? We find it challenging to ascertain a minimum number of data points due to the lack of pattern in performance. Investigating possible reasons for the choppy model performance, we find that texts are not easily differentiable by stance, nor are annotations consistent within and across datasets. Our observations emphasize the need for an aggregated dataset as well as consistent labels for the generalizability of models.
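The dataset-against-dataset protocol in question (1) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the helper names (`macro_f1`, `train_majority`, `cross_dataset_f1`), the majority-class placeholder model, and the toy examples are all hypothetical, and the abstract does not state which F1 averaging was used, so macro-averaging is an assumption here.

```python
from collections import Counter
from itertools import permutations


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over labels seen in y_true or y_pred.

    Assumption: macro-averaging; the abstract only reports "avg F1".
    """
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def train_majority(examples):
    """Hypothetical placeholder model: always predicts the most frequent label."""
    majority = Counter(label for _, label in examples).most_common(1)[0][0]
    return lambda text: majority


def cross_dataset_f1(datasets, train_fn):
    """Train on each dataset, test on every other one; return {(train, test): F1}."""
    results = {}
    for train_name, test_name in permutations(datasets, 2):
        model = train_fn(datasets[train_name])
        texts, gold = zip(*datasets[test_name])
        predictions = [model(text) for text in texts]
        results[(train_name, test_name)] = macro_f1(gold, predictions)
    return results


# Toy data invented purely for illustration; not drawn from the paper's datasets.
toy = {
    "A": [("tax cuts help", "favor"), ("tax cuts work", "favor"),
          ("tax cuts hurt", "against")],
    "B": [("ban it now", "against"), ("ban it today", "against"),
          ("keep it legal", "favor")],
}

scores = cross_dataset_f1(toy, train_majority)
avg_f1 = sum(scores.values()) / len(scores)  # mean over all train/test pairs
```

Swapping `train_majority` for a real classifier leaves the evaluation loop unchanged. The multi-dataset setting in question (2) would instead train on the concatenation of several datasets before testing.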
Pages: 15