ShortcutLens: A Visual Analytics Approach for Exploring Shortcuts in Natural Language Understanding Dataset

Cited by: 4
Authors
Jin, Zhihua [1]
Wang, Xingbo [1]
Cheng, Furui [1, 2]
Sun, Chunhui [3]
Liu, Qun [4]
Qu, Huamin [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[2] Swiss Fed Inst Technol, CH-8092 Zurich, Switzerland
[3] Peking Univ, Beijing 100871, Peoples R China
[4] Huawei, Noah's Ark Lab, Hong Kong, Peoples R China
Keywords
Benchmark testing; Task analysis; Natural language processing; Cognition; Guidelines; Predictive models; Computational modeling; Natural language understanding; shortcut; visual analytics
DOI
10.1109/TVCG.2023.3236380
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Benchmark datasets play an important role in evaluating Natural Language Understanding (NLU) models. However, shortcuts (unwanted biases in the benchmark datasets) can damage the effectiveness of benchmark datasets in revealing models' real capabilities. Since shortcuts vary in coverage, productivity, and semantic meaning, it is challenging for NLU experts to systematically understand and avoid them when creating benchmark datasets. In this paper, we develop a visual analytics system, ShortcutLens, to help NLU experts explore shortcuts in NLU benchmark datasets. The system allows users to conduct multi-level exploration of shortcuts. Specifically, Statistics View helps users grasp statistics of the shortcuts in the benchmark dataset, such as coverage and productivity. Template View employs hierarchical and interpretable templates to summarize different types of shortcuts. Instance View allows users to check the corresponding instances covered by the shortcuts. We conduct case studies and expert interviews to evaluate the effectiveness and usability of the system. The results demonstrate that ShortcutLens supports users in gaining a better understanding of benchmark dataset issues through shortcuts, inspiring them to create challenging and pertinent benchmark datasets.
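The abstract names coverage and productivity as shortcut statistics but does not define them in this record. The sketch below assumes one common reading: coverage is the fraction of instances that contain a candidate token, and productivity is the fraction of those instances that carry the token's most frequent label. The function name `shortcut_stats`, the whitespace tokenization, and the toy dataset are all hypothetical and are not taken from ShortcutLens itself.

```python
from collections import Counter


def shortcut_stats(dataset, token):
    """Return (coverage, productivity, majority_label) for one candidate token.

    dataset: list of (text, label) pairs.
    Assumed definitions (not confirmed by the record):
      coverage     = share of instances whose text contains the token
      productivity = share of those instances with the most frequent label
    """
    # Labels of instances whose lowercased, whitespace-split text contains the token.
    covered = [label for text, label in dataset if token in text.lower().split()]
    if not covered:
        return 0.0, 0.0, None
    coverage = len(covered) / len(dataset)
    majority_label, count = Counter(covered).most_common(1)[0]
    productivity = count / len(covered)
    return coverage, productivity, majority_label


# Toy NLI-style data, invented for illustration only.
toy = [
    ("a man is not sleeping", "contradiction"),
    ("nobody is outside", "contradiction"),
    ("the dog is not barking", "contradiction"),
    ("a woman is not indoors", "entailment"),
    ("children play in a park", "neutral"),
]

coverage, productivity, label = shortcut_stats(toy, "not")
print(f"coverage={coverage:.2f} productivity={productivity:.2f} label={label}")
```

Under these assumed definitions, a token with high coverage and high productivity would be a strong shortcut candidate, since a model could predict the label for many instances from that token alone.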
Pages: 3594-3608
Number of pages: 15