Semantic Concept Spaces: Guided Topic Model Refinement using Word-Embedding Projections

Cited by: 27
Authors
El-Assady, Mennatallah [1, 2]
Kehlbeck, Rebecca [1]
Collins, Christopher [2]
Keim, Daniel [1]
Deussen, Oliver [1]
Affiliations
[1] Univ Konstanz, Constance, Germany
[2] Ontario Tech Univ, Oshawa, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Topic Model Optimization; Word Embedding; Mixed-Initiative Refinement; Guided Visual Analytics; Semantic Mapping; VISUAL ANALYTICS;
DOI
10.1109/TVCG.2019.2934654
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
We present a framework that allows users to incorporate the semantics of their domain knowledge for topic model refinement while remaining model-agnostic. Our approach enables users to (1) understand the semantic space of the model, (2) identify regions of potential conflicts and problems, and (3) readjust the semantic relation of concepts based on their understanding, directly influencing the topic modeling. These tasks are supported by an interactive visual analytics workspace that uses word-embedding projections to define concept regions which can then be refined. The user-refined concepts are independent of a particular document collection and can be transferred to related corpora. All user interactions within the concept space directly affect the semantic relations of the underlying vector space model, which, in turn, change the topic modeling. In addition to direct manipulation, our system guides the users' decision-making process through recommended interactions that point out potential improvements. This targeted refinement aims at minimizing the feedback required for an efficient human-in-the-loop process. We confirm the improvements achieved through our approach in two user studies that show topic model quality improvements through our visual knowledge externalization and learning process.
Pages: 1001-1011
Page count: 11
Related papers
50 in total
  • [31] Semantic Service Alignment Using Concept Description Refinement
    Letia, Ioan Alfred
    Pop, Octavian
    12TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING (SYNASC 2010), 2011, : 561 - 564
  • [32] Identifying and Understanding Business Trends using Topic Models with Word Embedding
    Pek, Yun Ning
    Lim, Kwan Hui
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 6177 - 6179
  • [33] Predicting Gene Functional Interactions Using Semantic Word Embedding
    Roy, Arpita
    Pan, Shimei
    2016 IEEE 28TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2016), 2016, : 509 - 513
  • [34] COVID-19 Article Classification Using Word-Embedding and Extreme Learning Machine with Various Kernels
    Vijayvargiya, Sanidhya
    Kumar, Lov
    Malapati, Aruna
    Murthy, Lalita Bhanu
    Krishna, Aneesh
    ADVANCED INFORMATION NETWORKING AND APPLICATIONS, AINA-2022, VOL 3, 2022, 451 : 69 - 81
  • [35] Integrating Text Classification into Topic Discovery Using Semantic Embedding Models
    Lezama-Sanchez, Ana Laura
    Vidal, Mireya Tovar
    Reyes-Ortiz, Jose A.
    APPLIED SCIENCES-BASEL, 2023, 13 (17):
  • [36] Modelling the Semantic Change Dynamics using Diachronic Word Embedding
    Boukhaled, Mohamed Amine
    Fagard, Benjamin
    Poibeau, Thierry
    PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE (ICAART), VOL 2, 2019, : 944 - 951
  • [37] The Semantic Similarity Relation of Entities Discovery: Using Word Embedding
    Ruan, Dong-ru
    Mao, Yu-xin
    Pan, Hong-yan
    Gao, Kai
    2017 9TH INTERNATIONAL CONFERENCE ON MODELLING, IDENTIFICATION AND CONTROL (ICMIC 2017), 2017, : 845 - 850
  • [39] A Correlated Topic Model Using Word Embeddings
    Xun, Guangxu
    Li, Yaliang
    Zhao, Wayne Xin
    Gao, Jing
    Zhang, Aidong
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 4207 - 4213
  • [40] Adding semantic retrieving concept model into the discussion board of learning community by using topic maps
    Yang, SJH
    Fan, TCW
    Chen, IYL
    5TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED LEARNING TECHNOLOGIES, PROCEEDINGS, 2005, : 122 - 126