TopicLens: Efficient Multi-Level Visual Topic Exploration of Large-Scale Document Collections

Cited by: 59
Authors
Kim, Minjeong [1]
Kang, Kyeongpil [1]
Park, Deokgun [2]
Choo, Jaegul [1]
Elmqvist, Niklas [2]
Affiliations
[1] Korea Univ, Seoul, South Korea
[2] Univ Maryland, College Pk, MD 20742 USA
Funding
National Research Foundation, Singapore;
Keywords
topic modeling; nonnegative matrix factorization; t-distributed stochastic neighbor embedding; magic lens; text analytics; NONNEGATIVE MATRIX; VISUALIZATION; ANALYTICS;
DOI
10.1109/TVCG.2016.2598445
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Topic modeling, which reveals underlying topics of a document corpus, has been actively adopted in visual analytics for large-scale document collections. However, due to its significant processing time and non-interactive nature, topic modeling has so far not been tightly integrated into a visual analytics workflow. Instead, most such systems are limited to utilizing a fixed, initial set of topics. Motivated by this gap in the literature, we propose a novel interaction technique called TopicLens that allows a user to dynamically explore data through a lens interface where topic modeling and the corresponding 2D embedding are efficiently computed on the fly. To support this interaction in real time while maintaining view consistency, we propose a novel efficient topic modeling method and a semi-supervised 2D embedding algorithm. Our work is based on improving state-of-the-art methods such as nonnegative matrix factorization and t-distributed stochastic neighbor embedding. Furthermore, we have built a web-based visual analytics system integrated with TopicLens. We use this system to measure the performance and the visualization quality of our proposed methods. We provide several scenarios showcasing the capability of TopicLens using real-world datasets.
Pages: 151-160
Page count: 10
Related papers
50 in total
  • [1] Integrating multi-level deep learning and concept ontology for large-scale visual recognition
    Kuang, Zhenzhong
    Yu, Jun
    Li, Zongmin
    Zhang, Baopeng
    Fan, Jianping
    [J]. PATTERN RECOGNITION, 2018, 78 : 198 - 214
  • [2] RESEARCH ON EFFICIENT INDEXING OF LARGE-SCALE GEOSPATIAL DATA BASED ON MULTI-LEVEL GEOGRAPHIC GRID
    Gao, Yin
    Duo, Hairui
    Che, Jian
    Zhao, Shiquan
    Zhao, Bianli
    [J]. GEOSPATIAL WEEK 2023, VOL. 10-1, 2023 : 73 - 80
  • [3] Fast Multi-Level Connected Component Labeling for Large-scale Images
    Li, Yuhai
    [J]. 2015 INTERNATIONAL CONFERENCE ON OPTOELECTRONICS AND MICROELECTRONICS (ICOM), 2015 : 334 - 337
  • [4] A multi-level conflict resolution method for large-scale robot system
    Liu, Shuhua
    Tian, Yantao
    Lin, Heping
    [J]. WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS, 2006 : 109 - 109
  • [5] Multi-level optimization strategies for large-scale nonlinear process systems
    Biegler, Lorenz T.
    [J]. COMPUTERS & CHEMICAL ENGINEERING, 2024, 185
  • [6] EEMC: An energy-efficient multi-level clustering algorithm for large-scale wireless sensor networks
    Jin, Yan
    Wang, Ling
    Kim, Yoohwan
    Yang, Xiaozong
    [J]. COMPUTER NETWORKS, 2008, 52 (03) : 542 - 562
  • [7] SwiftLink: Serendipitous Navigation Strategy for Large-scale Document Collections
    von Wyl, Marc
    Marchand-Maillet, Stephane
    [J]. 2012 23RD INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS (DEXA), 2012 : 83 - 87
  • [8] Visual Exploration of Large-Scale Evolving Software
    Wettel, Richard
    [J]. 2009 31ST INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, COMPANION VOLUME, 2009 : 391 - 394
  • [9] Visual Exploration of Large-Scale System Evolution
    Wettel, Richard
    Lanza, Michele
    [J]. FIFTEENTH WORKING CONFERENCE ON REVERSE ENGINEERING, PROCEEDINGS, 2008 : 219 - 228
  • [10] SRTM: a supervised relation topic model for multi-classification on large-scale document network
    Li, Chunshan
    Zhang, Hua
    Chu, Dianhui
    Xu, Xiaofei
    [J]. NEURAL COMPUTING & APPLICATIONS, 2020, 32 (10) : 6383 - 6392