OLAP on Multidimensional Text Databases: Topic Network Cube and its Applications

Authors
Zhang, Zhiyuan [1 ]
Wang, Hong [1 ]
Feng, Xingjie [1 ]
Affiliation
[1] Civil Aviation University of China, School of Computer Science and Technology, Tianjin, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
multidimensional text database; topic network cube; OLAP; text mining; complex network
DOI
10.2298/FIL1805973Z
Chinese Library Classification
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
Multidimensional text data contains both structured attributes and unstructured text. Unlike traditional numerical data, it does not lend itself straightforwardly to online analytical processing (OLAP). Although OLAP methods such as topic cube have been proposed to exploit both the structured information and the text, these methods cannot capture the relations between topic words. Considering that topics usually consist of several subtopics, each of which contains some topic words, we use a topic network, in which related topic words are connected, to express the complex relations among topics. This paper introduces a new concept, the topic network cube, for performing OLAP analysis on multidimensional text databases. First, we propose a method called GL-LDA, based on the Gibbs sampling outputs of Labeled LDA, to measure the relations between topic words. Second, we give a storage model for the topic network cube that can efficiently generate topic networks using GL-LDA. Third, we show how to perform OLAP analysis on the topic network cube. Experimental results show that multidimensional text databases can be analyzed at different granularities easily and effectively using just a few simple SQL statements, and that the output network provides rich and useful information about topics.
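To make the idea of OLAP over a topic network with "a few simple SQL statements" more concrete, the following is a minimal sketch and not the paper's storage model. Everything in it is an assumption made for illustration: the table name `topic_edges`, the dimensions `year` and `region`, the sample rows, and the use of a summed `weight` as a stand-in for the GL-LDA relation measure. It shows a roll-up over one dimension producing the edge list of a coarser-grained topic network.

```python
import sqlite3


def build_demo_cube():
    """Create an in-memory table of HYPOTHETICAL topic-word edges per cell.

    Each row is one edge (word_a, word_b) of a topic network stored for
    the cell (year, region). The schema and data are illustrative only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE topic_edges ("
        "year INTEGER, region TEXT, word_a TEXT, word_b TEXT, weight REAL)"
    )
    rows = [
        (2017, "north", "engine", "failure", 3.0),
        (2017, "south", "engine", "failure", 2.0),
        (2017, "south", "engine", "oil", 1.0),
        (2018, "north", "runway", "weather", 4.0),
    ]
    conn.executemany("INSERT INTO topic_edges VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def roll_up_by_year(conn, year):
    """Roll up over the region dimension for one year.

    Summing edge weights is an assumed aggregation, chosen only to show
    the shape of the query; the paper may aggregate differently.
    """
    cur = conn.execute(
        "SELECT word_a, word_b, SUM(weight) FROM topic_edges "
        "WHERE year = ? GROUP BY word_a, word_b ORDER BY word_a, word_b",
        (year,),
    )
    return cur.fetchall()
```

Calling `roll_up_by_year(build_demo_cube(), 2017)` returns one row per topic-word pair with the weights of the two regions merged; a drill-down would be the same query with `region` added to the `WHERE` or `GROUP BY` clause.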
Pages: 1973-1982
Page count: 10