MiTexCube: MicroTextCluster Cube for Online Analysis of Text Cells and its Applications

Cited by: 4
Authors
Zhang, Duo [1]
Zhai, ChengXiang [1]
Han, Jiawei [1]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
Funding
U.S. National Science Foundation
Keywords
MiTexCube; multidimensional text database; text mining
DOI
10.1002/sam.11159
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A fundamental problem of multidimensional text database analysis is efficient and effective support of various kinds of online applications, such as summarizing the content of a text cell or comparing the contents across multiple text cells. In this paper, we propose a new infrastructure called MicroTextCluster Cube (or MiTexCube) to support efficient online text analysis on multidimensional text databases by introducing micro-clusters of text documents as a compact representation of text content. Experimental results on real multidimensional text databases show that (i) MiTexCube can be materialized efficiently with reasonable overhead in space, and (ii) applications based on the proposed materialized MiTexCube are more efficient than the baseline method of direct analysis based on document units in each cell, without sacrificing much quality of analysis, and MiTexCube naturally accommodates flexible trade-off between efficiency and quality of analysis. © 2012 Wiley Periodicals, Inc.
Pages: 243-259
Number of pages: 17