MiTexCube: MicroTextCluster Cube for Online Analysis of Text Cells and its Applications

Cited by: 4
Authors
Zhang, Duo [1]
Zhai, ChengXiang [1]
Han, Jiawei [1]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
Funding
National Science Foundation (USA)
Keywords
MiTexCube; multidimensional text database; text mining;
DOI
10.1002/sam.11159
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A fundamental problem of multidimensional text database analysis is efficient and effective support of various kinds of online applications, such as summarizing the content of a text cell or comparing the contents across multiple text cells. In this paper, we propose a new infrastructure called MicroTextCluster Cube (or MiTexCube) to support efficient online text analysis on multidimensional text databases by introducing micro-clusters of text documents as a compact representation of text content. Experimental results on real multidimensional text databases show that (i) MiTexCube can be materialized efficiently with reasonable overhead in space, and (ii) applications based on the proposed materialized MiTexCube are more efficient than the baseline method of direct analysis based on document units in each cell, without sacrificing much quality of analysis, and MiTexCube naturally accommodates flexible trade-off between efficiency and quality of analysis. (c) 2012 Wiley Periodicals, Inc.
Pages: 243-259
Number of pages: 17
Related papers
50 in total
  • [41] Biomedical text mining and its applications in cancer research
    Zhu, Fei
    Patumcharoenpol, Preecha
    Zhang, Cheng
    Yang, Yang
    Chan, Jonathan
    Meechai, Asawin
    Vongsangnak, Wanwipa
    Shen, Bairong
    JOURNAL OF BIOMEDICAL INFORMATICS, 2013, 46 (02) : 200 - 211
  • [42] Independent Knowledge Extraction in Nature of Humorous Text Analysis Review Using Online Text Analysis Tool
    Robin, J. Emmanual
    Krishnamoorthy, N.
    Karthikeyan, M.
    Felix, A. John
    2014 INTERNATIONAL CONFERENCE ON COMMUNICATION AND NETWORK TECHNOLOGIES (ICCNT), 2014 : 162 - 164
  • [43] Dynamic sampling of text streams and its application in text analysis
    Gang Tian
    Jiajia Huang
    Min Peng
    Jiahui Zhu
    Yanchun Zhang
    Knowledge and Information Systems, 2017, 53 : 507 - 531
  • [44] RETRACTED: Qualitative Analysis of Text Summarization Techniques and Its Applications in Health Domain (Retracted Article)
    Yadav, Divakar
    Lalit, Naman
    Kaushik, Riya
    Singh, Yogendra
    Yadav, Arun Kr.
    Bhadane, Kishor V.
    Kumar, Adarsh
    Khan, Baseem
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022
  • [45] Manga Text Detection with Manga-Specific Data Augmentation and Its Applications on Emotion Analysis
    Yang, Yi-Ting
    Chu, Wei-Ta
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2023, 13834 LNCS : 29 - 40
  • [46] Dynamic sampling of text streams and its application in text analysis
    Tian, Gang
    Huang, Jiajia
    Peng, Min
    Zhu, Jiahui
    Zhang, Yanchun
    KNOWLEDGE AND INFORMATION SYSTEMS, 2017, 53 (02) : 507 - 531
  • [47] Manga Text Detection with Manga-Specific Data Augmentation and Its Applications on Emotion Analysis
    Yang, Yi-Ting
    Chu, Wei-Ta
    MULTIMEDIA MODELING, MMM 2023, PT II, 2023, 13834 : 29 - 40
  • [48] RAINFALL INTERVENTION ANALYSIS FOR ONLINE APPLICATIONS
    SASTRI, T
    VALDES, JB
    JOURNAL OF WATER RESOURCES PLANNING AND MANAGEMENT-ASCE, 1989, 115 (04) : 397 - 415
  • [49] Online Component Analysis, Architectures and Applications
    Souza Filho, Joao B. O.
    Van, Lan-Da
    Jung, Tzyy-Ping
    Diniz, Paulo S. R.
    FOUNDATIONS AND TRENDS IN SIGNAL PROCESSING, 2022, 16 (3-4) : 224 - 429
  • [50] Review of Methods and Applications of Text Sentiment Analysis
    Jiawa Z.
    Wei L.
    Sili W.
    Heng Y.
    Data Analysis and Knowledge Discovery, 2021, 54 (06) : 1 - 13