Multidimensional analysis model for a document warehouse that includes textual measures

Cited by: 6
Authors
Mendoza, Martha [1,2]
Alegria, Erwin [1]
Maca, Manuel [1]
Cobos, Carlos [1,2]
Leon, Elizabeth [3]
Affiliations
[1] Univ Cauca, Informat Technol Res Grp GTI, Popayan, Colombia
[2] Univ Cauca, Elect & Telecommun Engn Fac, Popayan, Colombia
[3] Univ Nacl Colombia, Fac Engn, Medellin, Antioquia, Colombia
Keywords
Document warehouse; OLAP; Textual measures; Text warehouse; ALGORITHM;
DOI
10.1016/j.dss.2015.02.008
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data warehouses and On-Line Analytical Processing (OLAP) tools together permit a multi-dimensional analysis of structured data. However, as business systems are increasingly required to handle substantial quantities of unstructured textual information, the need arises for a similarly effective means of analysis. To manage unstructured text data stored in data warehouses, a new multi-dimensional analysis model is proposed that includes textual measures as well as a topic hierarchy. In this model, the textual measures that associate the topics with the text documents are generated by Probabilistic Latent Semantic Analysis, while the hierarchy is created automatically using a clustering algorithm. Documents can then be queried using OLAP tools. The model was evaluated from two viewpoints: query execution time and user satisfaction. Execution time was evaluated on scientific articles using two query types, and user satisfaction (with query time and ease of use) was evaluated using statistical frequency and multivariate analyses. Encouraging observations included that, as the number of documents increases, query time grows linearly rather than exponentially. In addition, the model gained increasing acceptance with use, and its visualization was also well received by users. (C) 2015 Elsevier B.V. All rights reserved.
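To make the idea of a textual measure combined with a topic hierarchy more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a PLSA-style model has already produced a per-document topic distribution P(topic | document), and that a clustering step has already grouped leaf topics under parent topics. All document names, topic names, probabilities, the `year` dimension, and the helper functions `roll_up` and `average_by` are hypothetical and chosen only for illustration. Averaging the weights per group is one possible aggregation, not necessarily the one used in the paper.

```python
from collections import defaultdict

# Hypothetical textual measure: P(topic | document), as a PLSA-style
# model might produce. All values are invented for illustration.
doc_topic = {
    "doc1": {"olap": 0.75, "clustering": 0.125, "plsa": 0.125},
    "doc2": {"olap": 0.5, "clustering": 0.25, "plsa": 0.25},
    "doc3": {"olap": 0.125, "clustering": 0.375, "plsa": 0.5},
}

# Hypothetical topic hierarchy (leaf topic -> parent topic), standing in
# for the hierarchy that a clustering algorithm would build.
topic_parent = {
    "olap": "data warehousing",
    "clustering": "text mining",
    "plsa": "text mining",
}

# Hypothetical conventional dimension attached to each document.
doc_year = {"doc1": 2013, "doc2": 2013, "doc3": 2014}


def roll_up(doc_topic, topic_parent):
    """Sum leaf-topic weights into their parent topics, per document."""
    rolled = {}
    for doc, weights in doc_topic.items():
        parent_weights = defaultdict(float)
        for topic, weight in weights.items():
            parent_weights[topic_parent[topic]] += weight
        rolled[doc] = dict(parent_weights)
    return rolled


def average_by(rolled, doc_dim):
    """Average the rolled-up topic weights over each dimension value."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for doc, weights in rolled.items():
        key = doc_dim[doc]
        counts[key] += 1
        for topic, weight in weights.items():
            sums[key][topic] += weight
    return {
        key: {topic: total / counts[key] for topic, total in topics.items()}
        for key, topics in sums.items()
    }


rolled = roll_up(doc_topic, topic_parent)   # roll-up along the topic hierarchy
cube = average_by(rolled, doc_year)         # group by the year dimension

for year, topics in sorted(cube.items()):
    print(year, topics)
```

In this sketch the roll-up step plays the role of moving one level up the topic hierarchy, and the grouping step mimics an OLAP query that slices the textual measure by a conventional dimension.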
Pages: 44-59
Page count: 16
Related papers
50 in total
  • [1] Multiversion Document Warehouse: An Approach to Multidimensional Analysis
    Khrouf, Kais
    Feki, Jamel
    Soule-Dupuy, Chantal
    [J]. JOURNAL OF INTELLIGENCE STUDIES IN BUSINESS, 2012, 2 (01): 32-40
  • [2] An XML Document Warehouse model
    Nassis, Vicky
    Dillon, Tharam S.
    Rajagopalapillai, Rajugan
    Rahayu, Wenny
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PROCEEDINGS, 2006, 3882: 513-529
  • [3] Transform a Document Warehouse Model into NoSQL Document-Oriented Model
    Ben Messaoud, Ines
    Alzaidi, Amer
    Fattouch, Najla
    Ajala, Assia
    [J]. VISION 2020: SUSTAINABLE ECONOMIC DEVELOPMENT AND APPLICATION OF INNOVATION MANAGEMENT, 2018: 3908-3920
  • [4] Enhancing the Diamond Document Warehouse Model
    Azabou, Maha
    Banjar, Ameen
    Feki, Jamel Omar
    [J]. INTERNATIONAL JOURNAL OF DATA WAREHOUSING AND MINING, 2020, 16 (04): 1-25
  • [5] Fuzzy spatial data warehouse: A multidimensional model
    David, Perez
    Somodevilla, Maria J.
    Pineda, No H.
    [J]. ENC 2007: EIGHTH MEXICAN INTERNATIONAL CONFERENCE ON CURRENT TRENDS IN COMPUTER SCIENCE, PROCEEDINGS, 2007: 3-9
  • [6] A distributed multidimensional data model of data warehouse
    Lin, YF
    Huang, HK
    Li, HS
    [J]. ROUGH SETS, FUZZY SETS, DATA MINING, AND GRANULAR COMPUTING, 2003, 2639: 664-667
  • [7] Empirical Validation of Multidimensional Model for Data Warehouse
    Mann, Suman
    Bharti
    Singh, Perminder
    [J]. 2014 3RD INTERNATIONAL CONFERENCE ON RELIABILITY, INFOCOM TECHNOLOGIES AND OPTIMIZATION (ICRITO) (TRENDS AND FUTURE DIRECTIONS), 2014
  • [8] AN OBJECT ORIENTED MULTIDIMENSIONAL MODEL FOR DATA WAREHOUSE
    Gosain, Anjana
    Mann, Suman
    [J]. FOURTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2011): COMPUTER VISION AND IMAGE ANALYSIS: PATTERN RECOGNITION AND BASIC TECHNOLOGIES, 2012, 8350
  • [9] Data Warehouse Designing for Vietnamese Textual Document-based Plagiarism Detection System
    Phan Hieu Ho
    Trung Hung Yo
    Ngoc Anh Thi Nguyen
    [J]. 2017 INTERNATIONAL CONFERENCE ON SYSTEM SCIENCE AND ENGINEERING (ICSSE), 2017: 239-243
  • [10] Using ORACLE tools to generate Multidimensional Model in Warehouse
    Wiak, Slawomir
    Drzymala, Pawel
    Welfle, Henryk
    [J]. PRZEGLAD ELEKTROTECHNICZNY, 2012, 88 (1A): 257-262