Data Mining from NoSQL Document-Append Style Storages

Cited by: 1
Authors
Lomotey, Richard K. [1]
Deters, Ralph [1]
Affiliations
[1] Univ Saskatchewan, Dept Comp Sci, Saskatoon, SK S7N 0W0, Canada
Keywords
Data mining; NoSQL; Bayesian Rule; Unstructured data; Apriori; Big Data
DOI
10.1109/ICWS.2014.62
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The modern data economy, often described as "Big Data", has changed the status quo of digital content creation and storage. While data storage has followed the schema-dictated approach for decades, today's digital content is largely unstructured, which creates the need for different storage techniques. NoSQL database systems have therefore been proposed to accommodate most of the content being generated today. One type of NoSQL database that has received significant enterprise adoption is the document-append style storage. The emerging challenge, however, is that research and tools that can aid data mining from such NoSQL databases are generally lacking. Even though document-append style storages expose data as Web services and over URL/URI, building a corresponding data mining tool deviates from the techniques underlying web crawlers. Also, existing data mining tools designed for schema-based storages (e.g., RDBMS) are misfits. Hence, our goal in this work is to design a data analytics tool that enables knowledge discovery through information retrieval from document-append style storage. The tool is algorithmically built on the inference-based Apriori, which helps us optimize the search duration. Preliminary test results also show that the proposed tool achieves high accuracy in comparison to previously proposed approaches.
Pages: 385-392
Number of pages: 8
Related papers
50 in total
  • [31] Extraction process of conceptual model from a document-oriented NoSQL database
    Ait Brahim, Amal
    Tighilt Ferhat, Rabah
    Zurfluh, Gilles
    PROCEEDINGS OF 2019 11TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SYSTEMS ENGINEERING (KSE 2019), 2019, : 391 - 395
  • [32] Correction: Extraction of Semantic Links from a Document-Oriented NoSQL Database
    Fatma Abdelhedi
    Hela Rajhi
    Gilles Zurfluh
    SN Computer Science, 4 (2)
  • [33] Music Style Recognition System Based on the Data Mining
    Bai, Xueliang
    BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2019, 124 : 101 - 101
  • [34] The recognition of the architectural style using Data Mining techniques
    Mercioni, Marina Adriana
    Holban, Stefan
    2018 IEEE 12TH INTERNATIONAL SYMPOSIUM ON APPLIED COMPUTATIONAL INTELLIGENCE AND INFORMATICS (SACI), 2018, : 331 - 337
  • [35] Data mining techniques for structure of single XML document
    Mei, Dong-Xia
    Zhang, Xiao-Ming
    Shiyou Huagong Gaodeng Xuexiao Xuebao/Journal of Petrochemical Universities, 2007, 20 (01): : 94 - 98
  • [36] Relations and GUHA-style data mining II
    Hájek, P
    RELATIONAL AND KLEENE-ALGEBRAIC METHODS IN COMPUTER SCIENCE, 2003, 3051 : 163 - 170
  • [37] Application of Data Mining For Identifying Topics at the Document Level
    Reza, Marifa Farzin
    Matin, Rizwana
    2013 INTERNATIONAL CONFERENCE ON INFORMATICS, ELECTRONICS & VISION (ICIEV), 2013,
  • [38] Relational data mining and ILP for document image understanding
    Ceci, Michelangelo
    Berardi, Margherita
    Malerba, Donato
    APPLIED ARTIFICIAL INTELLIGENCE, 2007, 21 (4-5) : 317 - 342
  • [39] Driving Style Analysis Using Data Mining Techniques
    Constantinescu, Z.
    Marinoiu, C.
    Vladoiu, M.
    INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, 2010, 5 (05) : 654 - 663
  • [40] Handling missing values for mining gradual patterns from NoSQL graph databases
    Shah, Faaiz
    Castelltort, Arnaud
    Laurent, Anne
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, 111 : 523 - 538