A System for Unstructured Data Mining using Dynamic Ensemble Selection

Cited by: 0
Authors
Calado, Raquel Bezerra [1]
Rodriguez Torres, Leandro Sigfredo [2]
Maciel, Alexandre M. A. [1]
Affiliations
[1] Univ Pernambuco, Recife, PE, Brazil
[2] Kurier Inteligencia Juridica, Recife, PE, Brazil
Keywords
Unstructured Data; Text Mining; Dynamic Ensemble Selection; Classifier Selection; Competence; Framework
DOI
10.1109/smc42975.2020.9282967
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
Unstructured data represent as much as 90% of all business-relevant information. In Brazil, the practice of printing official journals dates back to the 19th century. Today more than 200 official journals are in circulation, which together accumulate around 1.4 billion publications that follow no textual standard. This work proposes a system for unstructured data mining using Dynamic Ensemble Selection (DES). The system, JudEasy, implements, in addition to classic text pre-processing methods, a set of twelve DES methods and one static method for creating categorized textual models for Brazilian official journals. In the results, the DES-KL model obtained the highest accuracy, 96.81%, with a precision of 0.99.
Pages: 1988-1993
Number of pages: 6