Improved scalability in mining using ontology record linkage algorithm

Cited by: 0
Authors
Prabhu, T. [1]
Dhas, C. Suresh Gnana [2]
Affiliations
[1] Manonmaniam Sundaranar Univ, Dept Comp Sci & Engn, Thirunelveli 627012, Tamil Nadu, India
[2] Vivekanadha Coll Engn Women, Dept Comp Sci & Engn, Tiruchengode 637205, Tamil Nadu, India
Keywords
Record linkage; Data mining; Angle based neighborhood; Ontology; Conventional method; INJURIES
DOI
10.1016/j.compeleceng.2018.01.026
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Record linkage plays a broad role in identifying records and matching related datasets. Conventional approaches use a probabilistic method to identify reliable and unique datasets. Probabilistic record linkage exploits the data that are common to an individual record pair. Classical methods link records by testing equality on common fields; as a result, errors associated with record linkage reduce scalability. In this paper, the similarity between individual values of record pairs is improved using an ontology-based semantic similarity model. The semantic similarity between records is tested using an angle-based neighborhood graph. To validate the proposed approach, a conventional record linkage algorithm is compared with the angle-based neighborhood ontology record linkage technique, which achieves improved accuracy and scalability. Overall, the proposed technique identifies semantically similar matches more accurately and scales better than conventional methods. © 2018 Elsevier Ltd. All rights reserved.
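As a rough illustration of the contrast the abstract draws between equality-based field comparison and ontology-based semantic similarity, the following is a minimal sketch. It is not the authors' algorithm: the toy hierarchy `PARENT`, the path-based similarity measure, and all function and field names are assumptions made for this example, and the angle-based neighborhood graph is not modelled because the abstract does not specify it.

```python
# Toy is-a hierarchy (child -> parent). Hypothetical data, not from the paper.
PARENT = {
    "fracture": "injury",
    "sprain": "injury",
    "injury": "condition",
    "influenza": "infection",
    "infection": "condition",
}


def ancestors(term):
    """Return the chain [term, parent, ..., root] in the toy hierarchy."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def semantic_similarity(a, b):
    """Path-based similarity: 1 / (1 + edges between a and b), or 0.0 if unrelated.

    This is a generic stand-in measure, assumed for illustration only.
    """
    chain_a, chain_b = ancestors(a), ancestors(b)
    depth_in_b = {term: i for i, term in enumerate(chain_b)}
    for i, term in enumerate(chain_a):
        if term in depth_in_b:
            # First shared ancestor: i edges up from a, depth_in_b[term] up from b.
            return 1.0 / (1 + i + depth_in_b[term])
    return 0.0


def exact_match_score(r1, r2, fields):
    """Equality-based linkage: fraction of fields whose values are identical."""
    return sum(r1[f] == r2[f] for f in fields) / len(fields)


def ontology_match_score(r1, r2, fields):
    """Semantic linkage: mean per-field similarity in the toy hierarchy."""
    return sum(semantic_similarity(r1[f], r2[f]) for f in fields) / len(fields)


# Two hypothetical records that differ only in a semantically related value.
rec_a = {"name": "a. kumar", "diagnosis": "fracture"}
rec_b = {"name": "a. kumar", "diagnosis": "sprain"}
fields = ["name", "diagnosis"]

exact = exact_match_score(rec_a, rec_b, fields)
semantic = ontology_match_score(rec_a, rec_b, fields)
print(exact, semantic)
```

In this sketch the equality-based score gives no credit for the differing diagnosis field, while the ontology-based score gives partial credit because both values share the parent concept `injury`.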
Pages: 511-519
Page count: 9