Inconsistency-driven approach for human-in-the-loop entity matching

Cited by: 0
Authors
Ito, Hiroyoshi [1]
Koizumi, Takahiro [2]
Yoshimoto, Ryuji [3]
Fukushima, Yukihiro [4]
Harada, Takashi [5]
Morishima, Atsuyuki [1]
Affiliations
[1] Univ Tsukuba, Inst Lib Informat & Media Sci, Tsukuba, Japan
[2] Univ Tsukuba, Grad Sch Comprehens Human Sci, Tsukuba, Japan
[3] CARLIL Inc, Tokyo, Japan
[4] Keio Univ, Fac Pharm, Keio, Japan
[5] Doshisha Univ, Ctr License & Qualificat, Kyoto, Japan
DOI
10.47989/ir30iConf47140
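Chinese Library Classification number
G25 [Library science, librarianship]; G35 [Information science, information work]
Discipline classification code
1205; 120501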
Abstract
Introduction. Entity matching is a fundamental operation in a wide range of information management applications and a tremendous number of methods have been proposed to address the problem. Human-in-the-loop entity matching is a human-AI collaborative approach which is effective when the data for entity matching is incomplete or requires domain knowledge. A typical human-in-the-loop approach is to allow a machine-learning-based matcher to ask humans to match entities when it cannot match them with high confidence. However, ML-based matchers cannot avoid the unknown-unknown problem, i.e., they can resolve the entities incorrectly with high confidence. Method. This paper addresses an inconsistency-based method to deal with this problem. The method asks humans to resolve the entities when we find inconsistency in the transitivity property behind entity matching. For example, if a matcher returns a positive result only for two combinations among three entities, the result is inconsistent. Analysis. This paper shows an implementation of our idea in similarity-based blocking method and Bayesian inference and explains the result of an extensive set of experiments that reveals how and when the method is effective. Results. The result showed that the inconsistency-based sampling selects very different entity pairs compared to other sampling strategies and that a simple hybrid strategy performs well in many practical situations. Conclusion. The results indicate our approach complements any existing matcher that can cause the unknown-unknown problem in entity matching.
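The transitivity check described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `inconsistent_triples`, the `is_match` callback, and the toy entity identifiers are all hypothetical, and the sketch enumerates every triple by brute force rather than using the similarity-based blocking or Bayesian inference mentioned above. It only shows what "a positive result only for two combinations among three entities" looks like in code.

```python
from itertools import combinations


def inconsistent_triples(entities, is_match):
    """Yield (triple, rejected_pair) when exactly two of the three pairs match.

    If a~b and b~c hold, transitivity implies a~c. A triple with exactly two
    positive pairs therefore contains at least one wrong decision. Any of the
    three decisions may be the wrong one; the rejected pair is returned only
    as a convenient handle for the triple.
    """
    for a, b, c in combinations(entities, 3):
        pairs = [(a, b), (b, c), (a, c)]
        positives = [p for p in pairs if is_match(*p)]
        if len(positives) == 2:
            rejected = next(p for p in pairs if p not in positives)
            yield (a, b, c), rejected


# Hypothetical matcher output: a symmetric set of pairs predicted as matches.
predicted = {frozenset(p) for p in [("e1", "e2"), ("e2", "e3"), ("e4", "e5")]}


def is_match(x, y):
    """Look up a pair in the hypothetical matcher output, ignoring order."""
    return frozenset((x, y)) in predicted


violations = list(inconsistent_triples(["e1", "e2", "e3", "e4", "e5"], is_match))
for triple, rejected in violations:
    print("inconsistent triple:", triple, "rejected pair:", rejected)
```

In this toy input only the triple (e1, e2, e3) is flagged, because e1-e2 and e2-e3 are predicted as matches while e1-e3 is not. Such a triple would be a candidate for human review even if the matcher was confident about each individual pair.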
Pages: 1024-1038
Number of pages: 15