A dictionary-based approach to fast and accurate name matching in large law enforcement databases

Authors
Kursun, Olcay [1]
Koufakou, Anna
Chen, Bing
Georgiopoulos, Michael
Reynolds, Kenneth M.
Eaglin, Ron
Affiliations
[1] Univ Cent Florida, Dept Engn Technol, Orlando, FL 32816 USA
[2] Univ Cent Florida, Sch Elect Engn & Comp Sci, Orlando, FL 32816 USA
[3] Univ Cent Florida, Dept Criminal Justice & Legal Studies, Orlando, FL 32816 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the presence of dirty data, a standard query (e.g., a search for a name that is misspelled or mistyped) does not return all of the needed information. This is an issue of grave importance in homeland security, criminology, medical applications, GIS (geographic information systems), and other fields. Different techniques, such as Soundex, Phonix, n-grams, and edit distance, have been used to improve the matching rate in these name-matching applications. There is a pressing need for name-matching approaches that provide high accuracy while keeping computational complexity reasonably low. In this paper, we present ANSWER, a name-matching approach that utilizes a prefix tree of the names available in the database. Creating and searching the name dictionary tree is fast and accurate; thus, ANSWER is superior to other techniques for retrieving fuzzy name matches in large databases.
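The abstract does not spell out how ANSWER searches its prefix tree, so the following is only a generic sketch of the underlying idea, not the authors' algorithm: names are stored in a trie, and a query is matched by walking the trie while maintaining one row of the Levenshtein edit-distance table per node, pruning any branch whose row can no longer stay within the allowed distance. The names `TrieNode`, `build_trie`, `fuzzy_search`, and the `max_dist` parameter are assumptions introduced for illustration.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """One node of a prefix tree (hypothetical helper, not from the paper)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.name: Optional[str] = None  # set only where a stored name ends


def build_trie(names: List[str]) -> TrieNode:
    """Insert every name (upper-cased) into a prefix tree and return its root."""
    root = TrieNode()
    for name in names:
        node = root
        for ch in name.upper():
            node = node.children.setdefault(ch, TrieNode())
        node.name = name.upper()
    return root


def fuzzy_search(root: TrieNode, query: str, max_dist: int) -> List[Tuple[str, int]]:
    """Return (name, distance) pairs with Levenshtein distance <= max_dist."""
    query = query.upper()
    # Row for the empty prefix: distance from "" to query[:i] is i.
    first_row = list(range(len(query) + 1))
    matches: List[Tuple[str, int]] = []

    def walk(node: TrieNode, ch: str, prev_row: List[int]) -> None:
        # Extend the edit-distance table by one row for the prefix ending in ch.
        row = [prev_row[0] + 1]
        for i in range(1, len(query) + 1):
            cost = 0 if query[i - 1] == ch else 1
            row.append(min(row[i - 1] + 1,          # insertion
                           prev_row[i] + 1,         # deletion
                           prev_row[i - 1] + cost)) # substitution or match
        if node.name is not None and row[-1] <= max_dist:
            matches.append((node.name, row[-1]))
        # Prune: if every cell exceeds max_dist, no descendant can match.
        if min(row) <= max_dist:
            for next_ch, child in node.children.items():
                walk(child, next_ch, row)

    for ch, child in root.children.items():
        walk(child, ch, first_row)
    return sorted(matches, key=lambda m: (m[1], m[0]))
```

Because names that share a prefix share the corresponding table rows, the cost of a lookup depends on how much of the trie survives pruning rather than on a full scan of the name list. Plain Levenshtein distance counts a transposition such as "JONSE" for "JONES" as two edits, so a small `max_dist` may need to be raised to catch such typos.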
Pages: 72-82
Number of pages: 11