Study of an Improved Text Filter Algorithm Based on Trie Tree

Cited by: 0
Authors
Yang, Wenchuan [1]
Fang, Zeyang
Hui, Lei [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Comp, Beijing 100876, Peoples R China
Keywords
Genetic algorithm; Fitness function; Atypical incident; K-means method; Data Mining;
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Telecom customer complaint texts contain some hidden atypical complaints, which can be divided into several classes. Atypical items have high confidence with their intergroup neighbors but low support in the full complaint set. After filtering out the high-frequency items, the k-means method can be used to cluster the complaint texts. However, the clustering result depends on the random choice of the initial K centers, so the atypical complaint classes are not extracted accurately. This paper proposes a genetic-algorithm-optimized k-means method and designs a fitness function for it. The fitness function helps choose globally optimal K centers for k-means and makes the result as accurate as possible. The improved model is more suitable for small-memory systems and performs better in security and dynamic adaptation. The improved model has good application value.
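
The abstract does not spell out the genetic algorithm or its fitness function, so the following is only a minimal sketch of the general idea: a genetic algorithm searches over candidate sets of K initial centers, and the fittest set seeds ordinary k-means. The fitness `1 / (1 + SSE)`, the elitist selection, the one-point crossover, and every function and parameter name here are assumptions made for illustration, not the authors' design.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def fitness(points, centers):
    # Assumed fitness (not from the paper): higher is better, so invert the error.
    return 1.0 / (1.0 + sse(points, centers))

def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[idx].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def ga_initial_centers(points, k, pop_size=20, generations=30,
                       mutation_rate=0.2, seed=0):
    """Evolve sets of k data points to use as initial k-means centers."""
    rng = random.Random(seed)
    population = [rng.sample(points, k) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda ind: fitness(points, ind), reverse=True)
        survivors = population[: pop_size // 2]      # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, k) if k > 1 else 0
            child = a[:cut] + b[cut:]                # one-point crossover
            if rng.random() < mutation_rate:         # mutation: swap in a random point
                child[rng.randrange(k)] = rng.choice(points)
            children.append(child)
        population = survivors + children
    return max(population, key=lambda ind: fitness(points, ind))

# Toy data standing in for vectorized complaint texts (hypothetical values).
points = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (9.0, 0.1), (9.1, 0.0)]
start = ga_initial_centers(points, k=3)
centers = kmeans(points, start)
```

Because the best half of each generation is carried over unchanged, the best fitness never decreases between generations. That removes most of the dependence on a single random initialization, at the cost of evaluating the fitness function many more times.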
Pages: 594 - 597
Page count: 4
Related papers
50 in total
  • [1] Study for the Double-array Trie Tree Based Algorithm in Word Segmentation
    Yang, Wenchuan
    Fang, Zeyang
    Li, Pengfei
    [J]. INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENVIRONMENTAL ENGINEERING (CSEE 2015), 2015, : 440 - 446
  • [2] An Improved Bayesian TRIE Based Model for SMS Text Normalization
    Sikdar, Abhinava
    Chatterjee, Niladri
    [J]. INTELLIGENT COMPUTING, VOL 2, 2022, 507 : 579 - 593
  • [3] Research of an Improved Algorithm for Chinese Word Segmentation Dictionary Based on Double-Array Trie Tree
    Yang, Wenchuan
    Liu, Jian
    Yu, Miao
    [J]. NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2013, 2013, 400 : 355 - 362
  • [4] The Adaptive Spelling Error Checking Algorithm based on Trie Tree
    Xu, Yongbing
    Wang, Junyi
    [J]. PROCEEDINGS OF THE 2016 2ND INTERNATIONAL CONFERENCE ON ADVANCES IN ENERGY, ENVIRONMENT AND CHEMICAL ENGINEERING (AEECE 2016), 2016, 89 : 299 - 302
  • [5] An Improved Text Retrieval Algorithm Based on Suffix Tree Similarity Measure
    Huang, Cheng-hui
    Yin, Jian
    Han, Dong
    [J]. INFORMATION COMPUTING AND APPLICATIONS, PT 2, 2010, 106 : 150 - +
  • [6] Spatial-text tree clustering algorithm based on improved NSGA-Ⅲ
    Ma W.
    Wang R.
    Wu Y.
    Deng S.
    [J]. Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition), 2020, 48 (05): 86 - 92
  • [7] Research on IP classification algorithm based on multibit-trie tree
    Shang, Fengjun
    [J]. DYNAMICS OF CONTINUOUS DISCRETE AND IMPULSIVE SYSTEMS-SERIES B-APPLICATIONS & ALGORITHMS, 2006, 13 : 748 - 752
  • [8] ITOC: An Improved Trie-Based Algorithm for Online Packet Classification
    Li, Yifei
    Wang, Jinlin
    Chen, Xiao
    Wu, Jinghong
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (18):
  • [9] A Fast Algorithm for Attribute Reduction Based on Trie Tree and Rough Set Theory
    Hu Feng
    Wang Xiao-yan
    Luo Chuan-jiang
    [J]. FIFTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2012): ALGORITHMS, PATTERN RECOGNITION AND BASIC TECHNOLOGIES, 2013, 8784
  • [10] Organization method of large-scale domain dictionary based on improved TRIE tree
    Wang, Jingdong
    Song, Jianlei
    Kan, Haitao
    Meng, Fanqi
    Li, Jia
    [J]. BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2019, 125 : 116 - 117