Similarity-based data reduction techniques

Cited by: 0

Authors:

Guo, G [1]

Wang, H

Bell, D

Affiliations:

[1] Univ Ulster, Sch Comp & Math, Coleraine BT37 0QB, Londonderry, North Ireland

[2] Univ Bradford, Dept Comp, Bradford BD7 1DP, W Yorkshire, England

[3] Queens Univ Belfast, Sch Comp Sci, Belfast BT7 1NN, Antrim, North Ireland

Source:

JOURNAL OF RESEARCH AND PRACTICE IN INFORMATION TECHNOLOGY | 2005, Vol. 37, No. 2

Keywords:

DOI:

Not available

Chinese Library Classification:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

The k-nearest neighbours (kNN) method is a simple but effective approach to classification. Its major drawbacks are (1) low efficiency and (2) dependence on the selection of a "good value" for k. In this paper, we propose a novel similarity-based data reduction method (SBModel), together with three variants, aimed at overcoming these shortcomings. Our method constructs a similarity-based model of the data, which replaces the data as the basis of classification. The value of k is determined automatically, varies with the local data distribution, and is optimal in terms of classification accuracy. Constructing the model significantly reduces the amount of data needed for classification, thus making classification faster. Experiments on public data sets show that SBModel and its variants compare well with C5.0, kNN, wkNN, and other data reduction methods in both efficiency and effectiveness.
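To make the two drawbacks concrete, here is a minimal sketch of plain kNN classification. It is not SBModel, whose construction the abstract does not spell out. The function name `knn_predict` and the toy data are assumptions made for illustration only.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs. Every call scans the whole
    list, which is the efficiency drawback the abstract refers to.
    """
    # Order all training points by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    # Majority vote among the labels of the k closest points.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two points of class "a", three of class "b".
train = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.0), "a"),
    ((1.0, 1.0), "b"),
    ((1.1, 1.0), "b"),
    ((0.9, 1.1), "b"),
]

query = (0.3, 0.3)
for k in (1, 3, 5):
    print(k, knn_predict(train, query, k))
```

On this toy data the predicted label for the same query changes from "a" (k = 1 or 3) to "b" (k = 5), which illustrates why the choice of k matters. A data reduction method in the spirit of the abstract would replace `train` with a smaller model before classification.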

Pages: 211-232

Page count: 22