A method for similarity-based grouping of biological data

Authors
Jakoniene, Vaida [1]
Rundqvist, David [1]
Lambrix, Patrick [1]
Affiliations
[1] Linkoping Univ, Dept Comp & Informat Sci, SE-58183 Linkoping, Sweden
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Similarity-based grouping of data entries in one or more data sources underlies many data management tasks, such as structuring search results, removing redundancy in databases, and data integration. In the context of life science data sources, similarity-based grouping is not a trivial task, as the stored data is complex, highly correlated, and represented at different levels of granularity. The contribution of this paper is twofold: (1) we propose a method for similarity-based grouping, and (2) we show results from test cases. The main steps of the method are the specification of grouping rules, pairwise grouping between entries, the actual grouping of similar entries, and the evaluation and analysis of the results. Often, different strategies can be used in the different steps. The method enables exploration of the influence of these choices and supports evaluation of the results with respect to given classifications. The grouping method is illustrated by test cases based on different strategies and classifications. The results show the complexity of similarity-based grouping tasks and give deeper insights into the selected grouping tasks, the analyzed data source, and the influence of different strategies on the results.
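The four steps named in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the Jaccard similarity and the 0.5 threshold used as the grouping rule, connected components as the grouping strategy, pairwise precision and recall as the evaluation against a given classification, and the toy entries `e1` to `e4` with their annotation terms.

```python
from itertools import combinations

def jaccard(a, b):
    """Hypothetical grouping rule: Jaccard similarity of two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def pairwise_grouping(entries, rule, threshold):
    """Step 2: list the pairs of entry ids whose similarity reaches the threshold."""
    return [(i, j) for i, j in combinations(sorted(entries), 2)
            if rule(entries[i], entries[j]) >= threshold]

def group_entries(ids, pairs):
    """Step 3: merge similar pairs into groups (connected components, one possible strategy)."""
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in pairs:
        parent[find(i)] = find(j)
    groups = {}
    for i in ids:
        groups.setdefault(find(i), set()).add(i)
    return sorted(sorted(g) for g in groups.values())

def pair_precision_recall(groups, classification):
    """Step 4: compare same-group pairs with same-class pairs of a given classification."""
    predicted = {p for g in groups for p in combinations(g, 2)}
    expected = {(i, j) for i, j in combinations(sorted(classification), 2)
                if classification[i] == classification[j]}
    true_pos = len(predicted & expected)
    precision = true_pos / len(predicted) if predicted else 1.0
    recall = true_pos / len(expected) if expected else 1.0
    return precision, recall

# Step 1: specify the grouping rule (similarity function and threshold are assumptions).
entries = {
    "e1": {"kinase", "atp"},
    "e2": {"kinase", "atp", "binding"},
    "e3": {"transport"},
    "e4": {"transport", "membrane"},
}
classification = {"e1": "A", "e2": "A", "e3": "B", "e4": "A"}

pairs = pairwise_grouping(entries, jaccard, threshold=0.5)
groups = group_entries(entries, pairs)
precision, recall = pair_precision_recall(groups, classification)
print(groups, precision, recall)
```

Swapping in a different rule, threshold, or grouping strategy (for example, cliques instead of connected components) and rerunning the evaluation is one way to explore how such choices influence the result, which is the kind of comparison the abstract describes.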
Pages: 136-151
Page count: 16