Scalable embedding of multiple perspectives for indefinite life-science data analysis

Cited by: 1
Authors
Munch, Maximilian [1]
Heilig, Simon [2]
Vath, Philipp [2]
Schleif, Frank-Michael [2]
Affiliations
[1] Univ Groningen, Bernoulli Inst Math Comp Sci & Artificial Intelli, Groningen, Netherlands
[2] Univ Appl Sci Wurzburg Schweinfurt, Dept Comp Sci & Business Informat Syst, Wurzburg, Germany
Keywords
Indefinite learning; complex-valued embedding; life science data; multi-perspective embedding; multimodal data
DOI
10.1109/SSCI50451.2021.9659914
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Life science data analysis frequently encounters challenges that cannot be solved with classical techniques from data analytics or machine learning. The complex inherent structure of the data, and especially its encoding in non-standard forms such as genome or protein sequences, graph structures, or histograms, often limit the development of appropriate classification models. To address these limitations, domain-specific expert similarity measures have gained considerable attention. However, such expert measures suffer from two major drawbacks: (a) no single similarity measure guarantees success in all application scenarios, and (b) such similarity functions often lead to indefinite data that cannot be processed by classical machine learning methods. To tackle both limitations, this paper presents a method that embeds indefinite life science data under several similarity measures simultaneously into a complex-valued vector space. We test our approach on various life science data sets and evaluate its performance against other competitive methods to show its efficiency.
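The general idea of a complex-valued embedding for indefinite similarities can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the paper: it uses a full eigendecomposition (which is not scalable), two small hand-made similarity matrices, and a hypothetical helper name `complex_embedding`. Combining the perspectives by simple concatenation is also an assumption made only for illustration.

```python
import numpy as np


def complex_embedding(S):
    """Embed a symmetric, possibly indefinite similarity matrix S.

    Hypothetical helper: returns a complex matrix X such that
    X @ X.T (plain transpose, no conjugation) reproduces S.
    """
    S = 0.5 * (S + S.T)                      # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(S)     # real eigenvalues, orthonormal vectors
    # Negative eigenvalues get purely imaginary square roots.
    roots = np.sqrt(eigvals.astype(complex))
    return eigvecs * roots                   # scale column j by roots[j]


# Two toy "perspectives" on the same three objects (made-up values).
S1 = np.array([[1.0, 2.0, 0.0],
               [2.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]])             # eigenvalues 3, 1, -1: indefinite
S2 = np.array([[0.0, 1.0, 1.0],
               [1.0, 0.0, 1.0],
               [1.0, 1.0, 0.0]])             # eigenvalues 2, -1, -1: indefinite

X1 = complex_embedding(S1)
X2 = complex_embedding(S2)

# Assumed combination rule: concatenate the per-measure embeddings.
# The bilinear product of the joint vectors then equals S1 + S2.
X = np.hstack([X1, X2])
```

In this sketch each object becomes one complex vector (a row of `X`), so the negative part of the spectrum is kept rather than clipped or flipped. A scalable variant would need a low-rank approximation in place of the full eigendecomposition; which approximation the paper uses is not stated in the abstract.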
Pages: 8