Building extensible frameworks for data processing: The case of MDP, Modular toolkit for Data Processing

Cited by: 2
Authors
Wilbert, Niko [1, 2]
Zito, Tiziano [2, 3]
Schuppner, Rike-Benjamin [2]
Jedrzejewski-Szmek, Zbigniew [4]
Wiskott, Laurenz [1, 2, 5]
Berkes, Pietro [6]
Affiliations
[1] Humboldt Univ, Inst Theoret Biol, Frankfurt, Germany
[2] Bernstein Ctr Computat Neurosci, Berlin, Germany
[3] Berlin Inst Technol, Berlin, Germany
[4] Univ Warsaw, Inst Expt Phys, PL-00325 Warsaw, Poland
[5] Ruhr Univ Bochum, Inst Neuroinformat, Bochum, Germany
[6] Brandeis Univ, Natl Volen Ctr Complex Syst, Waltham, MA USA
Keywords
Machine learning; Python; Scientific computing; Computational neuroscience; RECOGNITION
DOI
10.1016/j.jocs.2011.10.005
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Data processing is a ubiquitous task in scientific research, and much energy is spent on the development of appropriate algorithms. It is thus relatively easy to find software implementations of the most common methods. On the other hand, when building concrete applications, developers are often confronted with several additional chores that need to be carried out besides the individual processing steps. These include, for example, training and executing a sequence of several algorithms, writing code that can be executed in parallel on several processors, or producing a visual description of the application. The Modular toolkit for Data Processing (MDP) is an open-source Python library that provides an implementation of several widespread algorithms and offers a unified framework to combine them to build more complex data processing architectures. In this paper we concentrate on some of the newer features of MDP, focusing on the choices made to automate repetitive tasks for users and developers. In particular, we describe the support for parallel computing and how this is implemented via a flexible extension mechanism. We also briefly discuss the support for algorithms that require bi-directional data flow. (C) 2011 Elsevier B.V. All rights reserved.
Pages: 345-351
Page count: 7