Building extensible frameworks for data processing: The case of MDP, Modular toolkit for Data Processing

Cited by: 2
Authors
Wilbert, Niko [1, 2]
Zito, Tiziano [2, 3]
Schuppner, Rike-Benjamin [2]
Jedrzejewski-Szmek, Zbigniew [4]
Wiskott, Laurenz [1, 2, 5]
Berkes, Pietro [6]
Affiliations
[1] Humboldt Univ, Inst Theoret Biol, Berlin, Germany
[2] Bernstein Ctr Computat Neurosci, Berlin, Germany
[3] Berlin Inst Technol, Berlin, Germany
[4] Univ Warsaw, Inst Expt Phys, PL-00325 Warsaw, Poland
[5] Ruhr Univ Bochum, Inst Neuroinformat, Bochum, Germany
[6] Brandeis Univ, Natl Volen Ctr Complex Syst, Waltham, MA USA
Keywords
Machine learning; Python; Scientific computing; Computational neuroscience; Recognition
DOI
10.1016/j.jocs.2011.10.005
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Data processing is a ubiquitous task in scientific research, and much energy is spent on the development of appropriate algorithms. It is thus relatively easy to find software implementations of the most common methods. On the other hand, when building concrete applications, developers are often confronted with several additional chores that need to be carried out besides the individual processing steps. These include, for example, training and executing a sequence of several algorithms, writing code that can be executed in parallel on several processors, or producing a visual description of the application. The Modular toolkit for Data Processing (MDP) is an open source Python library that provides an implementation of several widespread algorithms and offers a unified framework to combine them to build more complex data processing architectures. In this paper we concentrate on some of the newer features of MDP, focusing on the choices made to automate repetitive tasks for users and developers. In particular, we describe the support for parallel computing and how this is implemented via a flexible extension mechanism. We also briefly discuss the support for algorithms that require bi-directional data flow. © 2011 Elsevier B.V. All rights reserved.
Pages: 345-351
Number of pages: 7