Building extensible frameworks for data processing: The case of MDP, Modular toolkit for Data Processing

Cited by: 2
Authors
Wilbert, Niko [1, 2]
Zito, Tiziano [2, 3]
Schuppner, Rike-Benjamin [2]
Jedrzejewski-Szmek, Zbigniew [4]
Wiskott, Laurenz [1, 2, 5]
Berkes, Pietro [6]
Affiliations
[1] Humboldt Univ, Inst Theoret Biol, Frankfurt, Germany
[2] Bernstein Ctr Computat Neurosci, Berlin, Germany
[3] Berlin Inst Technol, Berlin, Germany
[4] Univ Warsaw, Inst Expt Phys, PL-00325 Warsaw, Poland
[5] Ruhr Univ Bochum, Inst Neuroinformat, Bochum, Germany
[6] Brandeis Univ, Natl Volen Ctr Complex Syst, Waltham, MA USA
Keywords
Machine learning; Python; Scientific computing; Computational neuroscience; RECOGNITION;
DOI
10.1016/j.jocs.2011.10.005
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Data processing is a ubiquitous task in scientific research, and much energy is spent on the development of appropriate algorithms. It is thus relatively easy to find software implementations of the most common methods. On the other hand, when building concrete applications, developers are often confronted with several additional chores that need to be carried out besides the individual processing steps. These include, for example, training and executing a sequence of several algorithms, writing code that can be executed in parallel on several processors, or producing a visual description of the application. The Modular toolkit for Data Processing (MDP) is an open-source Python library that provides an implementation of several widespread algorithms and offers a unified framework to combine them to build more complex data processing architectures. In this paper we concentrate on some of the newer features of MDP, focusing on the choices made to automate repetitive tasks for users and developers. In particular, we describe the support for parallel computing and how this is implemented via a flexible extension mechanism. We also briefly discuss the support for algorithms that require bi-directional data flow. (C) 2011 Elsevier B.V. All rights reserved.
Pages: 345-351
Page count: 7