Towards data warehousing and mining of protein unfolding simulation data

Cited by: 16
Authors
Berrar D. [1 ,2 ]
Stahl F. [1 ,2 ,3 ]
Silva C. [4 ]
Rodrigues J.R. [4 ]
Brito R.M.M. [4 ]
Dubitzky W. [1 ,2 ]
Affiliations
[1] School of Biomedical Sciences, University of Ulster, Cromore Road, Coleraine BT52 1SA, Northern Ireland
[2] School of Biomedical Sciences, University of Ulster, Cromore Road, Coleraine BT52 1SA, Northern Ireland
[3] Weihenstephan University of Applied Sciences, Freising
[4] Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade de Coimbra, Coimbra
Keywords
Data mining; Data warehousing; Grid; Molecular dynamics simulation; Protein unfolding; Transthyretin
DOI
10.1007/s10877-005-0676-z
Abstract
Objectives. The prediction of protein structure and the precise understanding of protein folding and unfolding processes remain among the greatest challenges in structural biology and bioinformatics. Computer simulations based on molecular dynamics (MD) are at the forefront of the effort to gain a deeper understanding of these complex processes. Currently, these MD simulations are usually on the order of tens of nanoseconds, generate a large amount of conformational data, and are computationally expensive. More and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because of the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data.
Methods. To adequately organize, manage, and analyze the data generated by unfolding simulation studies, we designed a data warehouse system that is embedded in a grid environment to facilitate the seamless sharing of available computer resources and thus enable many groups to share complex molecular dynamics simulations on a more regular basis.
Results. To gain insight into the conformational fluctuations and stability of the monomeric forms of the amyloidogenic protein transthyretin (TTR), molecular dynamics unfolding simulations of the monomer of human TTR have been conducted. Trajectory data and meta-data of the wild-type (WT) protein and the highly amyloidogenic variant L55P-TTR represent the test case for the data warehouse.
Conclusions. Web and grid services, especially pre-defined data mining services that can run on or 'near' the data repository of the data warehouse, are likely to play a pivotal role in the analysis of molecular dynamics unfolding data.
© Springer Science + Business Media, Inc. 2005.
Pages: 307-317
Number of pages: 10