Computational Identification of piRNAs Using Features Based on RNA Sequence, Structure, Thermodynamic and Physicochemical Properties

Cited by: 19
Authors
Monga, Isha [1]
Banerjee, Indranil [1]
Affiliations
[1] Indian Inst Sci Educ & Res Mohali IISER Mohali, Cellular Virol Lab, Dept Biol Sci, Sect 81, Sas Nagar 140306, Mohali, India
Keywords
piRNA; classification; algorithm; prediction; non-coding RNA; physicochemical; PIWI-INTERACTING RNAS; MESSENGER-RNAS; BIOGENESIS; SIRNAS; PREDICTION; PROTEINS
DOI
10.2174/1389202920666191129112705
Chinese Library Classification (CLC) codes
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Rationale: PIWI-interacting RNAs (piRNAs) are a recently discovered class of small non-coding RNAs (ncRNAs) with a length of 21-35 nucleotides. They play a role in gene expression regulation, transposon silencing, and inhibition of viral infection. Once considered the "dark matter" of ncRNAs, piRNAs have emerged as important players in multiple cellular functions in different organisms. However, our knowledge of piRNAs is still very limited, as many piRNAs have not yet been identified owing to the lack of robust computational prediction tools.
Methods: To identify novel piRNAs, we developed piRNAPred, an integrated framework for piRNA prediction that employs hybrid features such as k-mer nucleotide composition, secondary structure, and thermodynamic and physicochemical properties. A non-redundant dataset (D-3349 or D1684p+1665n) comprising 1684 experimentally verified piRNAs and 1665 non-piRNA sequences was obtained from piRBase and NONCODE, respectively. Various sequence- and structure-based features were computed for these sequences in binary format and used to train different machine learning techniques, of which the support vector machine (SVM) performed best.
Results: In ten-fold cross-validation (10-CV), piRNAPred achieved an overall accuracy of 98.60% with a Matthews correlation coefficient (MCC) of 0.97 and a receiver operating characteristic (ROC) value of 0.99. Furthermore, we reduced the dimensionality of the feature space using an attribute selected classifier.
Conclusion: We obtained the highest performance in predicting piRNAs compared with current state-of-the-art piRNA predictors. piRNAPred should help to expand the piRNA repertoire and provide new insights into piRNA functions.
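As a rough illustration of two ideas named in the abstract, the sketch below computes a normalized k-mer nucleotide composition vector for an RNA sequence and derives accuracy and the Matthews correlation coefficient from confusion-matrix counts. It is not the piRNAPred implementation: the function names (`kmer_composition`, `accuracy`, `mcc`), the ACGU alphabet, the T-to-U conversion, and the frequency normalization are all assumptions made for this example, and the abstract does not specify how the authors encoded their features.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGU"  # assumed RNA alphabet; DNA-style T is mapped to U below


def kmer_composition(seq, k=2):
    """Return normalized k-mer frequencies in fixed lexicographic order.

    Hypothetical helper for illustration; windows containing characters
    outside ALPHABET (for example N) are skipped.
    """
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    # Divide by the number of counted windows so the vector sums to 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified sequences."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

A vector from `kmer_composition(sequence, k=3)` has 64 entries and could be concatenated with structural or thermodynamic descriptors before being passed to a classifier such as an SVM; that combination step is left out here because the abstract gives no details on it.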
Pages: 508-518
Number of pages: 11