The Dfam database of repetitive DNA families

Cited by: 441
Authors
Hubley, Robert [1]
Finn, Robert D. [2]
Clements, Jody [3]
Eddy, Sean R. [4]
Jones, Thomas A. [4]
Bao, Weidong [5]
Smit, Arian F. A. [1]
Wheeler, Travis J. [6]
Affiliations
[1] Inst Syst Biol, Seattle, WA 98109 USA
[2] European Bioinformat Inst EMBL EBI, European Mol Biol Lab, Wellcome Trust Genome Campus, Cambridge CB10 1RQ, England
[3] HHMI Janelia Res Campus, Ashburn, VA 20147 USA
[4] Harvard Univ, Howard Hughes Med Inst, Cambridge, MA 02138 USA
[5] Genet Informat Res Inst, Los Altos, CA 94022 USA
[6] Univ Montana, Missoula, MT 59812 USA
Funding
National Institutes of Health (USA);
Keywords
DE-NOVO IDENTIFICATION; INTERSPERSED REPEATS; ELEMENTS; ORGANIZATION; MATRICES; REPBASE; SEARCH; MOUSE; SINES;
DOI
10.1093/nar/gkv1272
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010 ; 081704 ;
Abstract
Repetitive DNA, especially that due to transposable elements (TEs), makes up a large fraction of many genomes. Dfam is an open access database of families of repetitive DNA elements, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). The initial release of Dfam, featured in the 2013 NAR Database Issue, contained 1143 families of repetitive elements found in humans, and was used to produce more than 100 Mb of additional annotation of TE-derived regions in the human genome, with improved speed. Here, we describe recent advances, most notably expansion to 4150 total families including a comprehensive set of known repeat families from four new organisms (mouse, zebrafish, fly and nematode). We describe improvements to coverage, and to our methods for identifying and reducing false annotation. We also describe updates to the website interface. The Dfam website has moved to http://dfam.org. Seed alignments, profile HMMs, hit lists and other underlying data are available for download.
Pages: D81-D89
Page count: 9