NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy

Cited by: 851
Authors
Pruitt, Kim D. [1]
Tatusova, Tatiana [1]
Brown, Garth R. [1]
Maglott, Donna R. [1]
Affiliations
[1] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20894 USA
Funding
National Institutes of Health (USA);
Keywords
RESOURCES; DATABASE;
DOI
10.1093/nar/gkr1079
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology];
Discipline classification code
071010; 081704;
Abstract
The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database is a collection of genomic, transcript and protein sequence records. These records are selected and curated from public sequence archives and represent a significant reduction in redundancy compared to the volume of data archived by the International Nucleotide Sequence Database Collaboration. The database includes over 16 000 organisms, 2.4 × 10^6 genomic records, 13 × 10^6 proteins and 2 × 10^6 RNA records spanning prokaryotes, eukaryotes and viruses (RefSeq release 49, September 2011). The RefSeq database is maintained by a combined approach of automated analyses, collaboration and manual curation to generate an up-to-date representation of the sequence, its features, names and cross-links to related sources of information. We report here on recent growth, the status of curating the human RefSeq data set, more extensive feature annotation and current policy for eukaryotic genome annotation via the NCBI annotation pipeline. More information about the resource is available online (see http://www.ncbi.nlm.nih.gov/RefSeq/).
Pages: D130-D135
Page count: 6