GENCODE 2025: reference gene annotation for human and mouse

Cited by: 3
Authors
Mudge, Jonathan M. [1]
Carbonell-Sala, Silvia [2]
Diekhans, Mark [3]
Martinez, Jose Gonzalez [1]
Hunt, Toby [1]
Jungreis, Irwin [4, 5]
Loveland, Jane E. [1]
Arnan, Carme [2]
Barnes, If [1]
Bennett, Ruth [1]
Berry, Andrew [1]
Bignell, Alexandra [1]
Cerdan-Velez, Daniel [6]
Cochran, Kelly [7]
Cortes, Lucas T. [1]
Davidson, Claire [1]
Donaldson, Sarah [1]
Dursun, Cagatay [8, 9]
Fatima, Reham [1]
Hardy, Matthew [1]
Hebbar, Prajna [3]
Hollis, Zoe [1]
James, Benjamin T. [4, 5]
Jiang, Yunzhe [8, 9]
Johnson, Rory [10, 11]
Kaur, Gazaldeep [2]
Kay, Mike [1]
Mangan, Riley J. [4, 5, 12]
Maquedano, Miguel [6]
Martinez Gomez, Laura [6]
Mathlouthi, Nourhen [1]
Merritt, Ryan [1]
Ni, Pengyu [8, 9]
Palumbo, Emilio [2]
Perteghella, Tamara [2, 13]
Pozo, Fernando [6]
Raj, Shriya [1]
Sisu, Cristina [9, 14]
Steed, Emily [1]
Sumathipala, Dulika [1]
Suner, Marie-Marthe [1]
Uszczynska-Ratajczak, Barbara [15]
Wass, Elizabeth [1]
Yang, Yucheng T. [9, 16]
Zhang, Dingyao [8, 9]
Finn, Robert D. [1]
Gerstein, Mark [8, 9]
Guigo, Roderic [2, 13]
Hubbard, Tim J. P. [17, 18]
Kellis, Manolis [4, 5]
Affiliations
[1] European Mol Biol Lab, European Bioinformat Inst, Wellcome Genome Campus, Cambridge CB10 1SD, England
[2] Barcelona Inst Sci & Technol, Ctr Genom Regulat CRG, Dr Aiguader 88, Barcelona 08003, Catalonia, Spain
[3] Univ Calif Santa Cruz, Genom Inst, 2300 Delaware Ave, Santa Cruz, CA 95060 USA
[4] MIT, Comp Sci & Artificial Intelligence Lab, 32 Vassar St, Cambridge, MA 02139 USA
[5] Broad Inst MIT & Harvard, 415 Main St, Cambridge, MA 02142 USA
[6] Spanish Natl Canc Res Ctr CNIO, Bioinformat Unit, Calle Melchor Fernandez Almagro 3, Madrid 28029, Spain
[7] Stanford Univ, Dept Comp Sci, 353 Jane Stanford Way, Stanford, CA 94305 USA
[8] Yale Univ, Program Computat Biol & Bioinformat, New Haven, CT 06520 USA
[9] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT 06520 USA
[10] Bern Univ Hosp, Dept Med Oncol, Murtenstr 35, CH-3008 Bern, Switzerland
[11] Univ Coll Dublin, Sch Biol & Environm Sci, Dublin D04 V1W8 4, Ireland
[12] Harvard Med Sch, Genet Training Program, Boston, MA 02115 USA
[13] Univ Pompeu Fabra, Dept Ciencies Expt & Salut, Carrer Merce 12, Barcelona 08002, Spain
[14] Brunel Univ London, Dept Life Sci, Kingston Lane, London UB8 3PH, England
[15] Polish Acad Sci, Inst Bioorgan Chem, Dept Computat Biol Noncoding RNA, Noskowskiego 12-14, PL-61704 Poznan, Poland
[16] Fudan Univ, Inst Sci & Technol Brain Inspired Intelligence, 220 Handan Rd, Shanghai 200433, Peoples R China
[17] Kings Coll London, Guys Hosp, Dept Med & Mol Genet, London SE1 9RT, England
[18] ELIXIR Hub, Wellcome Genome Campus, Cambridge CB10 1SD, England
[19] Stanford Univ, Dept Genet, Stanford, CA 94305 USA
Funding
US National Institutes of Health (NIH); Wellcome Trust (UK);
Keywords
SEQUENCE;
DOI
10.1093/nar/gkae1078
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
GENCODE produces comprehensive reference gene annotation for human and mouse. Entering its twentieth year, the project remains highly active as new technologies and methodologies allow us to catalog the genome at ever-increasing granularity. In particular, long-read transcriptome sequencing enables us to identify large numbers of missing transcripts and to substantially improve existing models, and our long non-coding RNA catalogs have undergone a dramatic expansion and reconfiguration as a result. Meanwhile, we are incorporating data from state-of-the-art proteomics and Ribo-seq experiments to fine-tune our annotation of translated sequences, while further insights into function can be gained from multi-genome alignments that grow richer as more species' genomes are sequenced. Such methodologies are combined into a fully integrated annotation workflow. However, the increasing complexity of our resources can present usability challenges, and we are resolving these with the creation of filtered genesets such as MANE Select and GENCODE Primary. The next challenge is to propagate annotations throughout multiple human and mouse genomes, as we enter the pangenome era. Our resources are freely available at our web portal www.gencodegenes.org, and via the Ensembl and UCSC genome browsers.
Pages: D966-D975
Number of pages: 10