PgmNr Z6088: NCBI’s Zebrafish Genome Resources

Authors:
N. A. O'Leary; T. D. Murphy; K. D. Pruitt


Institutes
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894.


Abstract:

Genomic research on model organisms such as zebrafish (Danio rerio) increasingly relies on high-quality annotated reference genomes to support consistent reporting and mapping of genetic data. The Reference Sequence (RefSeq) project (https://www.ncbi.nlm.nih.gov/refseq/) at the National Center for Biotechnology Information (NCBI) provides a comprehensive annotation of the zebrafish Tuebingen strain reference genome assembly (GRCz10) maintained by the Genome Reference Consortium (GRC). The zebrafish RefSeq dataset is generated by a combination of computational analysis and manual curation, and the resulting annotation focuses on representing all full-length, non-redundant transcripts. The primary sources of data used in this annotation pipeline include mRNAs, expressed sequence tags (ESTs), protein data, RNA-seq data, and protein homology. Zebrafish is one of a select group of vertebrates that are the major focus of RefSeq’s manual curation efforts, which involve in-depth review of sequence data to define new transcript variants, resolve sequence errors, and remove inaccurate information. We also collaborate with expert groups, including the Zebrafish Information Network (ZFIN) and UniProtKB, to provide appropriate annotation and nomenclature for both genes and proteins. In addition to zebrafish, NCBI provides stable reference genome annotation for other fish species with high-quality genome assembly data submitted to NCBI’s Assembly resource (https://www.ncbi.nlm.nih.gov/assembly/). To date, 26 other fish species have RefSeq-annotated genomes, providing a valuable resource for comparative genomic research. In this poster presentation, we will provide an overview of NCBI’s zebrafish genome resources and highlight their utility to the zebrafish research community. We will also provide practical guidance on how to access RefSeq data and tools for the analysis of individual genes as well as whole-genome datasets.