PgmNr M5052: Functional annotation of proteoforms in the Mouse Genome Database using the Protein Ontology.

Authors:
H. J. Drabkin 1; K. R. Christie 1; C. N. Arighi 2; C. H. Wu 2; J. A. Blake 1


Institutes:
1) The Jackson Laboratory, Bar Harbor, ME; 2) University of Delaware, Newark, DE.


Abstract:

The one gene/one polypeptide concept proposed in the 1940s and 1950s was dispelled in the late 1970s with the discovery of splicing. A single eukaryotic gene can encode multiple protein isoforms through the use of alternative promoters or polyadenylation sites, alternative splicing of the primary transcript to generate different mRNAs, and/or selection of alternative start sites during translation of an mRNA. A protein can further undergo one or more post-translational processing events, including proteolytic cleavage and amino acid modifications. The function or cellular location of these distinct protein entities (proteoforms) can differ substantially.

The Protein Ontology (PRO, http://proconsortium.org) is a resource that supplies unique identifiers for the specific proteoforms resulting from expression of a gene. These forms are organized in an ontological framework that explicitly describes how the entities relate to one another. The ontology currently has over 68,000 isoforms and 2,200 modified proteoforms, which are either imported from high-quality sources or added via literature-based annotation by PRO curators.

Mouse Genome Informatics (MGI, http://www.informatics.jax.org) is the international database resource for the laboratory mouse. It provides integrated genetic, genomic, and biological data to facilitate the study of human health and disease. MGI uses the Gene Ontology (GO, http://www.geneontology.org) for functional annotation of mouse genes. The GO defines concepts used to describe gene product function, location, and participation in biological processes, as well as the relationships between these concepts. At MGI, when a literature-based manual GO annotation applies to a specific proteoform, that proteoform is indicated using its PRO identifier. These annotations are grouped according to the encoding gene and can be displayed at MGI, as well as in the AmiGO database of the Gene Ontology Consortium (http://amigo.geneontology.org/amigo). The annotations are also provided to the PRO website, where they can be viewed in the context of other proteoforms.

Supported by NIH grants HG000330, HG002273, and GM080646.