PgmNr M5048: Integration of heterogeneous cross-species functional genomics data in GeneWeaver.org.

Authors:
J. A. Bubier 1; G. Sutphin 1; M. A. Langston 2; E. J. Baker 3; E. J. Chesler 1


Institutes:
1) The Jackson Laboratory, Bar Harbor, ME; 2) University of Tennessee, Knoxville, TN; 3) Baylor University, Waco, TX.


Abstract:

The use of model organisms to understand mechanisms of phenotypic diversity, development, aging, health and disease has been highly productive but faces challenges due to the unique characteristics of each model organism. The widespread application of whole-genome functional studies and the diversity of sequenced genomes have created the critical mass of data necessary for efficient large-scale, cross-species data integration. The conservation of underlying pathways across species enables us to identify generalized mechanisms of disease. GeneWeaver.org is a database and suite of tools that allows users to integrate, query and analyze heterogeneous data from 10 supported species for a variety of research applications. User-submitted gene sets, from individual or bulk uploads, are integrated and analyzed against a database containing gene sets corresponding to Gene Ontology, Mammalian Phenotype Ontology, Comparative Toxicogenomics Database Chemicals, OMIM and MeSH annotations, as well as pathway-based gene sets from resources such as Pathway Commons, the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the Molecular Signatures Database (MSigDB). The available tools use statistical methods and graph-based algorithms to perform set-set matching operations on gene sets and network edges, and the results are presented as dynamic graphical output. As an example, we applied GeneWeaver to 73 gene sets derived from diverse experiment types related to aging across six species (yeast, worm, fly, rat, mouse and human). The most highly connected gene among the cross-species gene sets was the tetraspanin transmembrane protein family member tsp-7 (Cd63 in mice). Experimental validation of tsp-7 in C. elegans, using RNAi delivered by bacterial feeding, demonstrated a 10.2% extension of mean lifespan compared to empty vector (p=0.009, n=627).
This example illustrates how aggregating experimental evidence from a variety of data types (differential RNA expression, QTL, proteomic, etc.) enables the discovery of novel genes common to conserved processes.
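
The set-set matching and "most highly connected gene" ideas described in the abstract can be illustrated with a minimal sketch. It is not GeneWeaver's implementation: the choice of Jaccard similarity as the set-matching statistic, the function names, and the toy data are all assumptions made for illustration. Genes are represented by hypothetical homology-cluster labels so that orthologs from different species share one identifier, and a gene's connectivity is taken to be the number of gene sets that contain it.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: |A & B| / |A | B| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(gene_sets):
    """Jaccard similarity for every unordered pair of named gene sets."""
    return {
        (x, y): jaccard(gene_sets[x], gene_sets[y])
        for x, y in combinations(sorted(gene_sets), 2)
    }


def gene_degrees(gene_sets):
    """Count, for each gene, how many gene sets contain it."""
    return Counter(g for members in gene_sets.values() for g in members)


# Toy data, invented for illustration only (not from the study).
# "H1".."H5" are hypothetical homology-cluster labels.
toy_sets = {
    "worm_expression": {"H1", "H2", "H3"},
    "mouse_qtl": {"H2", "H3", "H4"},
    "human_proteomic": {"H3", "H5"},
}

similarities = pairwise_jaccard(toy_sets)
degrees = gene_degrees(toy_sets)
top_gene, top_degree = degrees.most_common(1)[0]

print(similarities[("mouse_qtl", "worm_expression")])  # 0.5
print(top_gene, top_degree)  # H3 3
```

In this toy example the cluster present in all three sets ranks first, which mirrors in miniature how a gene shared across many cross-species gene sets would surface as a candidate for follow-up.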