PgmNr P2028: Comprehensive genome-wide disease characterization (URSA(HD)) and tissue-specific networks (GIANT) guide discovery and functional elucidation of novel predicted disease-associated genes.

Authors:
Chandra Theesfeld 1 ; Young-suk Lee 1


Institutes:
1) Princeton University, Princeton, NJ; 2) Simons Foundation, New York, NY; 3) University of Pennsylvania, Philadelphia, PA; 4) Geisel School of Medicine at Dartmouth, Hanover, NH; 5) University of California, San Francisco, San Francisco, CA; 6) Icahn School of Medicine at Mount Sinai, New York, NY; 7) Brigham and Women's Hospital and Harvard Medical School, Boston, MA.


Abstract:

Complex diseases are driven by multiple genetic changes and characterized by genome-wide perturbations of cellular pathways and functions. Gene expression profiling experiments comparing normal to disease samples, while useful in uncovering the molecular pathology of diseases, are limited in that they cannot discern similarities between related diseases. Discovery of truly disease-specific attributes requires a comprehensive approach comparing many diseases and many normal samples. We have developed URSA(HD), a unified probabilistic framework to identify and quantify distinctive disease signals for 309 human diseases based on gene expression profiles of clinical samples. URSA(HD) can uncover subtle differences between similar diseases and highlight identifiable aspects of rare diseases. URSA(HD) outperforms approaches that use individual disease genes or typical normal/disease differential expression analyses. Researchers can also use URSA(HD) to make sample predictions (both cancerous and non-cancerous) for a given clinical gene expression profile (ursa.princeton.edu).

The biological models produced by URSA(HD) constitute feature sets specific to each disease: different from those for all other diseases (including similar diseases), and different from all normal tissue. In the biological model for neuroblastoma, a pediatric cancer, 16 of the top 20 genes are documented causal or biomarker genes, and the remaining four genes are uncharacterized. We tested the relevance of these four genes and found clear growth or cell migration phenotypes for three of them in one or two different human neuroblastoma cell lines. Since literature is limited or non-existent for these four genes, further characterization will require consideration of their potential roles in cell and tissue physiology. URSA(HD) models retain only the tissue signatures that are relevant to the disease over and above the normal tissue signal. To support genome-wide functional studies and provide access to inclusive tissue information and predictions, we have developed human genome-wide tissue-specific networks (GIANT.princeton.edu). With these networks, known and predicted tissue-specific roles played by proteins, including widely expressed proteins, are accessible. Associations within these networks are directly usable by scientists for hypothesis generation and testing.