PgmNr P376: Large scale splicing QTL analysis of cancer genomes.

Authors:
Kjong-Van Lehmann 1 ; Andre Kahles 1 ; Cyriac Kandoth 1 ; William Lee 1 ; Nikolaus Schultz 1 ; Oliver Stegle 2 ; Gunnar Rätsch 1


Institutes
1) Memorial Sloan-Kettering Cancer Center; 1275 York Avenue; New York City, NY 10065; 2) European Bioinformatics Institute; Hinxton; Cambridge; CB10 1SD; United Kingdom.


Abstract:

The large-scale effort of The Cancer Genome Atlas (TCGA) network to molecularly characterize thousands of tumors has created new opportunities for quantitative trait analysis at unprecedented sample sizes. To facilitate joint analysis across TCGA projects, we have re-aligned and re-analyzed RNA and whole sequencing data of ~4,000 individuals comprising 11 cancer types. RNA-seq data were processed with SplAdder to quantify gene expression and splicing changes, reflecting cancer-specific and tissue-specific splicing variability. We observe a threefold increase in splicing events compared to the GENCODE annotation and estimate an increase of ~20% in splicing complexity in tumor samples. To account not only for population structure, the most common confounding factor in QTL analysis, but also for somatic mutations, recurrence patterns and sample heterogeneity, we employ a mixed model that captures tumor-specific genotypic and phenotypic heterogeneity. The large sample size in TCGA allows us not only to find local splicing QTLs but also to detect large-effect, long-range changes affecting the function of splicing factors. We observe splicing mutations in U2AF1 and SF3B1 caused by somatic alterations. We identify various somatic and germline mutations that induce splicing alterations in many genes; insight into their effects may contribute to a better understanding of cancer development, progression and treatment.
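
The mixed-model idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model or code: it is a generic single-variant linear mixed model on simulated data, in which a random effect with covariance proportional to a kinship matrix K stands in for population structure. All names (`fit_lmm`, `run_demo`), the simulated genotypes, the effect size of 1.0, and the grid search over the variance ratio are assumptions chosen for illustration. The abstract's additional terms for somatic mutations, recurrence patterns and sample heterogeneity are not modeled here.

```python
import math
import numpy as np


def lmm_loglik(delta, Uy, UX, s):
    """ML log-likelihood of y ~ N(X b, sg2 * (K + delta * I)),
    evaluated in the basis of K's eigenvectors (illustrative helper)."""
    n = Uy.shape[0]
    w = 1.0 / (s + delta)                    # per-sample weights after rotation
    XtWX = UX.T @ (UX * w[:, None])
    XtWy = UX.T @ (w * Uy)
    beta = np.linalg.solve(XtWX, XtWy)       # weighted least-squares fixed effects
    resid = Uy - UX @ beta
    sg2 = np.sum(w * resid ** 2) / n         # ML estimate of the variance scale
    ll = -0.5 * (n * np.log(2 * np.pi * sg2) + np.sum(np.log(s + delta)) + n)
    return ll, beta


def fit_lmm(y, X, K, deltas=np.logspace(-3, 3, 61)):
    """Grid search over delta = noise variance / genetic variance.
    Returns (log-likelihood, fixed effects, delta) at the best grid point."""
    s, U = np.linalg.eigh(K)
    s = np.clip(s, 0.0, None)                # guard against tiny negative eigenvalues
    Uy, UX = U.T @ y, U.T @ X
    fits = [lmm_loglik(d, Uy, UX, s) + (d,) for d in deltas]
    return max(fits, key=lambda t: t[0])


def standardize(G):
    return (G - G.mean(axis=0)) / G.std(axis=0)


def run_demo(seed=0, n=300, m=500, effect=1.0):
    """Simulate a splicing-like phenotype and test one variant (toy data)."""
    rng = np.random.default_rng(seed)
    G = standardize(rng.binomial(2, 0.3, size=(n, m)).astype(float))
    K = G @ G.T / m                          # kinship from background variants
    g = standardize(rng.binomial(2, 0.3, size=(n, 1)).astype(float))[:, 0]
    u = G @ rng.normal(size=m) * math.sqrt(0.5 / m)   # structured background effect
    y = effect * g + u + rng.normal(scale=math.sqrt(0.5), size=n)

    ones = np.ones((n, 1))
    ll0, _, _ = fit_lmm(y, ones, K)                          # null: intercept only
    ll1, beta, delta = fit_lmm(y, np.hstack([ones, g[:, None]]), K)
    lrt = 2.0 * (ll1 - ll0)                                  # likelihood-ratio statistic
    pvalue = math.erfc(math.sqrt(max(lrt, 0.0) / 2.0))       # chi-square, 1 dof
    return {"lrt": lrt, "beta_snp": beta[1], "delta": delta, "pvalue": pvalue}


if __name__ == "__main__":
    print(run_demo())
```

Rotating the data by the eigenvectors of K makes the covariance diagonal, so each candidate variance ratio only requires a weighted least-squares fit. In a genome-wide scan the eigendecomposition would be computed once and reused for every variant and phenotype.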