PgmNr P2061: Detailed structure and variation of complex satellite DNA loci in Drosophila melanogaster.

Authors:
D. E. Khost; D. Eickbush; A. M. Larracuente


Institutes:
University of Rochester, Rochester, NY.


Abstract:

Large blocks of tandemly repeated heterochromatic sequences called satellite DNAs (satDNAs) can make up a substantial fraction of most eukaryotic genomes. SatDNAs evolve rapidly and may contribute to genetic incompatibilities between closely related species. Within species, variation in heterochromatic sequences, including satDNAs, is associated with variation in fitness, genome-wide gene expression, and male fertility. Despite their prevalence, satDNAs remain poorly understood and understudied, largely because of the technical difficulty of sequencing and assembling such large, highly repetitive regions of the genome. Advances in Pacific Biosciences (PacBio) single-molecule real-time (SMRT) sequencing help overcome the limitations of traditional sequencing methods for some repeat-rich sequences. We use PacBio sequencing to reveal the detailed structure and organization of two complex satDNA loci in Drosophila melanogaster: a 120 bp repeat called Responder (Rsp) and a 260 bp repeat in the 1.688 g/cm³ satellite family (260-bp). We report the optimal assembly strategies for regions rich in complex tandem repeats and the complete assemblies of Rsp and 260-bp, as supported by computational and molecular validation. Both Rsp and 260-bp show high levels of repeat homogenization within their arrays, particularly over the center, indicating that they are undergoing concerted evolution. Sequence variants of the repeats are non-randomly distributed, tending to be located near the distal and proximal ends of their arrays. The Rsp locus has several additional levels of organization: two islands of transposable elements occur approximately 100 kb apart, near the proximal and distal boundaries of the Rsp locus, in inverted orientation relative to each other. The Rsp sequences in their vicinity have identical partners on opposite sides of the array.
This structure suggests that several recent, complex duplication/inversion events, recent gene conversion, or both have helped shape the Rsp locus. We also use Illumina sequence reads to investigate polymorphism in the organization and abundance of these satDNAs in population samples of D. melanogaster from across the globe. We show that the size and composition of the Rsp and 260-bp loci vary across populations, and we infer that the arrays expand and contract through unequal crossing over at the array center. We find approximately tenfold and fourfold variation in locus size for Rsp and 260-bp, respectively. Overall, the unprecedented level of detail in our assemblies allows us to begin to answer fundamental questions about the evolution and dynamics of repetitive regions.