PgmNr D1403: A high frequency of transposable element tandems is a potential source of new satellite arrays.

Authors:
M. P. McGurk; D. A. Barbash


Institutes:
Cornell University, Ithaca, NY.


Keyword: genome evolution

Abstract:

Eukaryotic genomes are replete with repeated sequences, either dispersed across the genome by transposition events (transposable elements, TEs) or arranged in large tandemly arrayed blocks (satellites). Satellite arrays commonly comprise the centromeres and telomeres, forming structures essential to cell division and replication. They evolve by expansions and contractions that are probably largely neutral, and they have a high turnover rate, which must arise through processes of rapid gain and loss. While it is not fully clear how new complex satellite arrays arise, some known satellites clearly originated from TEs, suggesting that TEs can provide source material for the emergence of new arrays.

With this in mind, we sought to leverage the abundance of available Next-Generation Sequencing (NGS) datasets to assess the extent to which transposable elements form tandem arrays, and how these tandems vary across populations of Drosophila. To circumvent the difficulties of mapping repeat-derived reads, we employ an alignment strategy that maps paired-end reads to the consensus sequences of known repeats and implement an Expectation-Maximization algorithm to discover tandem structures in the aligned data.
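As a rough illustration of why aligning to a consensus sequence can expose tandems, the sketch below classifies a read pair mapped to a single TE consensus. It is not the Expectation-Maximization method described above; it is a simple threshold heuristic, and the names (ReadPair, classify_pair), the coordinate convention, and the insert-size bounds are all assumptions made for this example. The idea is that a pair spanning a head-to-tail junction has its forward read near the 3' end of the consensus and its reverse mate near the 5' end, so the pair only has a plausible insert size if the consensus is treated as repeating.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    """Hypothetical summary of one read pair aligned to a TE consensus."""
    fwd_start: int  # 0-based start of the forward-strand read on the consensus
    rev_end: int    # 0-based exclusive end of the reverse-strand mate


def classify_pair(pair: ReadPair, consensus_len: int,
                  min_insert: int, max_insert: int) -> str:
    """Label a pair as 'internal', 'tandem_junction', or 'other'.

    The insert-size bounds are assumed inputs; in practice they would be
    estimated from the library's fragment-size distribution.
    """
    # Insert size if both reads come from within a single element copy.
    linear = pair.rev_end - pair.fwd_start
    if min_insert <= linear <= max_insert:
        return "internal"

    # Insert size if the fragment runs off the 3' end of one copy and
    # continues into the 5' end of an adjacent head-to-tail copy.
    wrapped = (consensus_len - pair.fwd_start) + pair.rev_end
    if linear <= 0 and min_insert <= wrapped <= max_insert:
        return "tandem_junction"

    # Everything else (e.g. pairs implying deletions or other structures).
    return "other"


# Illustrative values only: a 3000 bp consensus, 200-500 bp inserts.
internal_label = classify_pair(ReadPair(1000, 1350), 3000, 200, 500)
junction_label = classify_pair(ReadPair(2900, 250), 3000, 200, 500)
other_label = classify_pair(ReadPair(100, 2000), 3000, 200, 500)
```

A heuristic like this treats every pair independently and with hard cutoffs; a probabilistic approach such as the EM algorithm mentioned above would instead weigh each pair against competing structures.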

We applied this method to the hundreds of available D. melanogaster NGS datasets. With the exception of a few previously known examples, we rarely detect large tandem arrays of TEs in D. melanogaster; however, we find that small tandems are surprisingly common. One rare exception is a single line in which ~20 copies of the Hobo transposon are arrayed in tandem. Further, the P-element forms small tandem arrays (<8 copies in tandem) in over 50% of lines; because the P-element invaded D. melanogaster in the last century, these tandems must be recently formed structures, suggesting that tandem transposable elements can form and expand into larger arrays over short time scales. These observations suggest that some TE families readily generate the small tandems that are necessary substrates for the expansion of a larger array. These emerging tandems have the potential to modify chromatin state, concentrate regulatory elements at particular loci, and regulate the expression of other TEs. Going forward, the identification of strains with novel and low-frequency tandems gives us the opportunity to assess the phenotypic impacts of tandem TEs and to ask questions about the evolution of young satellite arrays.