PgmNr D1397: Annotation and transcription start site discovery on the dot chromosome of Drosophila ficusphila and Drosophila biarmipes.

Authors:
R. E. Boody; C. M. Brown; S. A. Liber; J. L. Sanford


Institutes
Ohio Northern University, Ada, OH.


Keyword: genome evolution

Abstract:

The Muller F element, or chromosome 4 in Drosophila, is commonly referred to as the dot chromosome. This chromosome is unique among Drosophila autosomes in that it is largely heterochromatic and contains a high density of repetitive sequence. Remarkably, comparative genomic analyses show that the dot chromosome contains a proportion of active genes similar to that of euchromatic control regions. The Genomics Education Partnership (GEP) at Washington University in St. Louis aims to elucidate the evolutionary mechanisms by which this unusual gene expression occurs. Undergraduate students across the country work with the GEP to manually annotate genes, develop gene models, and search for transcription start sites in newly sequenced Drosophila genomes. The present work focuses on the annotation of contigs 32, 34, 36, 37, and 42 of the Drosophila ficusphila dot chromosome assembly and the discovery of transcription start sites (TSS) in contig 62 of the Drosophila biarmipes dot chromosome assembly. Gene annotation relies on standard bioinformatic tools, including NCBI BLAST, in silico gene prediction programs, and a mirror of the UCSC Genome Browser, which are used to manually curate the gene models. These models include the start and stop coordinates for all exons of every gene present on the contig. TSS discovery depends on similar bioinformatic tools, using the UCSC Genome Browser in conjunction with Celniker data to determine the presence, location, and type of the ortholog in D. melanogaster. Ultimately, these data are used to determine the presence, location, and type of promoter in the target species. Annotation of contig 32 in the D. ficusphila dot assembly revealed six genes: Eph, Gat, mav, Ekar, CG11155, and Slip1. The genes gw, CG11360, and myo were found on contig 34. Contig 36 contains the genes ey, toy, and bt, the last of which continues onto contig 37. The only other gene on contig 37 is MED26. Lastly, annotation of contig 42 showed the presence of two genes, unc and mGluR. TSS discovery in contig 62 of the D. biarmipes dot assembly included the elucidation of the transcription start site of the sv gene. Future work will extend the TSS analysis by searching for common promoter motifs to determine potential differences in motif distribution along the dot chromosome in comparison to the euchromatic control regions. Analysis of gene structure and TSS annotation in the dot chromosome and 3L control regions across multiple Drosophila species, including the genes in this study, will help highlight the mechanisms driving gene expression in the highly heterochromatic dot chromosome.