PgmNr E8025: Research based learning in bioinformatics using yeast experimental evolution.

Authors:
Laurie Stevison 1 ; Molly Burke 2


Institutes:
1) Auburn University; 2) Oregon State University.


Abstract:

I recently taught an Introductory Bioinformatics course for advanced undergraduate and graduate students using the text ‘Practical Computing for Biologists’. Because this was a new course taught in an active learning classroom, I designed it to follow a research study from start to finish, simulating the real-world challenges of data analysis and dissemination. I searched NCBI for a public yeast dataset, chosen for the organism’s small genome size, and found an evolve-and-resequence experiment based on multiple genome sequences. Throughout the semester I corresponded with the first author, Dr. Burke, who provided advice and access to datasets. Early in the semester, each student was assigned an accession of a yeast genome. Then, in concert with in-class instruction on genomics and the GATK best practices workflow, the students analyzed datasets within research teams to understand the challenges of data analysis. I split the analysis into steps to scaffold the work and help the students hone their bioinformatics skills. In class, they learned several bioinformatics tools and were required to use them to assess data quality and submit group reports. The reports reinforced their understanding of the tools and allowed them to see how data quality improved at each step. While they were required to examine the data in multiple ways, they were asked to report only the findings relevant to their conclusions. After the data analysis was complete, I transitioned the class to statistical analysis. For this, I combined their samples into a larger class dataset. Each group performed a different statistical analysis from the original paper. Earlier in the course, the students were introduced to GitHub, which has built-in educational assessment tools. The class is still ongoing; for their statistical analysis, the students will write a ‘readme’ file and upload the associated scripts and graphs to a data repository, which teaches them that research should be repeatable. Dr. Burke will also be available to answer any questions the students have for the author. Then, they will read the manuscript and compare and contrast their findings with those of the original study. Since the class analyzed only 5 of 12 replicate populations, this comparison is meant to reinforce how much effort goes into a single scientific paper. After a few lectures on effective science graphics and communication, they will present their work in an auditorium open to the biology department. The final step will be peer assessment, in which they evaluate each other’s presentations and GitHub pages. While the course had no prerequisites besides genetics, a statistics prerequisite was highly recommended. Future iterations may require some command-line experience before enrollment, as proficiency in this area is needed to handle the required workload.