Date of Award


Document Type

Undergraduate Thesis

Degree Name



Biomedical Sciences

Faculty Mentor

Glen M. Borchert


Due to rapid advances in sequencing technology, it is becoming increasingly easy to assemble unknown genomes from millions of short nucleotide sequencing reads taken from the full genomic sequence (Hernandez, 2008). In this study, we used several computational programs to align reads from unsequenced strains of the bacteria Staphylococcus epidermidis and Staphylococcus hominis into a previously undefined, single contiguous genomic sequence. We used SPAdes to assist with template-free assembly (Bankevich, 2012), BLAST to identify a suitable reference genome from closely related species (Madden, 2013), Bowtie2 to align our reads to the reference genome, SAMtools to sort and organize files (Li, 2009), RGAAT to incorporate variants into our reads and update our final genome (Liu, 2018), and Mauve to rapidly align the reads and provide a visual representation of the final genome (Darling, 2011). In all, we assembled ten bacterial genomes that had never previously been sequenced and assembled. Importantly, we developed and validated a high-throughput computational pipeline capable of quickly assembling full genomes from millions of individual reads. This protocol can be reused as needed to sequence and assemble additional bacterial genomes, providing a genetic basis for studying bacterial characteristics.
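The order of the pipeline steps can be illustrated with a minimal Python sketch. It is not the thesis's actual script. It only builds, without running, the command lines for the SPAdes, Bowtie2, and SAMtools stages. Paired-end input, the file names, the working-directory layout, and the function name `build_pipeline_commands` are all assumptions made for illustration. The BLAST, RGAAT, and Mauve stages are left out because the abstract does not say how they were invoked.

```python
import shlex


def build_pipeline_commands(reads_1, reads_2, reference_fasta, workdir):
    """Return ordered command lines for a hypothetical reference-guided run.

    Nothing is executed here; the caller decides whether to print the
    commands or pass each one to subprocess.run.
    """
    spades_out = f"{workdir}/spades"            # assumed output directory
    index_prefix = f"{workdir}/reference_index"  # assumed Bowtie2 index prefix
    sam = f"{workdir}/aligned.sam"
    bam = f"{workdir}/aligned.sorted.bam"
    return [
        # 1. Template-free (de novo) assembly of the paired-end reads.
        ["spades.py", "-1", reads_1, "-2", reads_2, "-o", spades_out],
        # 2. Build a Bowtie2 index from the chosen reference genome
        #    (the reference itself would come from the BLAST step).
        ["bowtie2-build", reference_fasta, index_prefix],
        # 3. Align the reads to the indexed reference, writing SAM output.
        ["bowtie2", "-x", index_prefix, "-1", reads_1, "-2", reads_2, "-S", sam],
        # 4. Sort the alignments by coordinate and write a BAM file.
        ["samtools", "sort", "-o", bam, sam],
        # 5. Index the sorted BAM for downstream variant-aware tools.
        ["samtools", "index", bam],
    ]


if __name__ == "__main__":
    # Dry run with placeholder file names: print each command.
    for cmd in build_pipeline_commands(
        "sample_R1.fastq", "sample_R2.fastq", "reference.fasta", "work"
    ):
        print(shlex.join(cmd))
```

Keeping each stage as a plain argument list makes a dry run easy to inspect, and the same lists can later be handed to `subprocess.run(cmd, check=True)` one strain at a time.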