Whole Genome Annotation
We have developed two pipelines that provide automated structural and functional annotation of prokaryotic genomes. Both pipelines use Prodigal for gene finding, then search the predicted protein sequences with multiple tools, and finally assign annotations according to an evidence hierarchy, so that the highest-ranking evidence available provides the annotation for each protein.
- The IGS Prokaryotic Annotation Pipeline loads search results and annotations into a relational database for access with the annotation visualization and curation tool Manatee. Manatee provides an interface for viewing annotation results and associated evidence, adding/modifying annotations, and downloading annotation information in a variety of formats.
  - To run this pipeline independently, use the CloVR Microbe resource.
  - To get analysis results via fee-for-service, contact us for more information.
- The Genomic Annotation Logic and Execution system (GALES) is a more streamlined and customizable pipeline that produces results in flat files in a variety of formats.
  - To run this pipeline independently via Docker, visit the GALES GitHub page.
  - To get analysis results via fee-for-service, contact us for more information.
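The evidence-hierarchy step shared by both pipelines can be illustrated with a minimal Python sketch. It is not the pipelines' actual code: the evidence source names, their ranking, and the `Evidence` record are hypothetical placeholders chosen only to show how the highest-ranking evidence for each protein wins.

```python
from dataclasses import dataclass

# Hypothetical ranking (assumption): lower number = stronger evidence.
# The real pipelines define their own evidence sources and ordering.
EVIDENCE_RANK = {
    "curated_hmm": 0,
    "characterized_blast": 1,
    "domain_hmm": 2,
    "hypothetical": 3,
}


@dataclass(frozen=True)
class Evidence:
    protein_id: str  # identifier of the predicted protein
    source: str      # key into EVIDENCE_RANK
    product: str     # product name proposed by this evidence


def assign_annotations(evidence):
    """Return {protein_id: product}, keeping the highest-ranking evidence per protein."""
    best = {}
    for ev in evidence:
        current = best.get(ev.protein_id)
        if current is None or EVIDENCE_RANK[ev.source] < EVIDENCE_RANK[current.source]:
            best[ev.protein_id] = ev
    return {pid: ev.product for pid, ev in best.items()}


hits = [
    Evidence("prot_1", "domain_hmm", "ABC transporter domain protein"),
    Evidence("prot_1", "curated_hmm", "ABC transporter, ATP-binding protein"),
    Evidence("prot_2", "hypothetical", "hypothetical protein"),
]
annotations = assign_annotations(hits)
```

Here `prot_1` receives the product name from the `curated_hmm` hit because that source outranks `domain_hmm` in the assumed hierarchy, while `prot_2` falls back to its only available evidence.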
Comparative Genomics
The Analysis Engine offers two methods for comparative analysis of prokaryotic genomes. Both methods use the tool Sybil to visualize results.
- Jaccard clustering of filtered bidirectional best BLAST protein matches to build ortholog clusters. This method can be carried out on fairly diverse genomes.
- Mugsy whole genome alignment. This method requires that the genomes in the analysis be fairly closely related to each other.
- To run this pipeline independently, use the CloVR Comparative resource.
- This option is only offered via fee-for-service. Contact us for more information.
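As a rough illustration of the first method, the sketch below groups proteins into clusters from bidirectional best hits, keeping a link only when the two proteins' neighbor sets have a Jaccard coefficient at or above a cutoff. This is a simplified stand-in, not the pipeline's implementation: the input layout (`best_hits`), the 0.6 cutoff, and the function names are all assumptions, and the matches are assumed to have been filtered already.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard coefficient of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def ortholog_clusters(best_hits, cutoff=0.6):
    """Cluster proteins linked by bidirectional best hits.

    best_hits maps each protein to the set of its best hits in other genomes
    (hypothetical input layout). A link p-q is kept only if q is a best hit
    of p, p is a best hit of q, and their neighbor sets are similar enough.
    """
    # Neighbor set of each protein: itself plus its bidirectional best hits.
    neighbors = {p: {p} for p in best_hits}
    for p, hits in best_hits.items():
        for q in hits:
            if p in best_hits.get(q, ()):
                neighbors[p].add(q)

    # Union-find over proteins to collect connected components.
    parent = {p: p for p in best_hits}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for p in best_hits:
        for q in neighbors[p]:
            if q != p and jaccard(neighbors[p], neighbors[q]) >= cutoff:
                parent[find(p)] = find(q)

    clusters = defaultdict(set)
    for p in best_hits:
        clusters[find(p)].add(p)
    return sorted(sorted(members) for members in clusters.values())


# Toy input: proteins from three genomes (a*, b*, c*).
best_hits = {
    "a1": {"b1", "c1"},
    "b1": {"a1", "c1"},
    "c1": {"a1", "b1"},
    "a2": {"b2"},
    "b2": {"a2"},
    "c2": {"a1"},  # one-directional hit only, so c2 stays a singleton
}
clusters = ortholog_clusters(best_hits)
```

In the toy input, `c2` points at `a1` but is not `a1`'s best hit in return, so it is left in its own cluster rather than joined to the `a1`/`b1`/`c1` group.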
Transcriptome Analysis
The prokaryotic transcriptome analysis pipeline at IGS is a comprehensive resource that performs several of the most common transcriptomic analysis tasks. The first step in the analysis is alignment of RNA-Seq reads to a reference genome, which produces BAM files. Using the BAM file for each sample, the RPKM (reads per kilobase per million mapped reads) value for each gene is calculated based on the gene annotation of the reference genome. To identify genes with increased or decreased expression in one set of samples versus another, differential expression analysis is performed using DESeq and edgeR on the read counts generated by HTSeq. The results of the sample comparisons are filtered, and the output is stored in tab-delimited text files as well as Excel spreadsheets. Read more.
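The RPKM normalization mentioned above can be written out as a short Python sketch. It uses the standard definition (reads, divided by gene length in kilobases, divided by millions of mapped reads). The function name, the dictionary inputs, and the example numbers are illustrative assumptions, not part of the IGS pipeline.

```python
def rpkm(read_counts, gene_lengths_bp, total_mapped_reads):
    """Compute RPKM per gene.

    RPKM = reads * 1e9 / (gene length in bp * total mapped reads),
    which is equivalent to reads / (length in kb) / (mapped reads in millions).
    """
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    values = {}
    for gene, reads in read_counts.items():
        length = gene_lengths_bp[gene]
        if length <= 0:
            raise ValueError(f"gene {gene!r} has non-positive length")
        values[gene] = reads * 1e9 / (length * total_mapped_reads)
    return values


# Hypothetical sample: two genes, 2 million mapped reads in total.
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 3000}
sample_rpkm = rpkm(counts, lengths, total_mapped_reads=2_000_000)
```

In this example `geneB` has three times the reads of `geneA` but is also three times as long, so both genes end up with the same RPKM, which is the length correction the measure is designed to apply.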
Genome Assembly
To run this independently, use the CloVR Microbe resource. Assembly is included when you use our fee-for-service sequencing center to generate genome sequence. Contact us for more information.
Custom Analysis
If you have analysis needs beyond what is outlined above, please check out our services page and contact us to discuss the needs of your project. We can provide custom analyses (for both prokaryotes and eukaryotes) on a fee-for-service basis.