Annotate or re-annotate genome/assembly using RASTtk (Rapid Annotations using Subsystems Technology toolkit).
This KBase annotation App (Annotate Genome/Assembly) uses components from the RAST (Rapid Annotations using Subsystems Technology) toolkit [1,2,3] to annotate a prokaryotic genome or to update the annotations of a genome.
The release versions of the RASTtk component services used in this app are:
- kb_seed: tag 20200922
- kmer_annotation_figfam: tag 20200922
- genome_annotation: tag 20200922
The Annotate Genome/Assembly App takes a KBase Genome or Assembly object as input and allows users to annotate or re-annotate the assembly/genome. This makes the annotations consistent with other KBase assemblies/genomes and prepares them for further analysis by other KBase Apps, especially the Metabolic Modeling Apps.
The Results
- The Objects section has a table of all the RAST-annotated genome objects that were created by this App. Click on the name of a data object to open a data viewer cell (below the currently selected cell).
- The Reports section has a report on the annotated genome object covering functional roles, gene counts per function, and subsystem information, if any.
- The Summary section gives details about the coding and noncoding features that were created and the average protein length.
- The Links section gives a link to the report that is presented in the Reports section.
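As a rough illustration of the kind of figures the Summary section reports (coding and noncoding feature counts and average protein length), here is a minimal Python sketch. The feature records, their field names (`type`, `protein`), and the `summarize` helper are hypothetical simplifications invented for this example; they are not the KBase Genome object schema or the App's actual code.

```python
# Hypothetical, simplified feature records (NOT the real KBase Genome schema).
features = [
    {"id": "gene_1", "type": "CDS", "protein": "MKTAYIAKQR"},
    {"id": "gene_2", "type": "CDS", "protein": "MSLNNE"},
    {"id": "rna_1", "type": "tRNA", "protein": None},
]


def summarize(features):
    """Count coding/noncoding features and compute the average protein length."""
    # Assumption for this sketch: a feature is "coding" if its type is CDS.
    coding = [f for f in features if f["type"] == "CDS"]
    noncoding = [f for f in features if f["type"] != "CDS"]
    # Only coding features with a translated protein contribute to the average.
    lengths = [len(f["protein"]) for f in coding if f["protein"]]
    avg = sum(lengths) / len(lengths) if lengths else 0.0
    return {
        "coding": len(coding),
        "noncoding": len(noncoding),
        "avg_protein_length": avg,
    }


summary = summarize(features)
print(summary)
```

With the three sample records above, the sketch counts two coding features and one noncoding feature; the actual report produced by the App may define and group features differently.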
GUI Output
The GUI output currently consists of three tabs. The "Overview" tab provides basic information on the annotation job, the "Browse Features" tab allows the user to scroll through the features that were called, and the "Browse Contigs" tab provides information on the contigs in the genome. Users can sort on the various types of features. Note that some features will overlap (e.g., "prophage" and "CDS").
Additional Information
For more information on the steps of the default RASTtk pipeline, please refer to our publication on the pipeline (forthcoming). For more detailed tutorial information, and to explore additional RASTtk functionality not currently available in the Narrative interface, please refer to http://tutorial.theseed.org.
Team members who developed and deployed the algorithm in KBase: Thomas Brettin, James Davis, Terry Disz, Robert Edwards, Chris Henry, Gary Olsen, Robert Olson, Ross Overbeek, Bruce Parrello, Gordon Pusch, Roman Sutormin, and Fangfang Xia. For questions, please contact us.
The authors of RAST request that if you use the results of this annotation in your work, please cite the first three listed publications:
Related Publications
- [1] Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75 , https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-9-75
- [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206-D214. doi:10.1093/nar/gkt1226 , https://academic.oup.com/nar/article/42/D1/D206/1062536
- [3] Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep. 2015;5. doi:10.1038/srep08365 , https://www.nature.com/articles/srep08365
- [4] Kent WJ. BLAT: The BLAST-Like Alignment Tool. Genome Res. 2002;12: 656-664. doi:10.1101/gr.229202 , https://genome.cshlp.org/content/12/4/656
- [5] Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389-3402. doi:10.1093/nar/25.17.3389
- [6] Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25: 955-964. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC146525/
- [7] Cobucci-Ponzano B, Rossi M, Moracci M. Translational recoding in archaea. Extremophiles. 2012;16: 793-803. doi:10.1007/s00792-012-0482-8 , https://www.ncbi.nlm.nih.gov/pubmed/23015064
- [8] Meyer F, Overbeek R, Rodriguez A. FIGfams: yet another set of protein families. Nucleic Acids Res. 2009;37: 6643-6654. doi:10.1093/nar/gkp698 , https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2777423/
- [9] van Belkum A, Sluijter M, de Groot R, Verbrugh H, Hermans PW. Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1996;34: 1176-1179. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC228977/
- [10] Croucher NJ, Vernikos GS, Parkhill J, Bentley SD. Identification, variation and transcription of pneumococcal repeat sequences. BMC Genomics. 2011;12: 120. doi:10.1186/1471-2164-12-120 , https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-12-120
- [11] Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. doi:10.1186/1471-2105-11-119 , https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-11-119
- [12] Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007;23: 673-679. doi:10.1093/bioinformatics/btm009 , https://academic.oup.com/bioinformatics/article/23/6/673/419055
- [13] Akhter S, Aziz RK, Edwards RA. PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies. Nucleic Acids Res. 2012;40: e126. doi:10.1093/nar/gks406 , https://academic.oup.com/nar/article/40/16/e126/1027055
App Specification:
https://github.com/kbaseapps/RAST_SDK/tree/7171090d87fccc8b7ecf1a1d02398995dcc2dd45/ui/narrative/methods/annotate_genome_assembly
Module Commit: 7171090d87fccc8b7ecf1a1d02398995dcc2dd45