Annotate or re-annotate genomes/assemblies using RASTtk (Rapid Annotations using Subsystems Technology toolkit).
This KBase annotation App (Bulk Annotate Genomes/Assemblies) uses components from the RAST (Rapid Annotations using Subsystems Technology) toolkit [1,2,3] to annotate prokaryotic genomes, to update the annotations of genomes, or to perform computations on a set of genomes so that their annotations are consistent.
The release versions of the RASTtk component services used in this app are:
- kb_seed: tag 20200922
- kmer_annotation_figfam: tag 20200922
- genome_annotation: tag 20200922
The Bulk Annotate Genomes/Assemblies App takes genomes and/or assemblies as inputs and allows users to annotate or re-annotate the genomes and/or assemblies. This will make the annotations consistent with other KBase genomes and prepare the genomes for further analysis by other KBase Apps, especially the Metabolic Modeling Apps. A Genome object can be generated by uploading a GenBank file, importing a GenBank file from NCBI via FTP, retrieving a Genome-typed object from KBase, or using the output of the Annotate Microbial Assembly App.
The newly annotated genomes are collected in a GenomeSet object with the user-specified GenomeSet name. Each individual RAST-annotated genome is named after its corresponding input genome/assembly, prefixed with the GenomeSet name.
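The naming convention described above can be illustrated with a minimal sketch. The function name `annotated_genome_names`, the example object names, and the underscore separator are all assumptions made for illustration; the description does not state which separator, if any, the App places between the GenomeSet name and the input name.

```python
def annotated_genome_names(genome_set_name, input_names, sep="_"):
    """Return a mapping of input genome/assembly name -> output genome name.

    Each output name is the input name prefixed with the GenomeSet name.
    The separator `sep` is an assumption; the App may join the two parts
    differently.
    """
    return {name: f"{genome_set_name}{sep}{name}" for name in input_names}


if __name__ == "__main__":
    # Hypothetical GenomeSet name and input object names.
    names = annotated_genome_names(
        "my_genome_set", ["Ecoli_K12.assembly", "Bsubtilis_168"]
    )
    for source, output in names.items():
        print(f"{source} -> {output}")
```

Checking the expected output names this way before launching a bulk run can help avoid collisions with objects that already exist in the Narrative.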
Team members who developed and deployed the algorithm in KBase:
Thomas Brettin, James Davis, Terry Disz, Robert Edwards, Chris Henry, Gary Olsen, Robert Olson, Ross Overbeek, Bruce Parrello, Gordon Pusch, Roman Sutormin, and Fangfang Xia. For questions, please contact us.
The authors of RAST request that, if you use the results of this annotation in your work, you cite the first three publications listed below.
Related Publications
- [1] Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75, https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-9-75
- [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206-D214. doi:10.1093/nar/gkt1226, https://academic.oup.com/nar/article/42/D1/D206/1062536
- [3] Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep. 2015;5: 8365. doi:10.1038/srep08365, https://www.nature.com/articles/srep08365
- [4] Kent WJ. BLAT - The BLAST-Like Alignment Tool. Genome Res. 2002;12: 656-664. doi:10.1101/gr.229202, https://genome.cshlp.org/content/12/4/656
- [5] Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389-3402. doi:10.1093/nar/25.17.3389
- [6] Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25: 955-964. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC146525/
- [7] Cobucci-Ponzano B, Rossi M, Moracci M. Translational recoding in archaea. Extremophiles. 2012;16: 793-803. doi:10.1007/s00792-012-0482-8, https://www.ncbi.nlm.nih.gov/pubmed/23015064
- [8] Meyer F, Overbeek R, Rodriguez A. FIGfams: yet another set of protein families. Nucleic Acids Res. 2009;37: 6643-6654. doi:10.1093/nar/gkp698, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2777423/
- [9] van Belkum A, Sluijter M, de Groot R, Verbrugh H, Hermans PW. Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1996;34: 1176-1179. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC228977/
- [10] Croucher NJ, Vernikos GS, Parkhill J, Bentley SD. Identification, variation and transcription of pneumococcal repeat sequences. BMC Genomics. 2011;12: 120. doi:10.1186/1471-2164-12-120, https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-12-120
- [11] Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. doi:10.1186/1471-2105-11-119, https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-11-119
- [12] Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007;23: 673-679. doi:10.1093/bioinformatics/btm009, https://academic.oup.com/bioinformatics/article/23/6/673/419055
- [13] Akhter S, Aziz RK, Edwards RA. PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies. Nucleic Acids Res. 2012;40: e126. doi:10.1093/nar/gks406, https://academic.oup.com/nar/article/40/16/e126/1027055
App Specification:
https://github.com/kbaseapps/RAST_SDK/tree/7171090d87fccc8b7ecf1a1d02398995dcc2dd45/ui/narrative/methods/bulk_annotate_genomes_assemblies
Module Commit: 7171090d87fccc8b7ecf1a1d02398995dcc2dd45