Annotate domains in every Genome within a GenomeSet using protein domains from widely used domain libraries.
This App uses the same process as the Annotate Domains in a Genome App, but does so on all Genomes within a GenomeSet or a SpeciesTree.
As with that App, each input Genome must already have annotated protein-encoding genes (e.g., those identified using the Annotate Microbial Genome or Annotate Microbial Assembly Apps).
Each Genome is annotated with domains from all available domain libraries (for a complete list, see the documentation linked above). This may take several hours per genome, depending on the genome size.
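The per-genome behavior described above can be pictured with a small sketch. This is not the App's actual code: the `Genome` class, `annotate_genome_set`, and `fake_annotator` are hypothetical names used only for illustration, and skipping genomes that lack protein-encoding genes is an assumption about how the stated requirement might be enforced.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Genome:
    """Hypothetical stand-in for a Genome object (not the KBase data type)."""
    name: str
    protein_genes: List[str] = field(default_factory=list)


def annotate_genome_set(
    genomes: List[Genome],
    annotate_one: Callable[[Genome], Dict[str, List[str]]],
) -> Tuple[Dict[str, Dict[str, List[str]]], List[str]]:
    """Apply a single-genome annotator to every genome in a set.

    Genomes with no protein-encoding genes are skipped (an assumption made
    for this sketch) and reported separately.
    """
    results: Dict[str, Dict[str, List[str]]] = {}
    skipped: List[str] = []
    for genome in genomes:
        if not genome.protein_genes:
            skipped.append(genome.name)
            continue
        results[genome.name] = annotate_one(genome)
    return results, skipped


def fake_annotator(genome: Genome) -> Dict[str, List[str]]:
    """Placeholder annotator: maps each gene to an empty list of domain hits."""
    return {gene: [] for gene in genome.protein_genes}


genome_set = [Genome("genome_A", ["geneA1", "geneA2"]), Genome("genome_B")]
annotated, skipped = annotate_genome_set(genome_set, fake_annotator)
```

Here `genome_B` has no annotated genes, so it ends up in `skipped` while `genome_A` receives one entry per gene. Because each genome is processed independently, total run time scales with the number of genomes in the set.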
Team members who developed & deployed this App in KBase: Dylan Chivian. Annotate Domains in a Genome was developed by John-Marc Chandonia, Roman Sutormin, and Pavel Novichkov. For questions, please contact us.
Related Publications
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389–3402. doi:10.1093/nar/25.17.3389 , https://academic.oup.com/nar/article/25/17/3389/1061651
- Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10: 421. doi:10.1186/1471-2105-10-421 , https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-10-421
- Eddy SR. Accelerated Profile HMM Searches. PLOS Computational Biology. 2011;7: e1002195. doi:10.1371/journal.pcbi.1002195 , https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002195
- Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44: D279–D285. doi:10.1093/nar/gkv1344 , https://academic.oup.com/nar/article/44/D1/D279/2503120
- Haft DH, Selengut JD, Richter RA, Harkins D, Basu MK, Beck E. TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res. 2013;41: D387–D395. doi:10.1093/nar/gks1234 , https://academic.oup.com/nar/article/41/D1/D387/1070451
- Letunic I, Bork P. 20 years of the SMART protein domain annotation resource. Nucleic Acids Res. 2018;46: D493–D496. doi:10.1093/nar/gkx922 , https://academic.oup.com/nar/article/46/D1/D493/4429069
- Letunic I, Doerks T, Bork P. SMART: recent updates, new developments and status in 2015. Nucleic Acids Res. 2015;43: D257–260. doi:10.1093/nar/gku949 , https://academic.oup.com/nar/article/43/D1/D257/2439521
- Marchler-Bauer A, Bo Y, Han L, He J, Lanczycki CJ, Lu S, et al. CDD/SPARCLE: functional classification of proteins via subfamily domain architectures. Nucleic Acids Res. 2017;45: D200–D203. doi:10.1093/nar/gkw1129 , https://academic.oup.com/nar/article/45/D1/D200/2605748
- Selengut JD, Haft DH, Davidsen T, Ganapathy A, Gwinn-Giglio M, Nelson WC, et al. TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes. Nucleic Acids Res. 2007;35: D260–264. doi:10.1093/nar/gkl1043 , https://academic.oup.com/nar/article/35/suppl_1/D260/1088023
- Tatusov RL, Koonin EV, Lipman DJ. A Genomic Perspective on Protein Families. Science. 1997;278: 631–637. doi:10.1126/science.278.5338.631 , https://www.ncbi.nlm.nih.gov/pubmed/9381173
App Specification:
https://github.com/kbaseapps/kb_phylogenomics/tree/aed8564fcf4c6e8a3e94f6546715496e6fffbd84/ui/narrative/methods/run_DomainAnnotation_Sets
Module Commit: aed8564fcf4c6e8a3e94f6546715496e6fffbd84