Search for matches to HMMs of MicroTrait environmental bioelement cycling families using HMMER 3
This method scans protein sequences found in Genomes and Annotated Metagenome Assemblies (AMAs) using a set of Hidden Markov Models (HMMs) from the environmental bioelement MicroTrait collection. It uses HMMER software.
This App profiles collections of genes, genomes, and/or annotated metagenome assemblies for MicroTrait functions by searching them with HMMs of MicroTrait Bioelement families, and optionally outputs a FeatureSet for each of the requested gene families. It uses gene-family-derived HMMs from the MicroTrait collection, which include model-specific lower confidence thresholds to improve the accuracy of the functional classification. The user can search with the entire collection, with only the models from a given Bioelement category, or with individually specified gene families. In this last mode, FeatureSet objects are produced that can be used in additional KBase phylogenomic Apps, such as Build Gene Tree. Hits by each gene family to genes in the target set are also shown in the report.
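As a rough illustration of what a single-family search could look like outside KBase, the sketch below only assembles an hmmsearch command line; it does not run anything. It assumes the search is done with hmmsearch (the App's exact invocation is not given here), and the file names are hypothetical. How the App applies the model-specific confidence thresholds is not described in this text, so that step is left out.

```python
import shlex

def build_hmmsearch_cmd(hmm_path, seq_db, tblout_path, evalue=None, cpu=1):
    """Return an argv list for one hmmsearch run (nothing is executed here)."""
    cmd = ["hmmsearch", "--cpu", str(cpu), "--tblout", tblout_path]
    if evalue is not None:
        # -E sets the per-sequence E-value reporting threshold
        cmd += ["-E", str(evalue)]
    # hmmsearch expects the profile HMM file first, then the sequence database
    cmd += [hmm_path, seq_db]
    return cmd

# Hypothetical file names, for illustration only
cmd = build_hmmsearch_cmd("family.hmm", "targets.faa", "family.tblout", evalue=1e-5)
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run on a machine where HMMER 3 is installed.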
Tool and Data Sources:
HMMER v3.3.2 is installed from http://hmmer.org
MicroTrait v1.0 HMMs are installed from https://github.com/ukaraoz/microtrait-hmm/tree/master/data.kb_hmmer/hmm
Configuration:
Targets Objects: The Targets Objects may be a FeatureSet of genes, a Genome, a GenomeSet, a SpeciesTree, or an Annotated Metagenome Assembly (AMA). A HMMER search database will be automatically generated from the Targets Object.
Output FeatureSet basename: This is the basename for the objects that will contain, for each model, the set of genes that are hit and that pass the confidence thresholds.
Other Parameters: See "Parameters" section below.
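HMMER reads protein sequences in FASTA format, so one plausible form for the automatically generated search database is a protein FASTA file built from the genes in the Targets Object. The sketch below shows that idea under stated assumptions: the input dictionary of gene IDs to protein sequences, the function name, and the file name are all hypothetical and do not reflect the App's internal code.

```python
import os
import tempfile

def write_protein_fasta(proteins, path, width=60):
    """Write {gene_id: protein_sequence} to a FASTA file; return the record count."""
    count = 0
    with open(path, "w") as handle:
        for gene_id, seq in proteins.items():
            if not seq:
                continue  # skip features that have no protein translation
            handle.write(f">{gene_id}\n")
            # wrap the sequence at a fixed line width
            for start in range(0, len(seq), width):
                handle.write(seq[start:start + width] + "\n")
            count += 1
    return count

# Hypothetical target genes, for illustration only
proteins = {"gene_1": "MKT" * 30, "gene_2": "", "gene_3": "MSEQ"}
with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "targets.faa")
    n_written = write_protein_fasta(proteins, db_path)
    with open(db_path) as handle:
        lines = handle.read().splitlines()
```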
Output:
Output Object: Gene hits are captured in a FeatureSet output object. Hits that fail any additional user-defined thresholds are filtered out and do not appear in the object, even though they are shown in the output table. The Output object name is used as a basename to which the HMM name is prepended.
Output HTML Profile: A raw count or heatmap of the number of genes hit by each gene family (column) in each genome or annotated metagenome assembly (row). Each cell in the profile offers a roll-over showing the number of hits and the gene IDs of those hits. If the input target is a SpeciesTree, the rows follow their order in a ladderized view of that tree (available from the "View Tree" App).
Output HTML Table: The tab-delimited hit table is HTML formatted and additionally shows the region of the hit sequence (as there is no query sequence) covered by the HMMER alignment. Hits that pass the e-value threshold but fall below other thresholds, and are therefore not included in the FeatureSet output object, are shown in gray, with the attributes that were below threshold in red. A separate table is made for each HMM.
Downloadable files: The HMMER output hit tables are available for download. They are unaltered from the direct output of the HMMER run. A text output file is generated for each HMM.
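The outputs described above can be approximated from a downloaded hit table. The sketch below assumes the download is in HMMER's per-sequence `--tblout` layout (18 whitespace-separated fields followed by a free-text description); the bit-score cutoff, the gene-to-genome mapping, the `-` separator in the object name, and all identifiers in the sample data are hypothetical. Counting only the hits that pass the threshold is also an assumption, not a statement of how the App builds its profile.

```python
def parse_tblout(text):
    """Parse HMMER --tblout text into a list of hit dictionaries."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split(None, 18)  # field 18 is the free-text description
        hits.append({
            "target": fields[0],       # gene that was hit
            "query": fields[2],        # HMM (gene family) name
            "evalue": float(fields[4]),
            "bitscore": float(fields[5]),
            "description": fields[18].strip() if len(fields) > 18 else "",
        })
    return hits

def split_by_bitscore(hits, min_bitscore):
    """Separate hits kept for the FeatureSet from hits only shown in the table."""
    kept = [h for h in hits if h["bitscore"] >= min_bitscore]
    shown_only = [h for h in hits if h["bitscore"] < min_bitscore]
    return kept, shown_only

def alignment_coverage(ali_from, ali_to, seq_len):
    """Fraction of the hit sequence covered by the alignment (1-based, inclusive)."""
    return (ali_to - ali_from + 1) / seq_len

def count_profile(hits, gene_to_genome):
    """Count hits per (genome, gene family) pair."""
    counts = {}
    for h in hits:
        key = (gene_to_genome[h["target"]], h["query"])
        counts[key] = counts.get(key, 0) + 1
    return counts

def featureset_name(hmm_name, basename, sep="-"):
    """Prepend the HMM name to the basename; the separator is an assumption."""
    return f"{hmm_name}{sep}{basename}"

# Made-up sample table, for illustration only
TBLOUT = """\
# target name  accession  query name  accession  E-value  score  bias ...
geneA_1 - famX - 1e-40 140.2 0.0 2e-40 139.0 0.0 1.0 1 0 0 1 1 1 1 hypothetical protein
geneA_2 - famX - 1e-06 25.0 0.1 1e-06 25.0 0.1 1.0 1 0 0 1 1 1 1 -
geneB_1 - famX - 1e-35 120.5 0.0 1e-35 120.5 0.0 1.0 1 0 0 1 1 1 1 -
"""
gene_to_genome = {"geneA_1": "genomeA", "geneA_2": "genomeA", "geneB_1": "genomeB"}

hits = parse_tblout(TBLOUT)
kept, shown_only = split_by_bitscore(hits, min_bitscore=50.0)
profile = count_profile(kept, gene_to_genome)
coverage = alignment_coverage(11, 210, 250)
name = featureset_name("famX", "MT_hits")
```

In this sample, `geneA_2` passes no bit-score cutoff of 50 and would correspond to a grayed-out row in the HTML table, while the other two hits would populate the FeatureSet and the profile.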
Team members who implemented App in KBase: Dylan Chivian and Sean Jungbluth. For questions, please contact us.
Please cite:
- Karaoz U, Brodie EL. microTrait: A Toolset for a Trait-Based Representation of Microbial Genomes. Front Bioinform. 2022 Jul 22;2:918853. doi: 10.3389/fbinf.2022.918853
- Eddy SR. Accelerated Profile HMM Searches. PLOS Computational Biology. 2011;7: e1002195. doi:10.1371/journal.pcbi.1002195
- Chivian D, Jungbluth SP, Dehal PS, Wood-Charlson EM, Canon RS, Allen BH, Clark MM, Gu T, Land ML, Price GA, Riehl WJ, Sneddon MW, Sutormin R, Zhang Q, Cottingham RW, Henry CS, Arkin AP. Metagenome-assembled genome extraction and analysis from microbiomes using KBase. Nat Protoc. 2023 Jan;18(1):208-238. doi: 10.1038/s41596-022-00747-x
Related Publications
- Karaoz U, Brodie EL. microTrait: A Toolset for a Trait-Based Representation of Microbial Genomes. Front Bioinform. 2022 Jul 22;2:918853. doi: 10.3389/fbinf.2022.918853 , https://www.frontiersin.org/articles/10.3389/fbinf.2022.918853/full
- Eddy SR. Accelerated Profile HMM Searches. PLOS Computational Biology. 2011;7: e1002195. doi:10.1371/journal.pcbi.1002195 , https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002195
- HMMER v3.3.2 source: http://hmmer.org
- Chivian D, Jungbluth SP, Dehal PS, Wood-Charlson EM, Canon RS, Allen BH, Clark MM, Gu T, Land ML, Price GA, Riehl WJ, Sneddon MW, Sutormin R, Zhang Q, Cottingham RW, Henry CS, Arkin AP. Metagenome-assembled genome extraction and analysis from microbiomes using KBase. Nat Protoc. 2023 Jan;18(1):208-238. doi: 10.1038/s41596-022-00747-x , https://www.nature.com/articles/s41596-022-00747-x
App Specification:
https://github.com/kbaseapps/kb_hmmer/tree/6c338791492e2980534a08a1606d2d7137884759/ui/narrative/methods/HMMER_MT_Bioelement_Search
Module Commit: 6c338791492e2980534a08a1606d2d7137884759