In this tutorial you will start with bins and process them so that you can understand the metabolisms that each is potentially capable of. These bins were generated from metagenomic reads derived from DNA extracted from 8 surface water samples. The reads were trimmed and assembled with metaSPAdes. Contigs from the assemblies were binned with CONCOCT and MaxBin, using abundance counts derived from read mapping, and the resulting bins were optimized with DAS Tool. The bins from all 8 samples (links below) were then collated into this narrative.
Group | Narrative |
---|---|
Group A | https://narrative.kbase.us/narrative/186838 |
Group B | https://narrative.kbase.us/narrative/186839 |
Group C | https://narrative.kbase.us/narrative/186842 |
Group D | https://narrative.kbase.us/narrative/186840 |
Group E | https://narrative.kbase.us/narrative/186841 |
Group F | https://narrative.kbase.us/narrative/186844 |
Group G | https://narrative.kbase.us/narrative/186843 |
Group H | https://narrative.kbase.us/narrative/186845 |
Copy this narrative for your group and, as a team, fill in the answers to the questions associated with each app.
What are the default parameters for completion, contamination, and percent identity for dRep in KBase? What version are we using?
How does dRep choose the “best” bin or dRep winner for a cluster?
How many dereplicated MAGs do you have in your dataset?
CONGO_003_metagenome_DASTool.240717 |
CONGO_007_metagenome_DASTool.240722 |
CONGO_017_metagenome_DASTool.240722 |
CONGO_031_metagenome_DASTool.240722 |
CONGO_040_metagenome_DASTool.240722 |
CONGO_063_metagenome_DASTool.240722 |
CONGO_065_metagenome_DASTool.240722 |
CONGO_067_metagenome_DASTool.240722 |
Created Object Name | Type | Description |
---|---|---|
CONGO_metagenomes_dRep.240729_assemblies | AssemblySet | Dereplication results |
What is the highest quality genome? Why is it the highest quality? What are that genome's completeness and contamination?
What is the worst quality genome? Why is it the worst quality? What are that genome's completeness and contamination?
This dataset has bins denoted as Kingdom_Bacteria in the Marker Lineage column. What do you think this means in CheckM?
How should we consider genome quality when looking at the functional annotations of a genome?
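One way to make the quality comparison concrete is to sort bins into tiers by CheckM completeness and contamination. The sketch below assumes the commonly cited MIMAG-style cutoffs (above 90% complete and below 5% contaminated for high quality; at least 50% complete and below 10% contaminated for medium quality) and ignores the rRNA and tRNA requirements that MIMAG also places on high-quality drafts. The bin names and values are made up for illustration and are not taken from this dataset.

```python
def quality_tier(completeness, contamination):
    """Classify a bin using assumed MIMAG-style completeness/contamination cutoffs."""
    if completeness > 90 and contamination < 5:
        return "high"
    if completeness >= 50 and contamination < 10:
        return "medium"
    return "low"

# Hypothetical CheckM values: (completeness %, contamination %)
example_bins = {
    "bin.001": (97.3, 1.2),
    "bin.002": (71.0, 4.8),
    "bin.003": (95.0, 12.5),
    "bin.004": (38.4, 0.0),
}

tiers = {name: quality_tier(c, x) for name, (c, x) in example_bins.items()}
for name, tier in tiers.items():
    print(name, tier)
```

Note that the hypothetical bin.003 is nearly complete yet lands in the lowest tier because of its contamination, which is the kind of case to keep in mind when reading functional annotations.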
GTDB-Tk is a toolkit for assigning taxonomy to microbial genomes. See https://academic.oup.com/bioinformatics/article/36/6/1925/5626182?login=true for details on how GTDB-Tk works, databases used, and output details.
How many phyla for Bacteria and Archaea are represented in the genome set? List them.
You see a bin with the taxa string `d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__IMCC26256;f__IMCC26256;g__;s__`. How do you interpret this taxa string? What does it mean that there is no assigned family, genus or species? What does UBA stand for?
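To help read strings like this one, here is a minimal sketch that splits a GTDB lineage into ranks and reports the most specific rank that carries a name. It assumes the standard GTDB rank prefixes (`d__`, `p__`, `c__`, `o__`, `f__`, `g__`, `s__`) separated by semicolons; the function names are illustrative and not part of GTDB-Tk.

```python
RANKS = {"d": "domain", "p": "phylum", "c": "class", "o": "order",
         "f": "family", "g": "genus", "s": "species"}

def parse_gtdb_lineage(lineage):
    """Split a GTDB lineage into {rank_name: taxon}, with '' for unassigned ranks."""
    parsed = {}
    for field in lineage.split(";"):
        prefix, _, taxon = field.strip().partition("__")
        parsed[RANKS[prefix]] = taxon
    return parsed

def lowest_assigned_rank(parsed):
    """Return the most specific rank that has a name, or None if nothing is assigned."""
    assigned = [rank for rank in RANKS.values() if parsed.get(rank)]
    return assigned[-1] if assigned else None

lineage = ("d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;"
           "o__IMCC26256;f__IMCC26256;g__;s__")
parsed = parse_gtdb_lineage(lineage)
print(parsed)
print(lowest_assigned_rank(parsed))
```

A rank that is present but empty (such as `g__`) means the genome could not be placed in any named or placeholder group at that level. Applying `parse_gtdb_lineage` to every bin and collecting the `"phylum"` values is one way to count the phyla in a genome set.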
What is the highest unnamed taxonomy level in your dataset? How many MAGs have this classification?
What is the full taxonomy string for your best bin?
DRAM is a genome annotation tool, as well as a summarization tool for genome metabolism. See https://academic.oup.com/nar/article/48/16/8883/5884738 for details on how DRAM works, databases used, and output details.
A key feature of DRAM is the module summary. Look at the TCA cycle (aka Krebs cycle) for all your bins. Estimate how many MAGs have 5 or fewer steps in the TCA cycle (hint: roll over the yellow ones and look at the steps).
Note that for electron transport chain (ETC) enzymes, completion is not about steps, but about how complete an enzyme complex is (i.e. does it have all subunits). What criteria do you think make an enzyme complex complete?
For today, let’s assume enzyme complexes need at least 50% of subunits to be functional. Complex IV is used for aerobic respiration. These multi-subunit enzymes can be either high affinity (meaning they operate best under microaerophilic, or low-oxygen, conditions) or low affinity (meaning they operate best under higher oxygen conditions). How many bins do you think have the potential to respire at low oxygen levels? Note that the two enzymes (cbb3 and bd ubiquinol) are separate complexes that confer similar oxygen respiration abilities (for today’s exercise).
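The 50% rule above can be sketched as a small calculation. The subunit counts and bin names below are hypothetical placeholders, not values read from DRAM output; the point is only to show how the threshold is applied to each complex and how either high-affinity oxidase is enough to flag a bin.

```python
def complex_is_functional(subunits_present, subunits_total, threshold=0.5):
    """True when the fraction of detected subunits meets the threshold."""
    return subunits_total > 0 and subunits_present / subunits_total >= threshold

# Hypothetical (present, total) subunit counts per bin for the two
# high-affinity oxidases discussed in the text.
bins = {
    "bin.001": {"cbb3": (3, 4), "bd": (0, 2)},
    "bin.002": {"cbb3": (1, 4), "bd": (1, 2)},
    "bin.003": {"cbb3": (1, 4), "bd": (0, 2)},
}

# A bin counts as a potential low-oxygen respirer if either complex passes.
low_oxygen_bins = [
    name for name, oxidases in bins.items()
    if any(complex_is_functional(p, t) for p, t in oxidases.values())
]
print(low_oxygen_bins)
```

Raising `threshold` makes the call more conservative, which may be worth trying for low-completeness bins where missing subunits could simply reflect missing sequence.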
Which genomes seem to use aerobic respiration for energy production? Hint: look at ETC - NADH dehydrogenase and complex IV.
Beyond oxygen, examine Nitrogen, Sulfur, Photosynthesis, and other reductases. Which biogeochemical process (as denoted by enzymes for individual steps in the process, e.g. nitrate to nitrite, not complete denitrification) is most represented in your genomes? What is the evidence for this? NOTE: use caution in interpreting thiosulfate reduction; this has been fixed in newer versions of DRAM. Do not use the thiosulfate=sulfite step, as this gene is non-specific and can also catalyze other steps. Use EC 1.8.5.5 (thiosulfate to sulfite) instead.
Do you have the key functional gene for methanogenesis?
Why do you have genes for using acetate then? [acetate+>methane, pt 1]. Hint: look up the EC number here. Do you think it is specific to methanogens?
Hydrogen metabolism (either using or producing hydrogen) can be an important ecosystem process. To see this metabolism you need to go to the Distillate. Download the metabolism_summary.xlsx file, go to the Energy tab, and search column D for hydrogenase. Do you have any evidence for this in your data?
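If you prefer to script the search rather than scroll through the spreadsheet, the idea can be sketched as a case-insensitive filter on column D. The example below works on plain rows (lists of cell values) so that it stays independent of any particular spreadsheet reader; loading the Energy tab into such rows is left to whichever library you have available. The row contents shown are invented stand-ins, and the real column layout may differ.

```python
def rows_matching(rows, keyword, column_index=3):
    """Return rows whose text in the given column (0-based; 3 is column D) contains keyword."""
    keyword = keyword.lower()
    return [
        row for row in rows
        if len(row) > column_index and keyword in str(row[column_index]).lower()
    ]

# Hypothetical rows standing in for the Energy tab; the trailing numbers
# play the role of per-bin gene counts.
energy_rows = [
    ["id_1", "geneA", "example description", "Hydrogenase, example group", 0, 2],
    ["id_2", "geneB", "example description", "Electron transport chain", 1, 1],
    ["id_3", "geneC", "example description", "Example [FeFe] hydrogenase", 3, 0],
]

hits = rows_matching(energy_rows, "hydrogenase")
for row in hits:
    print(row[0], row[3], row[4:])
```

A matching row is only evidence for a bin when that bin's count in the row is greater than zero, so check the per-bin columns and not just whether the keyword appears.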
What is a CAZyme? What database does DRAM use for annotating CAZymes?
What CAZyme substrates are well represented in your genomes?
Final synthesis and GROUP DISCUSSION. Choose a single bin or MAG. For your bin you will now analyze a genome’s metabolism. What carbon substrates can the genome likely use? Do you think it is respiratory e.g. using oxygen, nitrate, etc.? If not respiratory, what is evidence for fermentation? What other things are neat to note about this organism (use the metabolism_summary.xlsx or annotations.tsv)? Does it partake in H2 metabolism, CO2 fixation, or have any characteristics listed in the MISC tab (flagella, CRISPR)?
CONGO_003_metagenome_DASTool.240717_bin.001.fasta_DRAM |
CONGO_003_metagenome_DASTool.240717_bin.002.fasta_DRAM |
CONGO_003_metagenome_DASTool.240717_bin.004.fasta_DRAM |
CONGO_003_metagenome_DASTool.240717_bin.007.fasta_DRAM |
Created Object Name | Type | Description |
---|---|---|
CONGO_003_MAGs_dRep_DRAM.240801 | GenomeSet | KButil_Build_GenomeSet |
Created Object Name | Type | Description |
---|---|---|
CONGO_003_metagenome_DASTool.240717_bin.001.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_003_metagenome_DASTool.240717_bin.002.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_003_metagenome_DASTool.240717_bin.004.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_003_metagenome_DASTool.240717_bin.007.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_003_MAGs_dRep_DRAM.240801 | GenomeSet | Taxonomy and taxon_assignment updated with GTDB |
CONGO_007_metagenome_DASTool.240722_bin.001.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.003.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.004.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.005.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.009.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.010.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.011.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.012.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.014.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.016.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.017.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.018.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.019.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.022.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.024.fasta_DRAM |
CONGO_007_metagenome_DASTool.240722_bin.025.fasta_DRAM |
Created Object Name | Type | Description |
---|---|---|
CONGO_007_MAGs_dRep_DRAM.240801 | GenomeSet | KButil_Build_GenomeSet |
Created Object Name | Type | Description |
---|---|---|
CONGO_007_metagenome_DASTool.240722_bin.001.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.003.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.004.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.005.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.009.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.010.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.011.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.012.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.014.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.016.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.017.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.018.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.019.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.022.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.024.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_metagenome_DASTool.240722_bin.025.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_007_MAGs_dRep_DRAM.240801 | GenomeSet | Taxonomy and taxon_assignment updated with GTDB |
CONGO_017_metagenome_DASTool.240722_bin.001.fasta_DRAM |
CONGO_017_metagenome_DASTool.240722_bin.005.fasta_DRAM |
Created Object Name | Type | Description |
---|---|---|
CONGO_017_MAGs_dRep_DRAM.240801 | GenomeSet | KButil_Build_GenomeSet |
Created Object Name | Type | Description |
---|---|---|
CONGO_017_metagenome_DASTool.240722_bin.001.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_017_metagenome_DASTool.240722_bin.005.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_017_MAGs_dRep_DRAM.240801 | GenomeSet | Taxonomy and taxon_assignment updated with GTDB |
CONGO_031_metagenome_DASTool.240722_bin.001.fasta_DRAM |
CONGO_031_metagenome_DASTool.240722_bin.002.fasta_DRAM |
CONGO_031_metagenome_DASTool.240722_bin.003.fasta_DRAM |
CONGO_031_metagenome_DASTool.240722_bin.006.fasta_DRAM |
CONGO_031_metagenome_DASTool.240722_bin.007.fasta_DRAM |
CONGO_031_metagenome_DASTool.240722_bin.009.fasta_DRAM |
CONGO_031_metagenome_DASTool.240722_bin.010.fasta_DRAM |
CONGO_031_metagenome_DASTool.240722_bin.013.fasta_DRAM |
CONGO_031_metagenome_DASTool.240722_bin.015.fasta_DRAM |
Created Object Name | Type | Description |
---|---|---|
CONGO_031_MAGs_dRep_DRAM.240801 | GenomeSet | KButil_Build_GenomeSet |
Created Object Name | Type | Description |
---|---|---|
CONGO_031_metagenome_DASTool.240722_bin.001.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_031_metagenome_DASTool.240722_bin.002.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_031_metagenome_DASTool.240722_bin.003.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_031_metagenome_DASTool.240722_bin.006.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_031_metagenome_DASTool.240722_bin.007.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_031_metagenome_DASTool.240722_bin.009.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_031_metagenome_DASTool.240722_bin.010.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_031_metagenome_DASTool.240722_bin.013.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_031_metagenome_DASTool.240722_bin.015.fasta_DRAM | Genome | Taxonomy unchanged, taxon_assignment added GTDB |
CONGO_031_MAGs_dRep_DRAM.240801 | GenomeSet | Taxonomy and taxon_assignment updated with GTDB |
CONGO_040_metagenome_DASTool.240722_bin.001.fasta_DRAM |
CONGO_040_metagenome_DASTool.240722_bin.002.fasta_DRAM |
CONGO_040_metagenome_DASTool.240722_bin.006.fasta_DRAM |
CONGO_040_metagenome_DASTool.240722_bin.009.fasta_DRAM |
CONGO_040_metagenome_DASTool.240722_bin.010.fasta_DRAM |
CONGO_040_metagenome_DASTool.240722_bin.011.fasta_DRAM |
CONGO_040_metagenome_DASTool.240722_bin.012.fasta_DRAM |
CONGO_040_metagenome_DASTool.240722_bin.015.fasta_DRAM |
CONGO_040_metagenome_DASTool.240722_bin.017.fasta_DRAM |
Created Object Name | Type | Description |
---|---|---|
CONGO_040_MAGs_dRep_DRAM.240801 | GenomeSet | KButil_Build_GenomeSet |
Created Object Name | Type | Description |
---|---|---|
CONGO_040_metagenome_DASTool.240722_bin.010.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_040_metagenome_DASTool.240722_bin.011.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_040_metagenome_DASTool.240722_bin.012.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_040_metagenome_DASTool.240722_bin.015.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_040_metagenome_DASTool.240722_bin.017.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_040_metagenome_DASTool.240722_bin.001.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_040_metagenome_DASTool.240722_bin.002.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_040_metagenome_DASTool.240722_bin.006.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_040_metagenome_DASTool.240722_bin.009.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_040_MAGs_dRep_DRAM.240801 | GenomeSet | Taxonomy and taxon_assignment updated with GTDB |
CONGO_063_metagenome_DASTool.240722_bin.005.fasta_DRAM |
CONGO_063_metagenome_DASTool.240722_bin.011.fasta_DRAM |
CONGO_063_metagenome_DASTool.240722_bin.012.fasta_DRAM |
CONGO_063_metagenome_DASTool.240722_bin.013.fasta_DRAM |
Created Object Name | Type | Description |
---|---|---|
CONGO_063_MAGs_dRep_DRAM.240801 | GenomeSet | KButil_Build_GenomeSet |
Created Object Name | Type | Description |
---|---|---|
CONGO_063_metagenome_DASTool.240722_bin.005.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_063_metagenome_DASTool.240722_bin.011.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_063_metagenome_DASTool.240722_bin.012.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_063_metagenome_DASTool.240722_bin.013.fasta_DRAM | Genome | Taxonomy and taxon_assignment updated with GTDB |
CONGO_063_MAGs_dRep_DRAM.240801 | GenomeSet | Taxonomy and taxon_assignment updated with GTDB |