Generated April 8, 2022

Data Supporting the manuscript, "Plant growth promoting activity of bacteria isolated from Asian rice are plant subspecies dependent"

Nasim Maghboli Balasjin, James S. Maki, Michael R. Schläppi and Christopher W. Marshall

Marquette University, Biological Sciences Department, Milwaukee, Wisconsin, USA

Abstract

Asian rice (Oryza sativa L.) is one of the most important crops because it is a staple food for almost half of the world’s population. O. sativa has two subspecies, JAPONICA and INDICA, that differ in morphological, physiological and genetic characteristics. For production of O. sativa to keep pace with a growing world population, it is anticipated that the use of fertilizers will also need to increase. This increased fertilizer use may cause environmental damage through runoff, but an alternative strategy to increase crop yield is the use of plant growth promoting bacteria. Thousands of microbial species can exist in association with plant roots and shoots, and some are critical to the plant’s survival. We isolated 140 bacteria from O. sativa and investigated whether the JAPONICA and INDICA rice subspecies were positively influenced by these isolates. The bacterial isolates were screened for their ability to solubilize phosphate, a known plant growth promoting characteristic, and 25 isolates were selected for further analysis. These 25 phosphate solubilizing isolates were also able to produce other potentially growth-promoting factors including lipases, cellulases, proteases, siderophores, indoleacetic acid, gibberellic acid and 1-aminocyclopropane-1-carboxylic acid deaminase. Five of the most promising bacterial isolates were chosen for whole genome sequencing. Four of these bacteria, isolates related to Pseudomonas mosselii, Microvirga sp., Paenibacillus rigui and Paenibacillus graminis, improved root and shoot growth, root to shoot ratio, and root dry weights of JAPONICA plants but had no effect on growth and development of INDICA plants. This indicates that although bacteria have several known plant growth promoting functions, their effects on growth parameters can be plant subspecies dependent, and it suggests close relationships between plants and their microbial partners.

Sample 163 processing

Import a FASTQ/SRA file into your Narrative as a Reads data object
This app completed without errors in 2m 41s.
Objects
Created Object Name         Type              Description
163_S67_R1_001.fastq_reads  PairedEndLibrary  Imported Reads
A quality control application for high throughput sequence data.
This app completed without errors in 2m 34s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • 163_S67_R1_001.fastq_reads_52526_10_1.fwd_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
  • 163_S67_R1_001.fastq_reads_52526_10_1.rev_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
Assemble reads using the SPAdes assembler.
This app completed without errors in 19m 44s.
Objects
Created Object Name  Type      Description
SPAdes163.contigs    Assembly  Assembled contigs
Summary
Assembly saved to: nasimaghbooli:narrative_1575949707349/SPAdes163.contigs
Assembled into 89 contigs. Avg Length: 82067.85393258427 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  • 42 -- 509.0 to 39255.5 bp
  • 16 -- 39255.5 to 78002.0 bp
  • 7 -- 78002.0 to 116748.5 bp
  • 5 -- 116748.5 to 155495.0 bp
  • 6 -- 155495.0 to 194241.5 bp
  • 5 -- 194241.5 to 232988.0 bp
  • 3 -- 232988.0 to 271734.5 bp
  • 1 -- 271734.5 to 310481.0 bp
  • 2 -- 310481.0 to 349227.5 bp
  • 2 -- 349227.5 to 387974.0 bp
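The bin boundaries in the contig length distribution are consistent with ten equal-width bins spanning the shortest contig (509 bp) to the longest (387974 bp). The following is a minimal Python sketch under that assumption; the function name `bin_edges` is illustrative and is not part of SPAdes or KBase.

```python
def bin_edges(min_len, max_len, n_bins=10):
    """Return n_bins + 1 equally spaced edges from min_len to max_len."""
    width = (max_len - min_len) / n_bins
    return [min_len + i * width for i in range(n_bins + 1)]

# Shortest and longest contig lengths as reported for sample 163.
edges = bin_edges(509, 387974)

# Print each bin as "low to high bp", mirroring the report layout.
for low, high in zip(edges, edges[1:]):
    print(f"{low} to {high} bp")
```

The same function can be applied to the other samples by substituting their reported minimum and maximum contig lengths.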
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.
This app completed without errors in 13m 42s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
Annotate a bacterial or archaeal assembly using components from the RAST (Rapid Annotations using Subsystems Technology) toolkit (RASTtk).
This app completed without errors in 6m 35s.
Objects
Created Object Name  Type    Description
163_genome_annotate  Genome  Annotated genome
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 89 contigs containing 7304039 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 7141 new features were called, of which 285 are non-coding.
Output genome has the following feature types:
	Coding gene                     6856 
	Non-coding crispr_array            3 
	Non-coding crispr_repeat          68 
	Non-coding crispr_spacer          65 
	Non-coding repeat                104 
	Non-coding rna                    45 
Overall, the genes have 2526 distinct functions. 
The genes include 2793 genes with a SEED annotation ontology across 1263 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
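The feature counts in this summary can be checked against the reported totals (7141 new features, 285 of them non-coding). A minimal Python sketch, using only the numbers listed for sample 163; the dictionary layout is an illustrative choice and not a KBase data structure:

```python
# Feature counts as listed in the RAST summary for sample 163.
features = {
    "Coding gene": 6856,
    "Non-coding crispr_array": 3,
    "Non-coding crispr_repeat": 68,
    "Non-coding crispr_spacer": 65,
    "Non-coding repeat": 104,
    "Non-coding rna": 45,
}

total = sum(features.values())
non_coding = sum(
    count for name, count in features.items() if name.startswith("Non-coding")
)

# Compare against the totals stated in the summary text.
print("total features:", total, "matches report:", total == 7141)
print("non-coding features:", non_coding, "matches report:", non_coding == 285)
```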
Output from Annotate Microbial Assembly
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/52526
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB)
This app completed without errors in 27m 30s.
Add one or more genomes to a KBase species tree.
This app completed without errors in 3m 17s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • nasim_163_tree.newick
  • nasim_163_tree-labels.newick
  • nasim_163_tree.png
  • nasim_163_tree.pdf
Allows users to compute fast whole-genome Average Nucleotide Identity (ANI) estimation.
This app completed without errors in 2m 48s.
Annotate Assembly and Re-annotate Genomes with Prokka annotation pipeline.
This app completed without errors in 4m 22s.
Objects
Created Object Name  Type    Description
163_Nasim            Genome  Annotated genome
Summary
Genome Ref: 52526/109/1
Number of features sent into prokka: 6856
New functions found: 4083
Ontology terms found: 1702
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • function_report - Annotation report generated by kb_prokka
  • ontology_report - Annotation report generated by kb_prokka
Annotate a Genome object with protein domains from widely used domain libraries.
This app completed without errors in 3h 14m 28s.
Objects
Created Object Name  Type              Description
163_Nasim            DomainAnnotation  Domain Annotations
Summary
Search Domains output:
  • Getting DomainModelSet from storage.
  • Getting Genome from storage.
  • Running domain search against library 2959/19/1
  • Running domain search against library 2959/18/1
  • Running domain search against library 2959/24/1
  • Running domain search against library 2959/25/1
  • Running domain search against library 2959/23/1
  • Running domain search against library 2959/7/7
  • Running domain search against library 2959/20/1
  • Running domain search against library 2959/17/1
  • Running domain search against library 2959/21/1
  • Running domain search against library 2959/22/1
Output from Annotate Domains in a Genome
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/52526

Sample 172 processing

Import a FASTQ/SRA file into your Narrative as a Reads data object
This app completed without errors in 1m 57s.
Objects
Created Object Name         Type              Description
172_S70_R1_001.fastq_reads  PairedEndLibrary  Imported Reads
A quality control application for high throughput sequence data.
This app completed without errors in 2m 7s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • 172_S70_R1_001.fastq_reads_52526_2_1.rev_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
  • 172_S70_R1_001.fastq_reads_52526_2_1.fwd_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
Assemble reads using the SPAdes assembler.
This app completed without errors in 35m 22s.
Objects
Created Object Name  Type      Description
172_SPAdes.contigs   Assembly  Assembled contigs
Summary
Assembly saved to: nasimaghbooli:narrative_1575949707349/172_SPAdes.contigs
Assembled into 104 contigs. Avg Length: 68862.5480769 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  • 65 -- 513.0 to 58915.4 bp
  • 23 -- 58915.4 to 117317.8 bp
  • 6 -- 117317.8 to 175720.2 bp
  • 2 -- 175720.2 to 234122.6 bp
  • 2 -- 234122.6 to 292525.0 bp
  • 3 -- 292525.0 to 350927.4 bp
  • 2 -- 350927.4 to 409329.8 bp
  • 0 -- 409329.8 to 467732.2 bp
  • 0 -- 467732.2 to 526134.6 bp
  • 1 -- 526134.6 to 584537.0 bp
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.
This app completed without errors in 8m 8s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB)
This app completed without errors in 31m 29s.
Annotate a bacterial or archaeal assembly using components from the RAST (Rapid Annotations using Subsystems Technology) toolkit (RASTtk).
This app completed without errors in 6m 20s.
Objects
Created Object Name  Type    Description
172_genome_annotate  Genome  Annotated genome
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 104 contigs containing 7161705 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 7084 new features were called, of which 122 are non-coding.
Output genome has the following feature types:
	Coding gene                     6962 
	Non-coding repeat                 63 
	Non-coding rna                    59 
Overall, the genes have 2715 distinct functions. 
The genes include 2614 genes with a SEED annotation ontology across 1255 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Output from Annotate Microbial Assembly
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/52526
Add one or more genomes to a KBase species tree.
This app completed without errors in 3m 13s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • nasim_172.newick
  • nasim_172-labels.newick
  • nasim_172.png
  • nasim_172.pdf
Annotate domains in every Genome within a GenomeSet using protein domains from widely used domain libraries.
This app completed without errors in 1d 8h 59m 24s.
Summary
Search Domains output (this sequence repeats verbatim throughout the log):
  • Getting DomainModelSet from storage.
  • Getting Genome from storage.
  • Running domain search against library 2959/1/7
  • Running domain search against library 2959/6/6
  • Running domain search against library 2959/7/6
  • Running domain search against library 2959/4/6
  • Running domain search against library 2959/5/7
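The GenomeSet domain-search log emits the same messages for every search run, so a short script can help confirm how many runs were logged and which libraries each run queried. The following is a minimal Python sketch, assuming the log is available as plain text; the function name `summarize_domain_log` and the shortened excerpt are illustrative and are not part of the KBase app.

```python
import re
from collections import Counter

# Pattern for the library references seen in the log, e.g. "2959/1/7".
LIBRARY = re.compile(r"Running domain search against library (\d+/\d+/\d+)")

def summarize_domain_log(log_text):
    """Return (number of search runs, Counter of library references)."""
    runs = log_text.count("Search Domains output:")
    libraries = Counter(LIBRARY.findall(log_text))
    return runs, libraries

# Shortened excerpt for illustration: two runs, two libraries per run.
excerpt = (
    "Search Domains output: Getting DomainModelSet from storage. "
    "Getting Genome from storage. "
    "Running domain search against library 2959/1/7 "
    "Running domain search against library 2959/6/6 "
) * 2

runs, libraries = summarize_domain_log(excerpt)
print(runs, dict(libraries))
```

If every run queried the same libraries, each library count should equal the number of runs.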

Sample n00170 processing

Import a FASTQ/SRA file into your Narrative as a Reads data object
This app completed without errors in 2m 6s.
Objects
Created Object Name         Type              Description
170_S69_R1_001.fastq_reads  PairedEndLibrary  Imported Reads
A quality control application for high throughput sequence data.
This app completed without errors in 2m 3s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • 132_S66_001.fastq_reads_52526_8_1.fwd_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
  • 132_S66_001.fastq_reads_52526_8_1.rev_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
Assemble reads using the SPAdes assembler.
This app completed without errors in 23m 5s.
Objects
Created Object Name  Type      Description
170_SPAdes.contigs   Assembly  Assembled contigs
Summary
Assembly saved to: nasimaghbooli:narrative_1575949707349/170_SPAdes.contigs
Assembled into 46 contigs. Avg Length: 99737.5434783 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  • 27 -- 522.0 to 75007.5 bp
  • 10 -- 75007.5 to 149493.0 bp
  • 3 -- 149493.0 to 223978.5 bp
  • 2 -- 223978.5 to 298464.0 bp
  • 2 -- 298464.0 to 372949.5 bp
  • 0 -- 372949.5 to 447435.0 bp
  • 1 -- 447435.0 to 521920.5 bp
  • 0 -- 521920.5 to 596406.0 bp
  • 0 -- 596406.0 to 670891.5 bp
  • 1 -- 670891.5 to 745377.0 bp
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.
This app completed without errors in 7m 40s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB)
This app completed without errors in 29m 50s.
Annotate a bacterial or archaeal assembly using components from the RAST (Rapid Annotations using Subsystems Technology) toolkit (RASTtk).
This app completed without errors in 5m 8s.
Objects
Created Object Name  Type    Description
170_genome_annotate  Genome  Annotated genome
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 46 contigs containing 4587927 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 4652 new features were called, of which 94 are non-coding.
Output genome has the following feature types:
	Coding gene                     4558 
	Non-coding repeat                 36 
	Non-coding rna                    58 
Overall, the genes have 2385 distinct functions. 
The genes include 2085 genes with a SEED annotation ontology across 1224 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Output from Annotate Microbial Assembly
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/52526
Add one or more genomes to a KBase species tree.
This app completed without errors in 3m 25s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • nasim_170.newick
  • nasim_170-labels.newick
  • nasim_170.png
  • nasim_170.pdf
Allows users to compute fast whole-genome Average Nucleotide Identity (ANI) estimation.
This app completed without errors in 4m 11s.

Sample n00167 processing

Import a FASTQ/SRA file into your Narrative as a Reads data object
This app completed without errors in 2m 6s.
Objects
Created Object Name         Type              Description
167_S68_R1_001.fastq_reads  PairedEndLibrary  Imported Reads
A quality control application for high throughput sequence data.
This app completed without errors in 2m 2s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • 167_S68_R1_001.fastq_reads_52526_6_1.rev_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
  • 167_S68_R1_001.fastq_reads_52526_6_1.fwd_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
Assemble reads using the SPAdes assembler.
This app completed without errors in 13m 51s.
Objects
Created Object Name  Type      Description
167_SPAdes.contigs   Assembly  Assembled contigs
Summary
Assembly saved to: nasimaghbooli:narrative_1575949707349/167_SPAdes.contigs
Assembled into 153 contigs. Avg Length: 42156.4444444 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  • 98 -- 507.0 to 37097.8 bp
  • 26 -- 37097.8 to 73688.6 bp
  • 12 -- 73688.6 to 110279.4 bp
  • 6 -- 110279.4 to 146870.2 bp
  • 4 -- 146870.2 to 183461.0 bp
  • 3 -- 183461.0 to 220051.8 bp
  • 1 -- 220051.8 to 256642.6 bp
  • 1 -- 256642.6 to 293233.4 bp
  • 0 -- 293233.4 to 329824.2 bp
  • 2 -- 329824.2 to 366415.0 bp
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.
This app completed without errors in 12m 24s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
Annotate a bacterial or archaeal assembly using components from the RAST (Rapid Annotations using Subsystems Technology) toolkit (RASTtk).
This app completed without errors in 15m 23s.
Objects
Created Object Name  Type    Description
167_genome_annotate  Genome  Annotated genome
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 153 contigs containing 6449936 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 6876 new features were called, of which 117 are non-coding.
Output genome has the following feature types:
	Coding gene                     6759 
	Non-coding crispr_array            1 
	Non-coding crispr_repeat           4 
	Non-coding crispr_spacer           3 
	Non-coding repeat                 58 
	Non-coding rna                    51 
Overall, the genes have 2692 distinct functions. 
The genes include 3133 genes with a SEED annotation ontology across 1333 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Output from Annotate Microbial Assembly
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/52526
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB)
This app completed without errors in 30m 12s.
Add one or more genomes to a KBase species tree.
This app completed without errors in 4m 7s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • nasim_167.newick
  • nasim_167-labels.newick
  • nasim_167.png
  • nasim_167.pdf
Allows users to compute fast whole-genome Average Nucleotide Identity (ANI) estimation.
This app completed without errors in 6m 11s.

Sample n00132 processing

Import a FASTQ/SRA file into your Narrative as a Reads data object
This app completed without errors in 2m 6s.
Objects
Created Object Name      Type              Description
132_S66_001.fastq_reads  PairedEndLibrary  Imported Reads
A quality control application for high throughput sequence data.
This app completed without errors in 2m 0s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • 132_S66_001.fastq_reads_52526_8_1.fwd_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
  • 132_S66_001.fastq_reads_52526_8_1.rev_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
Assemble reads using the SPAdes assembler.
This app completed without errors in 15m 42s.
Objects
Created Object Name  Type      Description
132_SPAdes.contigs   Assembly  Assembled contigs
Summary
Assembly saved to: nasimaghbooli:narrative_1575949707349/132_SPAdes.contigs
Assembled into 133 contigs. Avg Length: 43473.7443609 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  • 70 -- 509.0 to 26611.3 bp
  • 24 -- 26611.3 to 52713.6 bp
  • 20 -- 52713.6 to 78815.9 bp
  • 4 -- 78815.9 to 104918.2 bp
  • 5 -- 104918.2 to 131020.5 bp
  • 4 -- 131020.5 to 157122.8 bp
  • 2 -- 157122.8 to 183225.1 bp
  • 1 -- 183225.1 to 209327.4 bp
  • 1 -- 209327.4 to 235429.7 bp
  • 2 -- 235429.7 to 261532.0 bp
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.
This app completed without errors in 8m 38s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
Annotate a bacterial or archaeal assembly using components from the RAST (Rapid Annotations using Subsystems Technology) toolkit (RASTtk).
This app completed without errors in 6m 25s.
Objects
Created Object Name  Type    Description
132_genome_annotate  Genome  Annotated genome
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 133 contigs containing 5782008 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 5448 new features were called, of which 143 are non-coding.
Output genome has the following feature types:
	Coding gene                     5305 
	Non-coding repeat                 75 
	Non-coding rna                    68 
Overall, the genes have 3592 distinct functions. 
The genes include 2084 genes with a SEED annotation ontology across 1560 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Output from Annotate Microbial Assembly
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/52526
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB)
This app completed without errors in 38m 52s.
Add one or more genomes to a KBase species tree.
This app completed without errors in 2m 42s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/52526
  • nasim_132.newick
  • nasim_132-labels.newick
  • nasim_132.png
  • nasim_132.pdf
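The assembly and annotation summaries for the five samples above can be cross-checked against each other: the contig count multiplied by the average contig length should equal the nucleotide total that RAST reports for the same assembly. The following is a minimal Python sketch using only values stated in the summaries. The dictionary layout is illustrative, and `genes_per_kb` is a derived convenience figure that is not reported by any of the apps.

```python
# Values copied from the SPAdes and RAST summaries of each sample.
samples = {
    "163": {"contigs": 89, "avg_len": 82067.85393258427, "nucleotides": 7304039, "coding": 6856},
    "172": {"contigs": 104, "avg_len": 68862.5480769, "nucleotides": 7161705, "coding": 6962},
    "170": {"contigs": 46, "avg_len": 99737.5434783, "nucleotides": 4587927, "coding": 4558},
    "167": {"contigs": 153, "avg_len": 42156.4444444, "nucleotides": 6449936, "coding": 6759},
    "132": {"contigs": 133, "avg_len": 43473.7443609, "nucleotides": 5782008, "coding": 5305},
}

mismatches = []
for name, s in samples.items():
    # Total assembly size implied by the SPAdes summary, rounded to whole bases.
    implied_total = round(s["contigs"] * s["avg_len"])
    if implied_total != s["nucleotides"]:
        mismatches.append(name)
    # Derived figure (not in the reports): coding genes per kilobase.
    genes_per_kb = s["coding"] / s["nucleotides"] * 1000
    print(f"sample {name}: {implied_total} bp, {genes_per_kb:.2f} coding genes per kb")

print("samples with inconsistent totals:", mismatches)
```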

Apps

  1. Annotate Assembly and Re-annotate Genomes with Prokka - v1.14.5
    • Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30: 2068-2069. doi:10.1093/bioinformatics/btu153
  2. Annotate Domains in a Genome
    • Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389-3402. doi:10.1093/nar/25.17.3389
    • Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10: 421. doi:10.1186/1471-2105-10-421
    • Eddy SR. Accelerated Profile HMM Searches. PLOS Computational Biology. 2011;7: e1002195. doi:10.1371/journal.pcbi.1002195
    • Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44: D279-D285. doi:10.1093/nar/gkv1344
    • Haft DH, Selengut JD, Richter RA, Harkins D, Basu MK, Beck E. TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res. 2013;41: D387-D395. doi:10.1093/nar/gks1234
    • Letunic I, Bork P. 20 years of the SMART protein domain annotation resource. Nucleic Acids Res. 2018;46: D493-D496. doi:10.1093/nar/gkx922
    • Letunic I, Doerks T, Bork P. SMART: recent updates, new developments and status in 2015. Nucleic Acids Res. 2015;43: D257-260. doi:10.1093/nar/gku949
    • Marchler-Bauer A, Bo Y, Han L, He J, Lanczycki CJ, Lu S, et al. CDD/SPARCLE: functional classification of proteins via subfamily domain architectures. Nucleic Acids Res. 2017;45: D200-D203. doi:10.1093/nar/gkw1129
    • Selengut JD, Haft DH, Davidsen T, Ganapathy A, Gwinn-Giglio M, Nelson WC, et al. TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes. Nucleic Acids Res. 2007;35: D260-264. doi:10.1093/nar/gkl1043
    • Tatusov RL, Koonin EV, Lipman DJ. A Genomic Perspective on Protein Families. Science. 1997;278: 631-637. doi:10.1126/science.278.5338.631
  3. Annotate Domains in a GenomeSet
    • Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389–3402. doi:10.1093/nar/25.17.3389
    • Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10: 421. doi:10.1186/1471-2105-10-421
    • Eddy SR. Accelerated Profile HMM Searches. PLOS Computational Biology. 2011;7: e1002195. doi:10.1371/journal.pcbi.1002195
    • Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44: D279–D285. doi:10.1093/nar/gkv1344
    • Haft DH, Selengut JD, Richter RA, Harkins D, Basu MK, Beck E. TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res. 2013;41: D387–D395. doi:10.1093/nar/gks1234
    • Letunic I, Bork P. 20 years of the SMART protein domain annotation resource. Nucleic Acids Res. 2018;46: D493–D496. doi:10.1093/nar/gkx922
    • Letunic I, Doerks T, Bork P. SMART: recent updates, new developments and status in 2015. Nucleic Acids Res. 2015;43: D257–D260. doi:10.1093/nar/gku949
    • Marchler-Bauer A, Bo Y, Han L, He J, Lanczycki CJ, Lu S, et al. CDD/SPARCLE: functional classification of proteins via subfamily domain architectures. Nucleic Acids Res. 2017;45: D200–D203. doi:10.1093/nar/gkw1129
    • Selengut JD, Haft DH, Davidsen T, Ganapathy A, Gwinn-Giglio M, Nelson WC, et al. TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes. Nucleic Acids Res. 2007;35: D260–D264. doi:10.1093/nar/gkl1043
    • Tatusov RL, Koonin EV, Lipman DJ. A Genomic Perspective on Protein Families. Science. 1997;278: 631–637. doi:10.1126/science.278.5338.631
  4. Annotate Microbial Assembly with RASTtk - v1.073
    • [1] Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75
    • [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206–D214. doi:10.1093/nar/gkt1226
    • [3] Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep. 2015;5. doi:10.1038/srep08365
    • [4] Kent WJ. BLAT – The BLAST-Like Alignment Tool. Genome Res. 2002;12: 656–664. doi:10.1101/gr.229202
    • [5] Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389–3402. doi:10.1093/nar/25.17.3389
    • [6] Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25: 955–964.
    • [7] Cobucci-Ponzano B, Rossi M, Moracci M. Translational recoding in archaea. Extremophiles. 2012;16: 793–803. doi:10.1007/s00792-012-0482-8
    • [8] Meyer F, Overbeek R, Rodriguez A. FIGfams: yet another set of protein families. Nucleic Acids Res. 2009;37: 6643–6654. doi:10.1093/nar/gkp698
    • [9] van Belkum A, Sluijter M, de Groot R, Verbrugh H, Hermans PW. Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1996;34: 1176–1179.
    • [10] Croucher NJ, Vernikos GS, Parkhill J, Bentley SD. Identification, variation and transcription of pneumococcal repeat sequences. BMC Genomics. 2011;12: 120. doi:10.1186/1471-2164-12-120
    • [11] Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. doi:10.1186/1471-2105-11-119
    • [12] Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007;23: 673–679. doi:10.1093/bioinformatics/btm009
    • [13] Akhter S, Aziz RK, Edwards RA. PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies. Nucleic Acids Res. 2012;40: e126. doi:10.1093/nar/gks406
  5. Assemble Reads with SPAdes - v3.15.3
    • Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. Journal of Computational Biology. 2012;19: 455–477. doi:10.1089/cmb.2012.0021
    • Prjibelski A, Antipov D, Meleshko D, Lapidus A, Korobeynikov A. Using SPAdes De Novo Assembler. Curr Protoc Bioinformatics. 2020;70: e102. doi:10.1002/cpbi.102
  6. Assess Genome Quality with CheckM - v1.0.18
    • Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25: 1043–1055. doi:10.1101/gr.186072.114
  7. Assess Read Quality with FastQC - v0.11.9
    • FastQC source: Bioinformatics Group at the Babraham Institute, UK.
  8. Compute ANI with FastANI
    • [1] Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High-throughput ANI Analysis of 90K Prokaryotic Genomes Reveals Clear Species Boundaries. 2017; doi:10.1101/225342
    • [2] Goris J, Konstantinidis KT, Klappenbach JA, Coenye T, Vandamme P, Tiedje JM. DNA-DNA hybridization values and their relationship to whole-genome sequence similarities. Int J Syst Evol Microbiol. 2007;57: 81–91. doi:10.1099/ijs.0.64483-0
  9. GTDB-Tk Classify - v1.6.0
    • Pierre-Alain Chaumeil, Aaron J Mussig, Philip Hugenholtz, Donovan H Parks, GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database, Bioinformatics, Volume 36, Issue 6, 15 March 2020, Pages 1925–1927. DOI: https://doi.org/10.1093/bioinformatics/btz848
    • Parks, D., Chuvochina, M., Waite, D. et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 36, 996–1004 (2018). DOI: https://doi.org/10.1038/nbt.4229
    • Parks DH, Chuvochina M, Chaumeil PA, Rinke C, Mussig AJ, Hugenholtz P. A complete domain-to-species taxonomy for Bacteria and Archaea [published online ahead of print, 2020 Apr 27]. Nat Biotechnol. 2020;10.1038/s41587-020-0501-8. DOI:10.1038/s41587-020-0501-8
    • Matsen FA, Kodner RB, Armbrust EV. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics. 2010;11:538. Published 2010 Oct 30. doi:10.1186/1471-2105-11-538
    • Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018;9(1):5114. Published 2018 Nov 30. DOI:10.1038/s41467-018-07641-9
    • Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119. Published 2010 Mar 8. DOI:10.1186/1471-2105-11-119
    • Price MN, Dehal PS, Arkin AP. FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5(3):e9490. Published 2010 Mar 10. DOI:10.1371/journal.pone.0009490 link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2835736/
    • Eddy SR. Accelerated Profile HMM Searches. PLoS Comput Biol. 2011;7(10):e1002195. DOI:10.1371/journal.pcbi.1002195
  10. Import FASTQ/SRA File as Reads from Staging Area
    no citations
  11. Insert Genome Into SpeciesTree - v2.2.0
    • Price MN, Dehal PS, Arkin AP. FastTree 2 – Approximately Maximum-Likelihood Trees for Large Alignments. PLoS One. 2010;5: e9490. doi:10.1371/journal.pone.0009490