Methods
Computational Pipeline
Sequencing
Libraries were sequenced on an Illumina NextSeq, producing 2 × 150 bp paired-end reads. Each sample yielded 2,071,301 ± 409,888 reads, excluding one failed sample with fewer than 2,000 reads.
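A per-sample read summary of this kind can be reproduced with a few lines of Python. The sketch below assumes the ± value is a sample standard deviation and uses the 2,000-read cutoff mentioned above to exclude failed samples; the isolate names and counts are made up for illustration and are not the study's data.

```python
from statistics import mean, stdev

MIN_READS = 2000  # cutoff taken from the text: samples below this count as failed


def summarize_read_counts(counts, min_reads=MIN_READS):
    """Return (mean, sample SD, number of excluded samples) for per-sample read counts."""
    kept = [n for n in counts.values() if n >= min_reads]
    n_failed = len(counts) - len(kept)
    return mean(kept), stdev(kept), n_failed


# Hypothetical counts, for illustration only (not the study's data).
example_counts = {
    "isolate_A": 2_100_000,
    "isolate_B": 1_700_000,
    "isolate_C": 2_500_000,
    "isolate_D": 1_500,  # below the cutoff, so excluded from the summary
}

avg, sd, n_failed = summarize_read_counts(example_counts)
print(f"{avg:,.0f} ± {sd:,.0f} reads ({n_failed} failed sample(s) excluded)")
```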
The raw reads for each isolate were uploaded to this Narrative.
The PairedEndLibrary objects are named ISOLATE_NAME-alm-2017-05-19.reads, with ISOLATE_NAME standing for the name of the isolate.
Adapter removal with Cutadapt
Adapter sequences were removed with Cutadapt v1.12, using the parameters "-a CTGTCTCTTAT -A CTGTCTCTTAT" (Martin, 2011).
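As an illustration of how this step might be invoked for one isolate, the following sketch assembles a Cutadapt command line in Python. Only the two adapter options come from the text; the file names are hypothetical, and the use of -o and -p to name the paired outputs is an assumption about how the step was run.

```python
import shlex

ADAPTER = "CTGTCTCTTAT"  # adapter sequence from the text, used for both reads


def cutadapt_command(r1, r2, out1, out2):
    """Build a paired-end Cutadapt command as an argument list."""
    return [
        "cutadapt",
        "-a", ADAPTER,  # 3' adapter on read 1
        "-A", ADAPTER,  # 3' adapter on read 2
        "-o", out1,     # trimmed read 1 output (assumed option usage)
        "-p", out2,     # trimmed read 2 output (assumed option usage)
        r1, r2,
    ]


# Hypothetical file names, for illustration only.
cutadapt_cmd = cutadapt_command(
    "isolate_R1.fastq.gz", "isolate_R2.fastq.gz",
    "isolate_R1.cut.fastq.gz", "isolate_R2.cut.fastq.gz",
)
print(shlex.join(cutadapt_cmd))
```

The list can be passed to subprocess.run(cutadapt_cmd, check=True) on a machine where Cutadapt is installed.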
Read Trimming with Trimmomatic
The Illumina sequencing reads were trimmed using Trimmomatic v0.36, with parameters "-phred33 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36 ILLUMINACLIP:TruSeq3-PE.fa" (Bolger et al., 2014).
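A minimal sketch of a paired-end Trimmomatic call with these settings is given below. The trimming steps are copied verbatim and in the order reported; Trimmomatic applies steps in the order they appear on the command line. The jar path and all file names are hypothetical. ILLUMINACLIP normally takes further colon-separated thresholds after the adapter file; these are not reported in the text and are deliberately left out here, so they would have to be supplied before the command could run.

```python
import shlex

# Trimming steps exactly as reported in the text, in the reported order.
TRIM_STEPS = [
    "LEADING:3",
    "TRAILING:3",
    "SLIDINGWINDOW:4:15",
    "MINLEN:36",
    "ILLUMINACLIP:TruSeq3-PE.fa",  # clip thresholds not reported; not filled in
]


def trimmomatic_command(r1, r2, prefix, jar="trimmomatic-0.36.jar"):
    """Build a paired-end (PE) Trimmomatic command as an argument list."""
    outputs = [
        f"{prefix}_R1.paired.fastq.gz", f"{prefix}_R1.unpaired.fastq.gz",
        f"{prefix}_R2.paired.fastq.gz", f"{prefix}_R2.unpaired.fastq.gz",
    ]
    return ["java", "-jar", jar, "PE", "-phred33", r1, r2, *outputs, *TRIM_STEPS]


# Hypothetical file names, for illustration only.
trim_cmd = trimmomatic_command(
    "isolate_R1.cut.fastq.gz", "isolate_R2.cut.fastq.gz", "isolate"
)
print(shlex.join(trim_cmd))
```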
Assembly with SPAdes
The trimmed reads were assembled de novo using SPAdes v3.9.0 with parameters "-k 21,33,55,77" (Bankevich et al., 2012).
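The assembly step could be launched along the following lines. Only the k-mer list comes from the text; the input file names and output directory are hypothetical, and passing only the paired reads (without the unpaired reads left over from trimming) is an assumption.

```python
import shlex

KMER_SIZES = "21,33,55,77"  # k-mer sizes from the text


def spades_command(r1, r2, outdir):
    """Build a SPAdes command for one paired-end library as an argument list."""
    return [
        "spades.py",
        "-1", r1,          # forward reads
        "-2", r2,          # reverse reads
        "-k", KMER_SIZES,  # comma-separated k-mer sizes
        "-o", outdir,      # output directory (hypothetical name below)
    ]


# Hypothetical file names, for illustration only.
spades_cmd = spades_command(
    "isolate_R1.paired.fastq.gz", "isolate_R2.paired.fastq.gz", "isolate_spades"
)
print(shlex.join(spades_cmd))
```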
The contigs for each isolate were uploaded to this Narrative.
Annotation with Prokka
Genes were identified using Prokka v1.12 with default parameters (Seemann, 2014).
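For completeness, a default-parameter Prokka run on one assembly might be set up as sketched here. The --outdir and --prefix options only name the output and do not change the annotation settings; the directory, prefix, and contig file name are hypothetical.

```python
import shlex


def prokka_command(contigs, outdir, prefix):
    """Build a Prokka command that keeps all annotation parameters at their defaults."""
    return [
        "prokka",
        "--outdir", outdir,  # where Prokka writes its results
        "--prefix", prefix,  # base name for the output files
        contigs,             # assembled contigs in FASTA format
    ]


# Hypothetical names, for illustration only.
prokka_cmd = prokka_command("isolate_contigs.fasta", "isolate_prokka", "isolate")
print(shlex.join(prokka_cmd))
```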
The annotated genome for each isolate was uploaded to this Narrative.
GTDB-Tk classification
GTDB-Tk was run within KBase to obtain a taxonomic assignment for each genome; see the information and reference in the app cell below.