Ecogenomics of groundwater viruses suggests niche differentiation linked to specific environmental tolerance

Ankita Kothari, Simon Roux, Hanqiao Zhang, Anatori Prieto, John-Marc Chandonia, Sarah Spencer, Xiaoqin Wu, Adam M. Deutschbauer, Adam P. Arkin, Eric J. Alm, Romy Chakraborty, Aindrila Mukhopadhyay

Submitted to mBio

Methods

Computational Pipeline

Sequencing

Libraries were sequenced on an Illumina NextSeq producing 2x150 bp paired-end reads. Each sample contained 2,071,301 ± 409,888 reads, excluding one failed sample with < 2,000 reads.
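A per-sample summary of this kind could be computed along the following lines. This is a minimal sketch: the read counts and isolate names are hypothetical placeholders, the 2,000-read cutoff mirrors the failed-sample threshold above, and treating the ± value as a sample standard deviation is an assumption, since the text does not state which dispersion measure was used.

```python
from statistics import mean, stdev

# Hypothetical per-sample read counts; these are NOT the study's data.
read_counts = {
    "isolate_A": 2_100_000,
    "isolate_B": 1_650_000,
    "isolate_C": 2_450_000,
    "isolate_D": 1_200,  # stands in for a failed library
}

FAILED_THRESHOLD = 2_000  # samples below this are excluded, as in the text


def summarize(counts, threshold=FAILED_THRESHOLD):
    """Return (mean, sample SD, excluded sample names) for passing samples."""
    passing = [n for n in counts.values() if n >= threshold]
    excluded = [name for name, n in counts.items() if n < threshold]
    return mean(passing), stdev(passing), excluded


avg, sd, excluded = summarize(read_counts)
print(f"{avg:,.0f} +/- {sd:,.0f} reads (excluded: {', '.join(excluded)})")
```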

The raw reads for each isolate were uploaded to this Narrative.
The PairedEndLibrary objects are named ISOLATE_NAME-alm-2017-05-19.reads, where ISOLATE_NAME is the name of the isolate.

Adapter removal with Cutadapt

The program Cutadapt v1.12 was used to remove adapter sequences with parameters -a CTGTCTCTTAT -A CTGTCTCTTAT (Martin, 2011).
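As a rough illustration, the adapter-removal step could be expressed as a command line built in Python. The adapter sequence and the -a/-A options come from the text above; -o and -p are Cutadapt's standard output options for read 1 and read 2. The function name and file names are hypothetical, and the sketch only builds the command without running it.

```python
import shlex

ADAPTER = "CTGTCTCTTAT"  # adapter sequence given in the text for both reads


def cutadapt_command(r1, r2, out_r1, out_r2):
    """Build a paired-end Cutadapt argument list (nothing is executed here)."""
    return [
        "cutadapt",
        "-a", ADAPTER,  # 3' adapter to remove from read 1
        "-A", ADAPTER,  # 3' adapter to remove from read 2
        "-o", out_r1,   # trimmed read 1 output
        "-p", out_r2,   # trimmed read 2 output
        r1, r2,
    ]


# Hypothetical file names for one isolate.
cmd = cutadapt_command("iso_R1.fastq.gz", "iso_R2.fastq.gz",
                       "iso_R1.cut.fastq.gz", "iso_R2.cut.fastq.gz")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a system where Cutadapt is installed.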

Read Trimming with Trimmomatic

The Illumina sequencing reads were trimmed using Trimmomatic 0.36, with parameters "-phred33 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36 ILLUMINACLIP:TruSeq3-PE.fa" (Bolger et al., 2014).
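To make the effect of the quality steps concrete, the sketch below applies LEADING:3, TRAILING:3, SLIDINGWINDOW:4:15 and MINLEN:36 to a single read, in the order listed above. It is a simplified illustration and not Trimmomatic's own implementation: the function names are hypothetical, the sliding-window rule is reduced to "cut at the first window whose mean quality is below the threshold", and the ILLUMINACLIP adapter step is omitted.

```python
def phred33(qual_string):
    """Decode a Phred+33 quality string into integer scores."""
    return [ord(ch) - 33 for ch in qual_string]


def trim_read(seq, qual_string, leading=3, trailing=3,
              window=4, window_q=15, minlen=36):
    """Return the trimmed (seq, qual) pair, or None if the read is dropped."""
    q = phred33(qual_string)
    start, end = 0, len(q)

    # LEADING: drop bases from the 5' end while quality is below the threshold.
    while start < end and q[start] < leading:
        start += 1
    # TRAILING: drop bases from the 3' end while quality is below the threshold.
    while end > start and q[end - 1] < trailing:
        end -= 1
    # SLIDINGWINDOW (simplified): scan 5'->3' and cut at the first window
    # whose mean quality falls below window_q.
    for i in range(start, end - window + 1):
        if sum(q[i:i + window]) / window < window_q:
            end = i
            break
    # MINLEN: discard reads that end up shorter than minlen.
    if end - start < minlen:
        return None
    return seq[start:end], qual_string[start:end]


# Hypothetical 50-base read: 40 bases at Q40 ('I') followed by 10 at Q2 ('#').
seq = "ACGT" * 12 + "AC"
trimmed = trim_read(seq, "I" * 40 + "#" * 10)
print(len(trimmed[0]))
```

In this example the ten low-quality bases at the 3' end are removed by the TRAILING step, and the remaining 40 bases pass the MINLEN filter.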

Assembly with SPAdes

The trimmed reads were assembled de novo using SPAdes v3.9.0 with parameters "-k 21,33,55,77" (Bankevich et al., 2012).
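A per-isolate assembly command might be assembled as follows. The -k value is taken from the text; -1, -2 and -o are the standard SPAdes options for paired-end inputs and the output directory. The isolate names and directory layout are hypothetical, and the commands are only constructed, not executed.

```python
KMERS = "21,33,55,77"  # k-mer sizes given in the text


def spades_command(r1, r2, outdir):
    """Build a paired-end SPAdes argument list (nothing is executed here)."""
    return ["spades.py", "-k", KMERS, "-1", r1, "-2", r2, "-o", outdir]


# Hypothetical isolate names and directory layout.
isolates = ["iso01", "iso02"]
commands = {
    name: spades_command(f"trimmed/{name}_R1.fastq.gz",
                         f"trimmed/{name}_R2.fastq.gz",
                         f"assemblies/{name}")
    for name in isolates
}
for name, cmd in commands.items():
    print(name, " ".join(cmd))
```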

The contigs for each isolate were uploaded to this Narrative.

Annotation with Prokka

Genes were identified using Prokka v1.12, with default parameters (Seemann, 2014).
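Because default parameters were used, a Prokka invocation needs little beyond the input contigs. The sketch below adds only --outdir and --prefix, which are standard Prokka options for naming the output; the function name, paths, and prefix are hypothetical, and the command is built but not run.

```python
def prokka_command(contigs, outdir, prefix):
    """Build a Prokka argument list with otherwise default settings."""
    return ["prokka", "--outdir", outdir, "--prefix", prefix, contigs]


# Hypothetical input and output locations for one isolate.
cmd = prokka_command("assemblies/iso01/contigs.fasta", "annotation/iso01", "iso01")
print(" ".join(cmd))
```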

The annotated genome for each isolate was uploaded to this Narrative.

GTDB-Tk classification

GTDB-Tk was run within KBase to obtain a taxonomic assignment for each genome; see the information and reference in the app cell below.

References

Martin, M. (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17(1), 10-12. doi:10.14806/ej.17.1.200
Bolger, A.M., Lohse, M., and Usadel, B. (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30(15), 2114-2120.
Bankevich, A., Nurk, S., Antipov, D., Gurevich, A.A., Dvorkin, M., Kulikov, A.S., et al. (2012). SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. Journal of Computational Biology 19(5), 455-477.
Seemann, T. (2014). Prokka: rapid prokaryotic genome annotation. Bioinformatics 30(14), 2068-2069.