DOI: https://doi.org/10.1101/2020.04.27.058388
This KBase narrative contains RNA-Seq data processing for Adler et al., "Systematic Discovery of Salmonella Phage-Host Interactions via High-Throughput Genome-Wide Screens", 2020. The goal of this experiment was to identify shared transcription-level differences among the phage cross-resistant mutants ∆trkH, ∆sapB, ∆rpoN, and ∆himA relative to wild-type S. typhimurium MS1868. In this narrative, we processed RNA-Seq reads into differential expression datasets. We began from pre-loaded RNA-Seq reads (PairedEndLibrary objects) and produced the datasets available as Supplementary Datasets 5 and 6 (doi.org/10.6084/m9.figshare.12185031).
Before the narrative begins, upload RNA-Seq datasets to KBase as PairedEndLibrary Objects. Sequencing results are the product of HiSeq4000 sequencing using 100PE runs (see Methods).
1. Upload a suitable, annotated reference genome to serve as the basis for RNA-Seq alignment (S. typhimurium LT2 genome and PSLT plasmids). These are based on RefSeq accession numbers NC_003197.2 and NC_003277.2, respectively.
2. Trim RNA-Seq reads with Trimmomatic for each PairedEndLibrary. This step removes technical sequences, such as the indexes used in Illumina sequencing, and removes reads of insufficient quality.
3. Establish the RNA-Seq Sample Set. This step groups PairedEndLibraries by experimental group (see Sample Nomenclature). From here on, HISAT2 alignment and DESeq2 differential expression can be run on this single object, which simplifies the process.
4. Align trimmed RNA-Seq reads to the LT2 + PSLT reference genome using HISAT2. This step provides position-level coverage for each sample.
5. Assemble transcripts from the HISAT2 alignments using StringTie. This step provides gene-annotation-level coverage for each sample and creates the output seen in Supplementary Dataset 5. Because sample processing as described in Methods yielded fragments larger than most sRNA transcripts, sRNA abundances are largely depleted.
6. Perform differential expression calculations using DESeq2. This step calculates gene-annotation expression-level differences by condition. Of note, this step calculates the all-by-all differential expression matrix, but only differential expression against wild-type was used. This creates the output seen in Supplementary Dataset 6.
Biological triplicate RNA-Seq experiments of:
References:
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Lucchini, S., McDermott, P., Thompson, A. & Hinton, J. C. D. The H-NS-like protein StpA represses the RpoS (sigma 38) regulon during exponential growth of Salmonella Typhimurium. Mol. Microbiol. 74, 1169–1186 (2009).
Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).
Uploaded a suitable, annotated reference genome to serve as the basis for RNA-Seq alignment (S. typhimurium LT2 genome and PSLT plasmids). These are based on RefSeq accession numbers NC_003197.2 and NC_003277.2, respectively. Of note, a key difference between the reference genome NC_003197.2 and the strain used, S. typhimurium MS1868, is that MS1868 lacks the prophage Fels-2. As such, we expect a drop in alignment quality across nucleotides 2844431-2879237.
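When inspecting alignments or per-gene coverage later on, it can help to flag positions that fall inside the prophage interval quoted above. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive on NC_003197.2; the function names are illustrative and not part of KBase.

```python
# Interval quoted in the text above; assumed to be 1-based, inclusive
# coordinates on the LT2 chromosome (NC_003197.2).
FELS2_START = 2_844_431
FELS2_END = 2_879_237


def in_fels2_region(position: int) -> bool:
    """Return True if a 1-based chromosome position lies in the Fels-2 interval."""
    return FELS2_START <= position <= FELS2_END


def fels2_length() -> int:
    """Length of the interval in nucleotides, counting both end points."""
    return FELS2_END - FELS2_START + 1


if __name__ == "__main__":
    print(fels2_length())              # 34807
    print(in_fels2_region(2_850_000))  # True
```

Genes or reads overlapping this interval could then be excluded or annotated before interpreting low coverage as low expression.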
Genomes that will be used as a reference during HISAT2 Alignment (Step 4) and onwards.
Created Object Name | Type | Description |
---|---|---|
LT2_gff3.gff_genome | Genome | Imported Genome |
Trim RNA-Seq reads with Trimmomatic for each PairedEndLibrary. This step removes technical sequences, such as the indexes used in Illumina sequencing, and removes reads of insufficient quality. TruSeq3-PE adapters are removed. Because the step runs once per PairedEndLibrary, it was performed 14 times, once for each library.
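In this narrative the trimming was run through the KBase Trimmomatic app, so no command line was recorded. For readers reproducing the step outside KBase, a rough stand-alone equivalent might look like the sketch below. The FASTQ file names, the jar path, and the `ILLUMINACLIP` thresholds (`2:30:10`, the values used in the Trimmomatic manual's example) are assumptions, not settings taken from this narrative.

```python
from typing import List

# The 14 libraries: triplicates for groups A-D, duplicates for group E.
LIBRARIES = [
    f"BA_{group}{rep}"
    for group, n_reps in (("A", 3), ("B", 3), ("C", 3), ("D", 3), ("E", 2))
    for rep in range(1, n_reps + 1)
]


def trimmomatic_pe_command(sample: str, adapters: str = "TruSeq3-PE.fa") -> List[str]:
    """Build a Trimmomatic paired-end command for one library.

    Argument order follows Trimmomatic PE mode: two inputs, then
    paired/unpaired outputs for the forward and reverse reads.
    All file names below are hypothetical.
    """
    return [
        "java", "-jar", "trimmomatic.jar", "PE",
        f"{sample}_R1.fastq.gz",
        f"{sample}_R2.fastq.gz",
        f"{sample}_trimmed_paired_fwd.fastq.gz",
        f"{sample}_trimmed_unpaired_fwd.fastq.gz",
        f"{sample}_trimmed_paired_rev.fastq.gz",
        f"{sample}_trimmed_unpaired_rev.fastq.gz",
        # Adapter clipping; thresholds are the manual's example values (assumed).
        f"ILLUMINACLIP:{adapters}:2:30:10",
    ]


if __name__ == "__main__":
    for library in LIBRARIES:
        print(" ".join(trimmomatic_pe_command(library)))
```

Each command could be passed to `subprocess.run` once the paths point at real files.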
Each report refers to a single sequenced sample. This gives insight into how many reads were removed from further processing. Given that most reads were retained, no action was needed.
Created Object Name | Type | Description |
---|---|---|
BA_A1_trimmed_paired | PairedEndLibrary | Trimmed Reads |
BA_A1_trimmed_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
BA_A1_trimmed_unpaired_rev | SingleEndLibrary | Trimmed Unpaired Reverse Reads |
Created Object Name | Type | Description |
---|---|---|
BA_A2_trimmed_paired | PairedEndLibrary | Trimmed Reads |
BA_A2_trimmed_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
BA_A2_trimmed_unpaired_rev | SingleEndLibrary | Trimmed Unpaired Reverse Reads |
Created Object Name | Type | Description |
---|---|---|
BA_A3_trimmed_paired | PairedEndLibrary | Trimmed Reads |
BA_A3_trimmed_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
BA_A3_trimmed_unpaired_rev | SingleEndLibrary | Trimmed Unpaired Reverse Reads |
Created Object Name | Type | Description |
---|---|---|
BA_B1_trimmed_paired | PairedEndLibrary | Trimmed Reads |
BA_B1_trimmed_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
BA_B1_trimmed_unpaired_rev | SingleEndLibrary | Trimmed Unpaired Reverse Reads |
Created Object Name | Type | Description |
---|---|---|
BA_B2_trimmed_paired | PairedEndLibrary | Trimmed Reads |
BA_B2_trimmed_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
BA_B2_trimmed_unpaired_rev | SingleEndLibrary | Trimmed Unpaired Reverse Reads |
Created Object Name | Type | Description |
---|---|---|
BA_B3_trimmed_paired | PairedEndLibrary | Trimmed Reads |
BA_B3_trimmed_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
BA_B3_trimmed_unpaired_rev | SingleEndLibrary | Trimmed Unpaired Reverse Reads |
Created Object Name | Type | Description |
---|---|---|
BA_C1_trimmed_paired | PairedEndLibrary | Trimmed Reads |
BA_C1_trimmed_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
BA_C1_trimmed_unpaired_rev | SingleEndLibrary | Trimmed Unpaired Reverse Reads |
Created Object Name | Type | Description |
---|---|---|
BA_C2_trimmed_paired | PairedEndLibrary | Trimmed Reads |
BA_C2_trimmed_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
BA_C2_trimmed_unpaired_rev | SingleEndLibrary | Trimmed Unpaired Reverse Reads |
Created Object Name | Type | Description |
---|---|---|
BA_C3_trimmed_paired | PairedEndLibrary | Trimmed Reads |
BA_C3_trimmed_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
BA_C3_trimmed_unpaired_rev | SingleEndLibrary | Trimmed Unpaired Reverse Reads |
Created Object Name | Type | Description |
---|---|---|
BA_D1_trimmed_paired | PairedEndLibrary | Trimmed Reads |
BA_D1_trimmed_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
BA_D1_trimmed_unpaired_rev | SingleEndLibrary | Trimmed Unpaired Reverse Reads |
Created Object Name | Type | Description |
---|---|---|
BA_D2_trimmed_paired | PairedEndLibrary | Trimmed Reads |
BA_D2_trimmed_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
BA_D2_trimmed_unpaired_rev | SingleEndLibrary | Trimmed Unpaired Reverse Reads |
Created Object Name | Type | Description |
---|---|---|
BA_D3_TRIMMED_paired | PairedEndLibrary | Trimmed Reads |
BA_D3_TRIMMED_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
BA_D3_TRIMMED_unpaired_rev | SingleEndLibrary | Trimmed Unpaired Reverse Reads |
Created Object Name | Type | Description |
---|---|---|
BA_E1_trimmed_paired | PairedEndLibrary | Trimmed Reads |
BA_E1_trimmed_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
BA_E1_trimmed_unpaired_rev | SingleEndLibrary | Trimmed Unpaired Reverse Reads |
Created Object Name | Type | Description |
---|---|---|
BA_E2_trimmed_paired | PairedEndLibrary | Trimmed Reads |
BA_E2_trimmed_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
BA_E2_trimmed_unpaired_rev | SingleEndLibrary | Trimmed Unpaired Reverse Reads |
Establish the RNA-Seq Sample Set. This step groups PairedEndLibraries by experimental group (see Sample Nomenclature). From here on, HISAT2 alignment and DESeq2 differential expression can be run on this single object, which simplifies the process.
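The grouping itself was done in the KBase sample-set app. As an illustration of the same idea, the sketch below derives the condition of each library from the letter in its object name. The letter-to-genotype mapping follows the sample labels used in this narrative (A = wild-type MS1868, B = ∆sapB, C = ∆trkH, D = ∆rpoN, E = ∆himA); the function name is hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Sample letter -> condition, as used in this narrative's sample labels.
CONDITION_BY_LETTER = {
    "A": "wild-type MS1868",
    "B": "∆sapB",
    "C": "∆trkH",
    "D": "∆rpoN",
    "E": "∆himA",
}


def group_by_condition(object_names: Iterable[str]) -> Dict[str, List[str]]:
    """Group library object names (e.g. 'BA_B2_trimmed_paired') by condition."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in object_names:
        match = re.match(r"^BA_([A-E])(\d+)_", name)
        if match is None:
            raise ValueError(f"Unrecognized library name: {name}")
        groups[CONDITION_BY_LETTER[match.group(1)]].append(name)
    return dict(groups)


if __name__ == "__main__":
    names = ["BA_A1_trimmed_paired", "BA_B2_trimmed_paired", "BA_D3_TRIMMED_paired"]
    for condition, members in group_by_condition(names).items():
        print(condition, members)
```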
Align trimmed RNA-Seq reads to the LT2 + PSLT reference genome using HISAT2. This step provides position-level coverage for each sample.
Created Object Name | Type | Description |
---|---|---|
BA_B2_trimmed_paired_alignment_trimmed | RNASeqAlignment | Reads 48675/79/1;48675/31/1 aligned to Genome 48675/69/1 |
BA_E1_trimmed_paired_alignment_trimmed | RNASeqAlignment | Reads 48675/79/1;48675/73/2 aligned to Genome 48675/69/1 |
BA_A2_trimmed_paired_alignment_trimmed | RNASeqAlignment | Reads 48675/79/1;48675/16/1 aligned to Genome 48675/69/1 |
BA_A3_trimmed_paired_alignment_trimmed | RNASeqAlignment | Reads 48675/79/1;48675/30/1 aligned to Genome 48675/69/1 |
BA_C1_trimmed_paired_alignment_trimmed | RNASeqAlignment | Reads 48675/79/1;48675/28/1 aligned to Genome 48675/69/1 |
BA_D1_trimmed_paired_alignment_trimmed | RNASeqAlignment | Reads 48675/79/1;48675/50/1 aligned to Genome 48675/69/1 |
BA_C2_trimmed_paired_alignment_trimmed | RNASeqAlignment | Reads 48675/79/1;48675/44/1 aligned to Genome 48675/69/1 |
BA_B1_trimmed_paired_alignment_trimmed | RNASeqAlignment | Reads 48675/79/1;48675/25/1 aligned to Genome 48675/69/1 |
BA_D3_TRIMMED_paired_alignment_trimmed | RNASeqAlignment | Reads 48675/79/1;48675/57/1 aligned to Genome 48675/69/1 |
BA_C3_trimmed_paired_alignment_trimmed | RNASeqAlignment | Reads 48675/79/1;48675/49/1 aligned to Genome 48675/69/1 |
BA_A1_trimmed_paired_alignment_trimmed | RNASeqAlignment | Reads 48675/79/1;48675/20/1 aligned to Genome 48675/69/1 |
BA_B3_trimmed_paired_alignment_trimmed | RNASeqAlignment | Reads 48675/79/1;48675/24/2 aligned to Genome 48675/69/1 |
BA_D2_trimmed_paired_alignment_trimmed | RNASeqAlignment | Reads 48675/79/1;48675/66/2 aligned to Genome 48675/69/1 |
BA_E2_trimmed_paired_alignment_trimmed | RNASeqAlignment | Reads 48675/79/1;48675/62/1 aligned to Genome 48675/69/1 |
SE_reprocessed_trimmed_alignment_set_trimmed | ReadsAlignmentSet | Set of all new alignments |
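The alignments listed above were produced by the KBase HISAT2 app. Outside KBase, the same step is usually two commands: build an index from the reference, then align each paired library against it. The sketch below only assembles those commands; the file names and index prefix are hypothetical, and no HISAT2 options beyond the basic input/output flags are implied for this narrative.

```python
from typing import List, Tuple


def hisat2_commands(
    sample: str,
    reference_fasta: str = "LT2_PSLT.fasta",  # hypothetical reference file name
    index_prefix: str = "LT2_PSLT_index",     # hypothetical index prefix
) -> Tuple[List[str], List[str]]:
    """Return (index-build command, alignment command) for one trimmed library."""
    build = ["hisat2-build", reference_fasta, index_prefix]
    align = [
        "hisat2",
        "-x", index_prefix,                             # index to align against
        "-1", f"{sample}_trimmed_paired_fwd.fastq.gz",  # forward mates
        "-2", f"{sample}_trimmed_paired_rev.fastq.gz",  # reverse mates
        "-S", f"{sample}_alignment.sam",                # SAM output
    ]
    return build, align


if __name__ == "__main__":
    build_cmd, align_cmd = hisat2_commands("BA_A1")
    print(" ".join(build_cmd))
    print(" ".join(align_cmd))
```

The index only needs to be built once; the alignment command is then repeated for each of the 14 libraries.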
Assemble transcripts from the HISAT2 alignments and S. typhimurium LT2 annotations using StringTie. This step provides gene-annotation-level coverage for each sample and creates the output seen in Supplementary Dataset 5. Because sample processing as described in Methods yielded fragments larger than most sRNA transcripts, sRNA abundances are largely depleted, although sufficiently long sRNAs could still be mapped.
Qualitative analysis can be performed with the interactive heatmap below (TPM). For instance, in the ∆sapB experiments (samples B1, B2, B3), sapB (STM1693) expression is lower than in wild-type MS1868 (samples A1, A2, A3), as expected. The same holds for trkH (STM3986) in samples C1, C2, C3; rpoN (STM3320) in samples D1, D2, D3; and himA (STM1339) in samples E1, E2. For differential expression analysis, use the normalization and hypothesis testing provided by the DESeq2 output.
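For readers unfamiliar with the TPM unit used in the heatmap, the sketch below shows the standard definition: each gene's read count is divided by its length, and the resulting rates are rescaled to sum to one million within a sample. It also shows the usual conversion from FPKM to TPM. StringTie derives its values from coverage rather than from raw counts, so this is a simplified illustration, and the counts and lengths in the usage example are made up.

```python
from typing import List, Sequence


def tpm(counts: Sequence[float], lengths_bp: Sequence[float]) -> List[float]:
    """Transcripts per million from per-gene read counts and gene lengths."""
    rates = [count / length for count, length in zip(counts, lengths_bp)]
    total = sum(rates)
    return [rate / total * 1e6 for rate in rates]


def fpkm_to_tpm(fpkm: Sequence[float]) -> List[float]:
    """Rescale FPKM values so that they sum to one million within a sample."""
    total = sum(fpkm)
    return [value / total * 1e6 for value in fpkm]


if __name__ == "__main__":
    # Hypothetical counts and lengths for three genes in one sample.
    genes = ["STM1693", "STM3986", "STM3320"]
    values = tpm(counts=[10, 200, 50], lengths_bp=[900, 1500, 1400])
    for gene, value in zip(genes, values):
        print(gene, round(value, 1))
```

Because TPM values sum to the same total in every sample, they are convenient for the qualitative, within-sample comparisons described above, but they do not replace DESeq2's normalization for hypothesis testing.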
Created Object Name | Type | Description |
---|---|---|
SE_reprocessed_trimmed_expression_set_trimmed_trimmed | ExpressionSet | ExpressionSet generated by StringTie |
BA_A1_trimmed_paired_expression_trimmed_trimmed | RNASeqExpression | Expression generated by StringTie |
BA_A2_trimmed_paired_expression_trimmed_trimmed | RNASeqExpression | Expression generated by StringTie |
BA_A3_trimmed_paired_expression_trimmed_trimmed | RNASeqExpression | Expression generated by StringTie |
BA_B1_trimmed_paired_expression_trimmed_trimmed | RNASeqExpression | Expression generated by StringTie |
BA_B2_trimmed_paired_expression_trimmed_trimmed | RNASeqExpression | Expression generated by StringTie |
BA_B3_trimmed_paired_expression_trimmed_trimmed | RNASeqExpression | Expression generated by StringTie |
BA_C1_trimmed_paired_expression_trimmed_trimmed | RNASeqExpression | Expression generated by StringTie |
BA_C2_trimmed_paired_expression_trimmed_trimmed | RNASeqExpression | Expression generated by StringTie |
BA_C3_trimmed_paired_expression_trimmed_trimmed | RNASeqExpression | Expression generated by StringTie |
BA_D1_trimmed_paired_expression_trimmed_trimmed | RNASeqExpression | Expression generated by StringTie |
BA_D2_trimmed_paired_expression_trimmed_trimmed | RNASeqExpression | Expression generated by StringTie |
BA_D3_TRIMMED_paired_expression_trimmed_trimmed | RNASeqExpression | Expression generated by StringTie |
BA_E1_trimmed_paired_expression_trimmed_trimmed | RNASeqExpression | Expression generated by StringTie |
BA_E2_trimmed_paired_expression_trimmed_trimmed | RNASeqExpression | Expression generated by StringTie |
SE_reprocessed_trimmed_trimmed_trimmed_FPKM_ExpressionMatrix | ExpressionMatrix | FPKM ExpressionMatrix generated by StringTie |
SE_reprocessed_trimmed_trimmed_trimmed_TPM_ExpressionMatrix | ExpressionMatrix | TPM ExpressionMatrix generated by StringTie |
from biokbase.narrative.jobs.appmanager import AppManager
AppManager().run_local_app(
    "NarrativeViewers/view_expression_interactive_heatmap",
    {
        "param0": "SE_reprocessed_trimmed_trimmed_trimmed_TPM_ExpressionMatrix"
    },
    tag="release",
    version="1.0.7",
    cell_id="b6167e7e-f7a8-46f5-809b-6336b6e4e57e",
    run_id="ed274591-87ec-4e0c-8ab7-45d7ee23acbd"
)
Perform differential expression calculations using DESeq2. This step calculates gene-annotation expression-level differences by condition. Of note, this step calculates the all-by-all differential expression matrix, but only differential expression against wild-type was used. This creates the output seen in Supplementary Dataset 6.
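To make the distinction between the all-by-all matrix and the comparisons actually used concrete, the sketch below enumerates every pairwise contrast among the five conditions and then keeps only those against wild type. The short condition labels are illustrative, not KBase object names, and the DESeq2 statistics themselves are not reproduced here.

```python
from itertools import combinations
from typing import List, Tuple

# Illustrative short labels for the five conditions in this narrative.
WILD_TYPE = "WT"
CONDITIONS = [WILD_TYPE, "sapB", "trkH", "rpoN", "himA"]


def all_pairwise_contrasts(conditions: List[str]) -> List[Tuple[str, str]]:
    """Every unordered pair of conditions (the all-by-all comparison)."""
    return list(combinations(conditions, 2))


def contrasts_vs_wild_type(conditions: List[str]) -> List[Tuple[str, str]]:
    """Only the (mutant, wild-type) pairs, i.e. the comparisons that were used."""
    return [(c, WILD_TYPE) for c in conditions if c != WILD_TYPE]


if __name__ == "__main__":
    print(len(all_pairwise_contrasts(CONDITIONS)))  # 10
    print(contrasts_vs_wild_type(CONDITIONS))
```

With five conditions there are ten pairwise comparisons, of which four involve wild type.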