Assemble the transcripts from RNA-seq read alignments using StringTie.
This App assembles transcripts for a given sample or a sample set using StringTie and generates an Expression object for each individual sample and an ExpressionSet object for the sample set. The user can view the relative abundances of the assembled transcripts in a histogram that is also generated by this App.
StringTie is a successor to Cufflinks that runs faster and provides more accurate reconstruction of genes and estimates of their expression levels. It accepts aligned RNA-seq reads from HISAT2, TopHat2, or Bowtie2 and assembles the alignments into a parsimonious set of transcripts. It then estimates the relative abundances of these transcripts based on how many reads support each one, taking into account biases in library preparation protocols.
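For readers who want to relate the App to the underlying command-line tool, the following is a minimal sketch of how a StringTie invocation could be put together in Python. It is not the App's actual wrapper code, which is not shown here. The function name and the file names are hypothetical; the flags used (-o, -p, -G, -A, -e) are standard StringTie options, and the input is assumed to be a coordinate-sorted BAM file.

```python
def build_stringtie_command(bam_path, out_gtf, annotation=None,
                            threads=1, gene_abundance=None,
                            estimate_only=False):
    """Return the argument list for one StringTie run (hypothetical helper).

    bam_path       -- coordinate-sorted alignments (e.g. from HISAT2)
    out_gtf        -- path for the assembled transcripts (-o)
    annotation     -- optional reference annotation to guide assembly (-G)
    gene_abundance -- optional path for the gene abundance table (-A)
    estimate_only  -- only quantify reference transcripts (-e, needs -G)
    """
    cmd = ["stringtie", bam_path, "-o", out_gtf, "-p", str(threads)]
    if annotation:
        cmd += ["-G", annotation]
    if gene_abundance:
        cmd += ["-A", gene_abundance]
    if estimate_only:
        if not annotation:
            raise ValueError("-e requires a reference annotation (-G)")
        cmd.append("-e")
    return cmd


# Example with made-up file names; the list can be passed to subprocess.run.
command = build_stringtie_command(
    "sample1.sorted.bam", "transcripts.gtf",
    annotation="genes.gtf", threads=4, gene_abundance="gene_abund.tab",
)
print(" ".join(command))
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.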
The StringTie output object contains a GTF file (transcripts.gtf) and an FPKM file (genes.fpkm_tracking). The GTF file contains the annotated transcripts assembled by StringTie, whereas the FPKM file provides the normalized values for the ExpressionMatrix objects: the abundance of each transcript expressed in fragments per kilobase of exon per million fragments mapped (FPKM) and in transcripts per million (TPM). The output RNASeqExpression objects can be rendered in the Narrative in tabular and histogram formats to visualize normalized gene expression values as both log2(FPKM+1) and log2(TPM+1).
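The normalized units mentioned above can be illustrated with a short sketch. It uses the textbook definitions of FPKM and TPM together with the log2(x+1) transform used for display. StringTie itself derives abundances from its assembly and coverage model, so this is only an illustration of the units, and the fragment counts and transcript lengths are made-up toy values.

```python
import math


def fpkm(fragments, length_bp, total_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_fragments)


def tpm_from_fpkm(fpkm_values):
    """Rescale FPKM values so that they sum to one million."""
    total = sum(fpkm_values)
    if total == 0:
        return [0.0 for _ in fpkm_values]
    return [value * 1e6 / total for value in fpkm_values]


def log2_plus_one(value):
    """log2(x + 1), which maps zero expression to zero."""
    return math.log2(value + 1.0)


# Toy data: (fragments assigned, transcript length in bp) for three transcripts.
toy = {"t1": (100, 1000), "t2": (300, 2000), "t3": (0, 500)}
total_fragments = sum(count for count, _ in toy.values())

fpkm_values = [fpkm(count, length, total_fragments)
               for count, length in toy.values()]
tpm_values = tpm_from_fpkm(fpkm_values)

for name, f, t in zip(toy, fpkm_values, tpm_values):
    print(name, round(log2_plus_one(f), 2), round(log2_plus_one(t), 2))
```

Because TPM values always sum to one million within a sample, they are easier to compare across samples than FPKM values, whose total varies from sample to sample.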
The StringTie output object can be used to identify differential expression with DESeq2, Cuffdiff, or Ballgown.
NOTE: This App is one of the steps in the Transcriptomics and Expression Analysis Workflow in KBase.
Team members who developed and deployed the algorithm in KBase: Tianhao Gu, Christopher Henry, Shane Canon, Stephen Chan, Jason Baumohl, Sean McCorkle, Sunita Kumari, Shinjae Yoo, Priya Ranjan, and Vivek Kumar. For questions, please contact us.
Related Publications
- Frazee AC, Pertea G, Jaffe AE, Langmead B, Salzberg SL, Leek JT. Ballgown bridges the gap between transcriptome assembly and expression analysis. Nat Biotechnol. 2015;33: 243–246. doi:10.1038/nbt.3172, https://www.nature.com/articles/nbt.3172
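- Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 2015;12: 357–360. doi:10.1038/nmeth.3317, https://www.nature.com/articles/nmeth.3317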
- Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biology. 2013;14: R36. doi:10.1186/gb-2013-14-4-r36, https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-4-r36
- Pertea M, Pertea GM, Antonescu CM, Chang T-C, Mendell JT, Salzberg SL. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol. 2015;33: 290–295. doi:10.1038/nbt.3122, https://www.nature.com/articles/nbt.3122
- Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25: 1105–1111. doi:10.1093/bioinformatics/btp120, https://academic.oup.com/bioinformatics/article/25/9/1105/203994
- Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012;7: 562–578. doi:10.1038/nprot.2012.016, https://www.nature.com/articles/nprot.2012.016
App Specification:
https://github.com/kbaseapps/kb_stringtie/tree/91fe36084ccf8cde35534514ac592609b65d5343/ui/narrative/methods/run_stringtie
Module Commit: 91fe36084ccf8cde35534514ac592609b65d5343