tBLASTn prot-nuc Search - v2.7.1
kb_blast

v.1.0.6

By: dylan

Search for matches to a sequence

This method performs a prot-nuc tBLASTn search (a protein query aligned against translated nucleotide sequences) using NCBI's BLAST+ (version 2.6.0).

Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, & Madden TL. (2009) "BLAST+: architecture and applications." BMC Bioinformatics. 10:421. doi: 10.1186/1471-2105-10-421

Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, & Lipman DJ. (1997) "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs." Nucleic Acids Res. 25:3389-3402. doi: 10.1093/nar/25.17.3389


Implemented for KBase by Dylan Chivian (DCChivian@lbl.gov)


tBLASTn is a protein sequence search against a translated nucleotide sequence database. The KBase implementation permits searching through the genes in a Genome object, the genes in the Genome members of a GenomeSet, or the genes in a FeatureSet. The output object of these searches is a FeatureSet containing those genes that pass the thresholds given by the user. The App also provides a table of the hits (with those hits that are below the thresholds in gray) and links to download other formats of BLAST output files.
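
To make "translated nucleotide sequence database" concrete, the following is a minimal Python sketch of six-frame translation, which is how tBLASTn-style searches generally expose nucleotide targets to a protein query. It is an illustration only, not code from BLAST+ or kb_blast: the function names are hypothetical, the standard genetic code is assumed, and any codon containing a character other than A, C, G, or T is translated as X.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate codon by codon from position 0; a trailing partial codon is dropped."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # unknown codons become X
        for i in range(0, usable, 3)
    )


def six_frame_translations(dna: str) -> dict:
    """Map frame number (+1, +2, +3, -1, -2, -3) to its translated protein string."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])    # forward strand
        frames[-(offset + 1)] = translate(rc[offset:])  # reverse strand
    return frames
```

Calling `six_frame_translations` on a target nucleotide sequence yields six protein strings; a protein query can then be compared against each of them, which is the sense in which the search is "translated".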


Configuration:

Query Object: If you don't provide a Query Sequence, you must instead provide a Query Object. This may be a gene or a Sequence Set object with a single sequence. The latter object will be available after running tBLASTn once, and can be used in subsequent runs. The query must be an amino acid sequence.

Query Sequence: If you don't provide a Query Object, you must paste in a query sequence. In addition, you must name the Output Query Object in which to save the query sequence. The sequence must be an amino acid sequence in one-letter code.

Targets Object: The Targets Object may be a FeatureSet of genes, a Genome, or a GenomeSet. A BLAST search database will be automatically generated from the Targets Object.

Output Query Object: If your query is input as a cut-and-paste sequence, you must name the object in which to store that query.

Output Object: This is the FeatureSet of genes that are hit and that also pass the user-defined thresholds.

E-value: This sets the upper bound on the e-value of the weakest hit to consider viable. Hits with an e-value above this bound (that is, weaker hits) are not reported in the table or in the BLAST output text downloads.

Bitscore: This bounds the bitscore for the weakest hit to include in the FeatureSet output object. Hits below this threshold are still reported in the table and BLAST text downloads.

Identity Threshold (%): This bounds the sequence identity between the query and each hit for the weakest hit to include in the FeatureSet output object. Hits below this threshold are still reported in the table and BLAST text downloads. Identity is calculated for amino acids.

Alignment Overlap Threshold (%): This bounds the overlap percentage (portion of the query length covered by the hit alignment) for inclusion in the FeatureSet output object. Hits below this threshold are still reported in the table and BLAST text downloads.

Max Accepts: This is a hard cap on how many hits to report (Default: 1000).

Extra Text Output format: The BLAST m=7 (tab-delimited table) text output format is available automatically for download. A user may request up to one extra format to be generated and downloadable. These include


Output:

Output Object: Gene hits are captured in a FeatureSet output object. Hits that fail any of the additional user-defined thresholds are filtered out and do not appear in the object, even if they are shown in the output table.

Output HTML Table: The tab-delimited hit table is HTML formatted and additionally shows the region of the query covered by the BLAST alignment. Hits that pass the e-value threshold but fall below one or more of the other thresholds, and are therefore not included in the FeatureSet output object, are shown in gray, with the attributes that were below threshold in red.

Downloadable files: BLAST text outputs that are requested (as indicated in Configuration above) are available for download. These are not altered from the direct output from the BLAST run. The m=7 (tab-delimited) format is always provided.
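
The way the thresholds described above interact can be sketched in Python as follows. This is a hedged illustration, not the kb_blast implementation: the `Hit` and `Thresholds` classes, their field names, and the category labels are hypothetical, and the sketch assumes that a hit exactly equal to a threshold passes it and that query coordinates are 1-based and inclusive.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hit:
    evalue: float
    bitscore: float
    identity_pct: float   # amino acid identity, in percent
    query_start: int      # 1-based, inclusive (assumption)
    query_end: int
    query_length: int


@dataclass
class Thresholds:
    max_evalue: float
    min_bitscore: float
    min_identity_pct: float
    min_overlap_pct: float


def overlap_pct(hit: Hit) -> float:
    """Portion of the query length covered by the hit alignment, in percent."""
    covered = abs(hit.query_end - hit.query_start) + 1
    return 100.0 * covered / hit.query_length


def classify(hit: Hit, thr: Thresholds) -> Tuple[str, List[str]]:
    """Return (category, failing attributes) for one hit.

    "unreported": fails the e-value bound, so it is absent from table and downloads
    "gray":       shown in the table but excluded from the FeatureSet
    "included":   shown in the table and stored in the FeatureSet
    """
    if hit.evalue > thr.max_evalue:
        return "unreported", []
    failing = []  # attributes that the table would highlight in red
    if hit.bitscore < thr.min_bitscore:
        failing.append("bitscore")
    if hit.identity_pct < thr.min_identity_pct:
        failing.append("identity")
    if overlap_pct(hit) < thr.min_overlap_pct:
        failing.append("overlap")
    return ("gray", failing) if failing else ("included", [])
```

Under these assumptions, only hits classified as "included" would end up in the FeatureSet output object, while "gray" hits would still appear in the HTML table and in the downloadable BLAST text output.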


App Specification:

https://github.com/kbaseapps/kb_blast/tree/d65ca570cd7e336f5d99329d4d9c032f63056f31/ui/narrative/methods/tBLASTn_Search

Module Commit: d65ca570cd7e336f5d99329d4d9c032f63056f31