Search for matches to a sequence
This method performs a translated nucleotide-to-protein (nuc-prot) BLASTx search using NCBI's BLAST+ (version 2.6.0).
Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, & Lipman DJ. (1997) "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs." Nucleic Acids Res. 25:3389-3402. doi: 10.1093/nar/25.17.3389
BLASTx is a translated nucleotide sequence search against a protein sequence database. The KBase implementation permits searching through the genes in a Genome object, the genes in the Genome members of a GenomeSet, or the genes in a FeatureSet. The output object of these searches is a FeatureSet containing those genes that pass the thresholds given by the user. The App also provides a table of the hits (with those hits that are below the thresholds in gray) and links to download other formats of BLAST output files.
Query Object: If you don't provide a Query Sequence, you must instead provide a Query Object. This may be a gene or a Sequence Set object with a single sequence. The latter object becomes available after running BLASTx once with a pasted sequence, and can be reused in subsequent runs. It must contain a nucleic acid sequence.
Query Sequence: If you don't provide a Query Object, you must cut-and-paste a query sequence. In addition, you must name the Output Query Object in which to save the query sequence. The sequence must be a nucleic acid sequence in one-letter code.
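As an illustration of the one-letter nucleic acid requirement, the following is a minimal sketch of how a pasted query could be checked before submission. It is not the App's own validation code. The function name `clean_nucleotide_query` is hypothetical, and accepting IUPAC ambiguity codes and an optional FASTA header line are assumptions.

```python
# IUPAC one-letter nucleotide codes, including ambiguity codes; U permits RNA input.
# Assumption: the App may accept a narrower or wider alphabet than this.
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def clean_nucleotide_query(raw: str) -> str:
    """Drop an optional FASTA header and whitespace, then validate the sequence."""
    lines = [ln.strip() for ln in raw.strip().splitlines()]
    # Keep only non-empty lines that are not FASTA header lines.
    seq_lines = [ln for ln in lines if ln and not ln.startswith(">")]
    seq = "".join(seq_lines).replace(" ", "").upper()
    if not seq:
        raise ValueError("query sequence is empty")
    bad = sorted(set(seq) - IUPAC_NUCLEOTIDES)
    if bad:
        raise ValueError("non-nucleotide characters in query: " + "".join(bad))
    return seq
```

A check of this kind catches most pasted protein sequences, but not all of them: several amino acid letters (for example M, K, and V) are also valid nucleotide ambiguity codes, so a short peptide made only of such letters would pass.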
Targets Object: The Targets Object may be a FeatureSet of genes, a Genome, or a GenomeSet. A BLAST search database will be automatically generated from the Targets Object.
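For readers who want to reproduce a comparable search outside KBase, the sketch below assembles the two BLAST+ command lines involved: building a protein database and running `blastx` against it. It only constructs argument lists and does not run anything. The file names, the helper names, the example e-value, and the mapping of the App's Max Accepts setting to `-max_target_seqs` are all assumptions. The App's actual invocation may differ.

```python
from typing import List


def makeblastdb_cmd(protein_fasta: str, db_prefix: str) -> List[str]:
    """Arguments to build a protein BLAST database from a FASTA file of target genes."""
    return ["makeblastdb", "-in", protein_fasta, "-dbtype", "prot", "-out", db_prefix]


def blastx_cmd(
    query_fasta: str,
    db_prefix: str,
    out_path: str,
    evalue: float = 0.001,   # illustrative value, not the App's documented default
    max_hits: int = 1000,    # assumed to correspond to the App's Max Accepts
) -> List[str]:
    """Arguments to search a nucleotide query against the protein database."""
    return [
        "blastx",
        "-query", query_fasta,
        "-db", db_prefix,
        "-out", out_path,
        "-outfmt", "7",                  # tabular output with comment lines
        "-evalue", str(evalue),
        "-max_target_seqs", str(max_hits),
    ]


if __name__ == "__main__":
    # Print the commands; pass each list to subprocess.run(..., check=True) to execute.
    print(" ".join(makeblastdb_cmd("targets.faa", "targets_db")))
    print(" ".join(blastx_cmd("query.fna", "targets_db", "hits.tsv")))
```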
Output Query Object: If your query is input as a cut-and-paste sequence, you must name the object in which to store that query.
Output Object: This is the set of genes that are both hit and pass user-defined thresholds.
E-value: This sets the maximum e-value for a hit to be considered viable. Hits with e-values above this threshold (that is, weaker hits) are not reported in the table or the BLAST output text downloads.
Bitscore: This bounds the bitscore for the weakest hit to include in the FeatureSet output object. Hits below this threshold are still reported in the table and BLAST text downloads.
Identity Threshold (%): This bounds the sequence identity between the query and each hit for the weakest hit to include in the FeatureSet output object. Hits below this threshold are still reported in the table and BLAST text downloads. Identity is calculated for amino acids.
Alignment Overlap Threshold (%): This bounds the overlap percentage (portion of the query length covered by the hit alignment) for inclusion in the FeatureSet output object. Hits below this threshold are still reported in the table and BLAST text downloads.
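The way these thresholds interact can be sketched as follows: the e-value decides whether a hit is reported at all, while bitscore, identity, and alignment overlap decide whether a reported hit enters the FeatureSet. This is a minimal illustration rather than the App's code. The `Hit` class, the function names, and the choice to treat a value exactly equal to a threshold as passing are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    evalue: float
    bitscore: float
    pident: float   # percent identity over the aligned amino acids
    qstart: int     # 1-based alignment start on the nucleotide query
    qend: int       # 1-based alignment end on the nucleotide query
    qlen: int       # query length in nucleotides


def query_overlap_pct(hit: Hit) -> float:
    """Percent of the query length covered by the hit alignment."""
    # qstart exceeds qend for reverse-strand frames, so use the absolute span.
    return 100.0 * (abs(hit.qend - hit.qstart) + 1) / hit.qlen


def is_reported(hit: Hit, max_evalue: float) -> bool:
    """Hits weaker than the e-value threshold are not reported anywhere."""
    return hit.evalue <= max_evalue


def failed_thresholds(
    hit: Hit, min_bitscore: float, min_ident_pct: float, min_overlap_pct: float
) -> List[str]:
    """Names of the thresholds a hit misses; an empty list means it joins the FeatureSet."""
    failed = []
    if hit.bitscore < min_bitscore:
        failed.append("bitscore")
    if hit.pident < min_ident_pct:
        failed.append("identity")
    if query_overlap_pct(hit) < min_overlap_pct:
        failed.append("overlap")
    return failed
```

A hit for which `is_reported` is true but `failed_thresholds` is non-empty corresponds to a row that still appears in the table and text downloads but is left out of the FeatureSet output object.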
Max Accepts: Hard cap on the number of hits to report (default: 1000).
Extra Text Output format: The BLAST m=7 (tab-delimited table) text output format is always available for download. You may request at most one extra format to be generated for download. The options are:
- 0 (pairwise)
- 1 (query-anchored showing identities)
- 2 (query-anchored no identities)
- 3 (flat query-anchored, show identities)
- 4 (flat query-anchored, no identities)
- 5 (XML Blast output)
- 8 (Text ASN.1)
- 9 (Binary ASN.1)
- 10 (Comma-separated values)
- 11 (BLAST archive format ASN.1)
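The always-provided tab-delimited download can be read with a few lines of code. The sketch below assumes the file uses the twelve default BLAST+ tabular columns and that comment lines start with `#`, as they do for BLAST+ `-outfmt 7`. If the App writes a custom column set, `DEFAULT_COLUMNS` would need adjusting. The function name and the sample identifiers are hypothetical.

```python
from typing import Dict, Iterable, List, Union

# Default column order of BLAST+ tabular output.
DEFAULT_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
INT_COLUMNS = {"length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"}
FLOAT_COLUMNS = {"pident", "evalue", "bitscore"}

Value = Union[str, int, float]


def parse_tabular(lines: Iterable[str]) -> List[Dict[str, Value]]:
    """Parse tab-delimited BLAST hits, skipping blank lines and '#' comment lines."""
    hits: List[Dict[str, Value]] = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != len(DEFAULT_COLUMNS):
            raise ValueError(
                f"expected {len(DEFAULT_COLUMNS)} columns, got {len(fields)}"
            )
        row: Dict[str, Value] = {}
        for name, value in zip(DEFAULT_COLUMNS, fields):
            if name in INT_COLUMNS:
                row[name] = int(value)
            elif name in FLOAT_COLUMNS:
                row[name] = float(value)
            else:
                row[name] = value
        hits.append(row)
    return hits


if __name__ == "__main__":
    # Made-up sample record for illustration only.
    sample = [
        "# BLASTX 2.6.0+",
        "# 1 hits found",
        "query1\tgene_0001\t87.50\t120\t15\t0\t1\t360\t5\t124\t2e-70\t215",
    ]
    for hit in parse_tabular(sample):
        print(hit["sseqid"], hit["evalue"], hit["bitscore"])
```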
Output Object: Gene hits are captured in a FeatureSet output object. If additional user-defined thresholds are set, hits that fail any of them are filtered out and do not appear in the object, even if they are shown in the output table.
Output HTML Table: The tab-delimited hit table is HTML-formatted and additionally shows the region of the query covered by the BLAST alignment. Hits that pass the e-value threshold but fall below other thresholds, and are therefore not included in the FeatureSet output object, are shown in gray, with the attributes that fell below their thresholds in red.
Downloadable files: BLAST text outputs that are requested (as indicated in Configuration above) are available for download. These are not altered from the direct output from the BLAST run. The m=7 (tab-delimited) format is always provided.
Module Commit: d65ca570cd7e336f5d99329d4d9c032f63056f31