Annotate with Snekmer Apply
By: jjacobson

Annotate Genome or Protein Sequence Set Object with Snekmer Apply.

This App runs the Apply function of Snekmer, a tool that re-encodes amino acid sequences, splits them into kmers, and predicts protein sequence function. The Apply function uses prebuilt kmer count matrices from 1000 InterPro genomes for TIGRFAMs, Pfam, and PANTHER annotations.
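As a rough illustration of the re-encoding and kmer-counting steps described above, the following minimal sketch maps each amino acid to a reduced alphabet and counts overlapping kmers. The two-letter alphabet, the function names `reencode` and `count_kmers`, and the example sequence are assumptions made for illustration; they are not Snekmer's actual alphabets or API.

```python
from collections import Counter

# Hypothetical two-letter reduced alphabet (hydrophobic vs. other).
# Illustrative only; not one of Snekmer's actual alphabets.
HYDROPHOBIC = set("AVLIMFWC")


def reencode(sequence: str) -> str:
    """Map each residue to 'H' (hydrophobic) or 'P' (everything else)."""
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in sequence.upper())


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping substring of length k."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Made-up example sequence.
encoded = reencode("MKVLAAGIT")
kmer_counts = count_kmers(encoded, k=3)
print(encoded)
print(dict(kmer_counts))
```

A reduced alphabet keeps the number of possible kmers small, so the resulting count vectors stay compact enough to compare against many families.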

The kmerized protein sequences are compared to each family in the prebuilt matrices using cosine similarity to find the most likely protein function. Cosine similarity represents each object as a vector in N dimensions, where N is the number of kmers, and measures the cosine of the angle between two vectors. It is commonly used in text analysis.
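The comparison described above can be sketched as follows. This is a minimal example, assuming kmer counts are held in plain dictionaries; the family names, the counts, and the helpers `cosine_similarity` and `best_family` are hypothetical and do not reflect Snekmer's internal data structures.

```python
import math


def cosine_similarity(a: dict, b: dict) -> float:
    """Cosine of the angle between two sparse kmer count vectors."""
    dot = sum(count * b.get(kmer, 0) for kmer, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction to compare
    return dot / (norm_a * norm_b)


def best_family(query: dict, families: dict) -> tuple:
    """Return the family with the highest cosine similarity and its score."""
    scores = {name: cosine_similarity(query, counts)
              for name, counts in families.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Made-up kmer counts for one query protein and two hypothetical families.
query = {"HPH": 2, "HHH": 2, "PHP": 1}
families = {
    "family_A": {"HPH": 4, "HHH": 4, "PHP": 2},
    "family_B": {"PPP": 5, "PHP": 1},
}
name, score = best_family(query, families)
print(name, round(score, 3))
```

Because cosine similarity depends only on the direction of the vectors, a family whose counts are a scaled copy of the query's counts scores as a perfect match regardless of sequence length or family size.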

The prebuilt kmer count matrices were created using Snekmer Learn, a feature not yet available in KBase.

Snekmer Apply Key Features:

The primary output of Snekmer Apply is an updated object with new protein sequence / gene ontologies. A secondary output provides predictions and confidence levels, which can be downloaded directly.

App Specification:

https://github.com/jjacobson95/KbaseSnekmerLA/tree/5e6a3f106099b96eb0f1a41265175a5249e55b94/ui/narrative/methods/run_SnekmerLearnApply

Module Commit: 5e6a3f106099b96eb0f1a41265175a5249e55b94