Compute bi-directional best hits (BBH) between the proteins present in two input Genomes. Produces a dot plot matrix showing corresponding genes in the two Genomes, as well as a table of gene differences.
This App performs an all-vs-all protein comparison for a pair of species based on BLAST output. The algorithm is similar to the BUS approach of Kellis et al. (2004). It represents the best match of every gene as a set of genes instead of a single best hit, which makes it more robust to slight differences in sequence similarity. The similarities between genes are treated as a bipartite graph in which edges connect genes of the two species and are weighted by the bit-scores of the corresponding pairwise protein alignments. For every edge, the "Sub-optimal Best Bidirectional Hit Ratio" is calculated as the ratio of the weight of that edge to the best weight among the edges connected to it (that is, edges sharing a gene with it). If this ratio is 100%, the edge is a true Best Bidirectional Hit (in terms of bit-score). All edges with a ratio below a specified minimum threshold are filtered out.
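The ratio calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the App's actual code: the function names `sub_optimal_bbh_ratios` and `filter_by_ratio` are hypothetical, the input is assumed to be a list of `(gene_a, gene_b, bit_score)` tuples already parsed from BLAST output, "edges connected to this edge" is interpreted as all edges sharing either gene with it (including the edge itself), and the threshold is assumed to be inclusive.

```python
from collections import defaultdict


def sub_optimal_bbh_ratios(hits):
    """Return {(gene_a, gene_b): ratio in percent} for each pair of genes.

    hits: iterable of (gene_a, gene_b, bit_score) tuples (assumed format).
    """
    # Keep only the highest bit-score seen for each gene pair.
    weights = {}
    for gene_a, gene_b, score in hits:
        pair = (gene_a, gene_b)
        if score > weights.get(pair, 0.0):
            weights[pair] = score

    # Best edge weight touching each gene, on either side of the bipartite graph.
    best_a = defaultdict(float)
    best_b = defaultdict(float)
    for (gene_a, gene_b), weight in weights.items():
        best_a[gene_a] = max(best_a[gene_a], weight)
        best_b[gene_b] = max(best_b[gene_b], weight)

    # Assumption: the "best weight" is the maximum over all edges that share
    # either endpoint with this edge, the edge itself included.
    ratios = {}
    for (gene_a, gene_b), weight in weights.items():
        best = max(best_a[gene_a], best_b[gene_b])
        ratios[(gene_a, gene_b)] = 100.0 * weight / best
    return ratios


def filter_by_ratio(ratios, min_ratio=90.0):
    """Drop edges whose ratio is below min_ratio (inclusive cut-off assumed)."""
    return {pair: r for pair, r in ratios.items() if r >= min_ratio}


if __name__ == "__main__":
    # Made-up bit-scores for illustration only.
    example_hits = [("a1", "b1", 200.0), ("a1", "b2", 190.0), ("a2", "b2", 100.0)]
    for pair, ratio in filter_by_ratio(sub_optimal_bbh_ratios(example_hits)).items():
        print(pair, round(ratio, 1))
```

In this toy input, the pair `a1`/`b1` is a true best bidirectional hit, `a1`/`b2` is a near-best alternative that survives a threshold of 90, and `a2`/`b2` is removed because `b2` has a much stronger match elsewhere.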
The output of this App is visualized as a dot plot matrix showing the pairs of similar proteins identified by the algorithm. Users can navigate the dot plot with the provided zoom and scroll functions. Hovering over a colored dot displays information about that pair of proteins. Clicking on a dot displays a column browser that allows navigation to the next or previous similar proteins in either genome. A larger red dot in the plot indicates the current pair of proteins.
The advanced parameter 'Minimum sub-optimal BBH ratio' affects the dot color. BBH stands for 'bi-directional best hit', and the parameter defaults to 90 (percent). Dots shown in white have a ratio above this value, and dots shown in red have a ratio below it.
A tutorial for how to use the Compare Two Proteomes App can be found here.
Team members who developed and deployed the algorithm in KBase: Roman Sutormin. For questions, please contact us.
- Kellis M, Patterson N, Birren B, Berger B, Lander ES. Methods in comparative genomics: genome correspondence, gene identification and regulatory motif discovery. J Comput Biol. 2004;11: 319-355. doi:10.1089/1066527041410319, http://www.ncbi.nlm.nih.gov/pubmed/15285895
- Manolis Kellis Ph.D. Thesis, Chapter 1: Genome correspondence, http://web.mit.edu/manoli/www/thesis/Chapter1.html
Module Commit: f96f62941820a793124a18fe6e655da33a562033