KBase Project Paper

Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163


Over the past two decades, the scale and complexity of genomics technologies and data have advanced from sequencing genomes of a few organisms to generating metagenomes, genome variation, gene expression, metabolites, and phenotype data for thousands of organisms and their communities. A major challenge in this data-rich age of biology is integrating heterogeneous and distributed data into predictive models of biological function, ranging from a single gene to entire organisms and their ecologies. The US Department of Energy (DOE) has invested substantially in efforts to understand the complex interplay between biological and abiotic processes that influence soil, water, and environmental dynamics of our biosphere. The community that has grown around these efforts recognizes the need for scientists of diverse backgrounds to have access to sophisticated computational tools that enable them to analyze complex and heterogeneous data sets and integrate their data and results effectively with the work of others. In this way, new data and conclusions can be rapidly propagated across existing, related analyses and easily discovered by the community for evaluation and comparison with previous results.

Here we present the DOE Systems Biology Knowledgebase (KBase, http://kbase.us), an open-source software and data platform that enables data sharing, integration, and analysis of microbes, plants, and their communities.

KBase Paper Narratives

The KBase paper discusses a series of linked Narratives that illustrate a scenario wherein two scientists use KBase to perform collaborative systems biology analysis, resulting in a reproducible, interactive “publication.” These example Narratives demonstrate how KBase facilitates sharing, collaboration, and interdisciplinary research between two scientists: Alice, a wet-lab biologist with expertise in assembly, annotation, and comparative genomics, and Bob, a computational biologist with expertise in metabolic modeling. Using KBase’s tools and data, Alice and Bob collaborate to create and refine a metabolic model for a new strain of a bacterium, starting with its genomic sequence. KBase enables these scientists to accomplish more together than they could individually, with less work and in less time.

The five “Alice and Bob” Narratives described in the paper are linked below. You can copy these Narratives and rerun the steps, or try them on your own data. Note that you need a KBase account to view or rerun the Narratives.

Alice Narrative 1: Assembly and Annotation  https://narrative.kbase.us/narrative/ws.18152.obj.1
Alice Narrative 2: Comparative Genomics  https://narrative.kbase.us/narrative/ws.18153.obj.1
Bob Narrative: Build Metabolic Models  https://narrative.kbase.us/narrative/ws.18155.obj.1
Bob and Alice Narrative 1: Phenotype Data Analysis  https://narrative.kbase.us/narrative/ws.18156.obj.1
Bob and Alice Narrative 2: Phenotype Data Reconciliation  https://narrative.kbase.us/narrative/ws.18157.obj.1

To see other Narratives that demonstrate how to use KBase to carry out computational experiments, please visit our Narrative Library.