There are several ways to find data and add it to your Narrative. You can select data available within KBase, upload your own files, or import datasets from external resources such as the DOE Joint Genome Institute (JGI) or NCBI. This section will describe how to explore data already in KBase. Instructions for adding data to your Narrative and importing external files are provided in the next section, “Add Data to Your Narrative.”
KBase offers a wide range of data, which you can explore from the Data Slideout.
The Data Slideout is accessed by clicking the right arrow or Add Data button in the Data Panel.
For now, we will find a public reference genome available under the Public tab and then examine information and metadata about that genome.
Clicking the Public tab displays a list of data objects available in KBase’s reference collection.
The data-type selector at the top left of the browser window defaults to Genomes. (Note that Phytozome plant genomes are currently listed as a separate data type; this will change soon.) You can change which type of data is displayed under the Public tab by using this pulldown menu. More data types will be added soon.
You can also use the “Search data” field in the Data Browser to find data objects whose names include the text you’ve typed in the search box. (Note that searches are not case-sensitive.) Enter “Vibrio brasiliensis” in the search field.
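To make the matching rule concrete, here is a minimal Python sketch of a case-insensitive name search. It is only an illustration of the behavior described above, not KBase's actual search code; the function `search_by_name` and the list of object names are hypothetical.

```python
def search_by_name(names, query):
    """Return the names that contain `query`, ignoring case."""
    needle = query.casefold()
    return [name for name in names if needle in name.casefold()]


# Hypothetical object names, for illustration only.
names = [
    "Vibrio brasiliensis LMG 20546",
    "Escherichia coli K-12 MG1655",
    "Vibrio cholerae O1",
]

matches = search_by_name(names, "vibrio BRASILIENSIS")
print(matches)
```

Because both the query and the names are case-folded before comparison, typing “vibrio brasiliensis” or “VIBRIO BRASILIENSIS” finds the same objects.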
If you hover over a data object in the Data Browser panel, three small icons appear: an Add button to the left of the object name, and binoculars and a graph-like icon to the right. The latter two icons open a Data Landing page and a Provenance page, respectively (see below for details).
Go ahead and click the Add button to add the Vibrio brasiliensis LMG 20546 genome to your Narrative. Notice that this genome now shows up in your Data Panel (see image).
The Data Panel shows all the data that you’ve added to your Narrative. (The next section of this guide discusses in more detail how to add data to your Narrative.)
You should have at least one object (Vibrio brasiliensis LMG 20546) in your Data Panel. As you add or generate more data during the course of your analyses, the number of objects in this panel will increase. You can search, sort, or filter the list using the icons in the Data Panel header.
This expanded view of the data object also reveals icons that let you examine or manage the data.
Many (but not yet all) types of data in KBase have viewers that allow you to find out more about the data. These viewers can be accessed two ways from the Data Panel:
Below is the genome viewer for the Vibrio brasiliensis LMG 20546 genome that is in our Data Panel.
Notice that this genome viewer has tabs for an overview (including GC content, taxonomy information, size, and more) and a list of contigs and genes. Each contig and gene entry in these lists is clickable, opening either a contig browser or a tab with expanded information about the gene (see image).
You can sort the table entries by clicking a column header (e.g., Length). Clicking the same column header again will reverse the sort order. For example, the screenshots below show the table sorted in descending order by Contig name and then in descending order by length:
You can even sort by more than one column at a time by clicking one column header and then Shift-clicking other column headers. For example, here we have sorted in ascending order by contig length, and then (by shift-clicking the Genes column header) in ascending or descending order by number of genes. Notice how the two rows with length=253 (the bottom two, in these screenshots) have switched places.
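The effect of a secondary sort column can be sketched in Python. This is an illustration of multi-column sorting in general, not the viewer's implementation; the contig rows and the helper `sort_rows` are hypothetical.

```python
from operator import itemgetter

# Hypothetical contig rows, for illustration only.
contigs = [
    {"name": "contig_A", "length": 253, "genes": 2},
    {"name": "contig_B", "length": 1200, "genes": 5},
    {"name": "contig_C", "length": 253, "genes": 1},
]


def sort_rows(rows, keys):
    """Sort rows by (field, descending) pairs; earlier pairs take priority."""
    result = list(rows)
    # Python's sort is stable, so sorting by the lowest-priority key first
    # and the highest-priority key last gives a primary/secondary ordering.
    for field, descending in reversed(keys):
        result.sort(key=itemgetter(field), reverse=descending)
    return result


# Primary key: length ascending. Secondary key: genes ascending, then descending.
genes_ascending = sort_rows(contigs, [("length", False), ("genes", False)])
genes_descending = sort_rows(contigs, [("length", False), ("genes", True)])

print([row["name"] for row in genes_ascending])
print([row["name"] for row in genes_descending])
```

Only the rows that tie on the primary column (here, the two contigs with length 253) change places when the secondary sort direction is flipped; the row with a unique length stays where it is.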
There are many other types of viewers in KBase in addition to the Genome viewer discussed here. (Documentation on these other viewers is coming soon.) Different viewers may have different options; give them a whirl! Don’t be afraid to click on anything you see in a viewer.
To remove a viewer from your Narrative, click the trashcan icon in the top right of the viewer cell. (You can always re-add the viewer; removing it from your Narrative doesn’t delete the data object itself.)
Data Landing (also known as Data Summary) and Provenance pages are ways to find out even more information about a data object. You can access these pages using the icons that appear when you hover over a data object in the Data Browser or click on one of the objects in your Data Panel. The binoculars icon opens (in a new browser tab) a Data Landing page about that particular data object, while the graph-like icon opens a Provenance page.
Data Landing pages (which are still in development) provide both known and contextual information about a data object, allowing users to examine various particulars about the data and, eventually, compare it to other data objects. Depending on the type of data and where it came from, different sorts of information might be presented. For example, for a genome already in KBase (such as the one we loaded earlier, Vibrio brasiliensis LMG 20546), you will see:
Provenance pages are one way to facilitate reproducibility and transparency of scientific results, two key principles of KBase design. All data in KBase is versioned, and older versions of a data object can be accessed from the Provenance page. This page records and illustrates how data is derived and modified in KBase, including how it entered the system, whether it was produced through analysis of other objects, who generated the data, and when. You can also identify the original “owners” of the data, allowing you to contact them or see their shared analyses.
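As a rough mental model of what a provenance trail captures, the Python sketch below links each data object to the objects it was derived from and walks that chain back to the original input. The `ProvenanceRecord` class, its field names, and all example values are hypothetical and do not reflect KBase's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical record; field names are illustrative, not KBase's schema."""
    object_name: str
    version: int
    created_by: str
    created_on: str      # date the object was created
    method: str          # how it entered the system or was produced
    inputs: tuple = ()   # records this object was derived from


def lineage(record):
    """Return the object and its ancestors as 'name/vN' labels, nearest first."""
    labels = []
    stack = [record]
    while stack:
        current = stack.pop()
        label = f"{current.object_name}/v{current.version}"
        if label not in labels:
            labels.append(label)
            stack.extend(reversed(current.inputs))
    return labels


# Hypothetical objects, users, and dates, for illustration only.
genome = ProvenanceRecord("example_genome", 1, "alice", "2015-01-10", "upload")
model = ProvenanceRecord("example_model", 1, "alice", "2015-01-12", "build_model", (genome,))
fba = ProvenanceRecord("example_fba", 2, "bob", "2015-01-15", "run_fba", (model,))

print(lineage(fba))
```

Each record answers the questions listed above (how the object arrived, what it was derived from, who created it, and when), and following the `inputs` links reconstructs the chain from a result back to its source data.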
The image below shows the Provenance page for a flux balance analysis model that was generated in KBase.
For more information about Data Landing and Provenance pages, please refer to this page.
In the next section of this guide, we will discuss in more detail how to add data to your Narrative.