Transfer JGI Data User Guide

Push Data to KBase from JGI Genome Portal

This guide describes how to select high-throughput sequencing data in FASTQ format from the JGI Genome Portal and transfer it to a KBase Narrative for analysis. An analogous process can be used to transfer FASTA format genome assemblies.
Fig 3-13 Push-to-kbase-v2

The U.S. Department of Energy Joint Genome Institute (JGI) combines integrated high-throughput sequencing and computational analysis to enable systems-based scientific approaches to clean energy generation and environmental research. The Department of Energy Genome Science program funds both KBase and JGI.

The JGI Genome Portal lets users search, download, and explore multiple datasets available for DOE JGI sequencing projects including their status, assemblies, and annotations of sequenced genomes. KBase, the DOE Systems Biology Knowledgebase, integrates diverse types of data, analysis tools, and a sophisticated new user interface that enables users to perform complex computational biology analyses and share their workflows and conclusions as reproducible “Narratives.” Combining the resources of JGI with the analytical power of KBase provides extensive research capabilities to the systems biology community.

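Before transferring data from JGI to KBase, a user must have a JGI account to sign in to the JGI Genome Portal and a KBase account to sign in when prompted during the push.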

Access the JGI Genome Portal

The JGI Genome Portal supports advances in biological and environmental sciences by providing unified access to all JGI genomic databases and analytical tools. From this portal, the user can browse for and select datasets to transfer to KBase.

Go to the JGI Genome Portal home page (http://genome.jgi.doe.gov/) and sign in with your JGI account using the “Login” link at the top of the page. Once you are logged in, your user name appears in place of the “Login” button. If any errors occur while signing in to JGI, try clearing your browser’s cache before signing in again. If this does not work, contact JGI for further assistance and troubleshooting (http://jgi.doe.gov/contact-us/).

Genome Portal

Filter JGI Data to Push to KBase

From the Genome Portal page, select the “Advanced Search” link at the top of the page.

JGI-1

In the JGI Genome Portal, the area under the search bar contains data filtering tools that narrow the displayed results to those matching the selected criteria. Several pull-down menus let you select the data you want to use. To filter the results for data that can be pushed to KBase, select “Microbial” under the Scientific Program dropdown, and click “Show All.”

JGI-2

You can also search for a particular organism of interest, for example, Chryseolinea serpens DSM 24574, shown in the table below:

JGI-3

Currently, Push to KBase functionality is supported for Microbial reads and genome assemblies from these Product listings on JGI:

  • Microbial Annotated Finished, Isolate
  • Microbial Annotated Improved Draft, Isolate
  • Microbial Annotated Minimal Draft, Isolate
  • Microbial Annotated Minimal Draft, Single Cell
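
If you keep your own list of candidate datasets (for example, product names copied by hand from the portal's search results), the supported Product listings above can be checked with a simple membership test. The following is a minimal sketch only: `PUSHABLE_PRODUCTS` and `is_pushable_product` are hypothetical names chosen for illustration and are not part of any JGI or KBase API.

```python
# Product listings that this guide names as supported by Push to KBase.
PUSHABLE_PRODUCTS = frozenset({
    "Microbial Annotated Finished, Isolate",
    "Microbial Annotated Improved Draft, Isolate",
    "Microbial Annotated Minimal Draft, Isolate",
    "Microbial Annotated Minimal Draft, Single Cell",
})


def is_pushable_product(product: str) -> bool:
    """Return True if the product listing is one this guide lists as supported."""
    # Collapse repeated whitespace and ignore case so that small
    # copy/paste differences still match (an assumption made for convenience).
    normalized = " ".join(product.split()).casefold()
    return normalized in {p.casefold() for p in PUSHABLE_PRODUCTS}
```

Any product name that is not in the list above is reported as unsupported; the portal itself remains the authority on what can actually be pushed.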

Push Microbial Reads to KBase

Now that the results have been filtered, you can select microbial reads to transfer to KBase for analysis. Search through the results to identify an organism of interest. Once an organism of interest has been identified:

  • Click the “Download” link under the Resources column.

JGI-4

  • Read the JGI Data Usage Policy that appears after entering the download area. This statement varies between organisms, so read it carefully. After agreeing to the policy, click the “Ok” button to continue to the download page.
  • The download page contains the data associated with an organism, organized into folders that reflect the quality of the data and the media on which it is stored within JGI.

Legend

  • Open the folder labeled QC Filtered Raw Data to access the quality-controlled FASTQ reads file associated with this organism, or open the Raw Data folder for the unfiltered FASTQ file.
  • Notice that the file name for the QC Filtered Raw Data FASTQ file has a string of text located before the “.fastq.gz” file extension. This string refers to the set of filtering parameters used in the quality control processing of the genomic data. To see what the string means, click the blue box titled “Legend” on the right side of the download page.
  • Use the checkboxes to the left of the folders and files to select the data to push to KBase. For read data, only files uploaded after July 2012 can currently be pushed to KBase.
  • Select the desired files and click the “Push to KBase” button toward the top of the page. This will open a prompt to sign in to KBase with your KBase account credentials.
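
The file-naming convention described above (a filter string located just before the “.fastq.gz” extension) can be illustrated with a short sketch. It assumes that the filter string is the last dot-separated piece of the name before the extension; the file name used in the usage note is invented for illustration, and `filter_code` is a hypothetical helper, not a JGI or KBase function. The sketch only extracts the string; its meaning still has to be looked up in the Legend.

```python
from typing import Optional


def filter_code(file_name: str) -> Optional[str]:
    """Return the text just before '.fastq.gz', or None if there is none."""
    suffix = ".fastq.gz"
    if not file_name.endswith(suffix):
        # Not a gzipped FASTQ file name, so there is nothing to extract.
        return None
    stem = file_name[: -len(suffix)]
    # Assumption: the filter string is the last dot-separated piece of the stem.
    parts = stem.split(".")
    return parts[-1] if len(parts) > 1 else None
```

For the made-up name `example_reads.abc.fastq.gz`, the helper returns `abc`; for a name with no extra piece before the extension, such as `example_reads.fastq.gz`, it returns `None`.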

JGI-5

Push Result

  • After you sign in to KBase from this prompt, JGI places the data in a queue for pushing to KBase. Once the data has been pushed successfully, you will receive an email with a link to access the data and place it into an existing Narrative.

Email

  • Following this link opens a page within KBase that allows the user to either copy the JGI data into an existing Narrative or launch the Assemble and Annotate Microbial Genome app with this data preloaded. Please see the Assemble and Annotate Microbial Genome tutorial for more information about this app.

Transfer Screen

Alternatively, you can access the data you transferred within the Data Panel:

  • Create a new Narrative or open an existing Narrative.
  • The dataset you transferred can be seen in the Narrative Data Panel under the “Shared With Me” tab. The data you have most recently transferred from JGI will be first on the list. With your JGI data in KBase, you can perform other analyses on it and save your results and commentary to share with collaborators.

JGI-GIF

Completing these steps pushes data from the JGI Genome Portal into a KBase Narrative for further analysis. In the future, users will be able to push more data types from JGI to KBase.

If you have any questions about pushing data to KBase from the JGI Genome Portal, please contact KBase support.

Push Microbial Assemblies to KBase

You can also transfer genome assemblies from JGI to KBase for analysis. Search through the results to identify an organism of interest. Once an organism of interest has been identified:

  • Click the “Download” link under the Resources column.

JGI-4

  • Read the JGI Data Usage Policy that appears after entering the download area. This statement varies between organisms, so read it carefully. After agreeing to the policy, click the “Ok” button to continue to the download page.
  • Open the folder labeled QC and Genome Assembly to access the quality-controlled assembly file associated with this organism.
  • Notice that the assembly file is listed as submission.assembly.fasta and appears at the bottom of the list of files.
  • Use the checkboxes to the left of the folders and files to select the assembly file to push to KBase. For assemblies, only files uploaded after December 2013 can currently be pushed to KBase.
  • Select the desired files and click the “Push to KBase” button toward the top of the page. This will open a prompt to sign in to KBase with your KBase account credentials.
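
The upload-date limits stated in this guide (reads uploaded after July 2012, assemblies uploaded after December 2013) can be expressed as a small check. This is a sketch under stated assumptions: “after July 2012” is read as on or after 1 August 2012, and “after December 2013” as on or after 1 January 2014. The names `FIRST_PUSHABLE_DATE` and `uploaded_late_enough` are hypothetical and do not correspond to any JGI or KBase interface.

```python
from datetime import date

# Cutoffs taken from this guide. Assumption: "after <month> <year>" means
# on or after the first day of the following month.
FIRST_PUSHABLE_DATE = {
    "reads": date(2012, 8, 1),
    "assembly": date(2014, 1, 1),
}


def uploaded_late_enough(data_type: str, uploaded: date) -> bool:
    """Return True if a file of this type was uploaded after the stated cutoff."""
    try:
        cutoff = FIRST_PUSHABLE_DATE[data_type]
    except KeyError:
        # Only the two data types covered by this guide are known here.
        raise ValueError(f"unknown data type: {data_type!r}") from None
    return uploaded >= cutoff
```

A call such as `uploaded_late_enough("assembly", date(2013, 12, 31))` evaluates to `False` under these assumptions, while the same date for `"reads"` evaluates to `True`.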

JGI Select Assembly

Push Email

  • After you sign in to KBase from this prompt, JGI places the data in a queue for pushing to KBase. Once the data has been pushed successfully, you will receive an email with a link to access the data and place it into an existing Narrative.

Push Email

  • Following this link opens a page within KBase that allows the user to either copy the JGI data into an existing Narrative or launch the Convert Assembly File to Contigs app with this data preloaded.

JGI Data Import Assembly Page

Alternatively, you can access the data you transferred within the Data Panel:

  • Create a new Narrative or open an existing Narrative.
  • The dataset you transferred can be seen in the Narrative Data Panel under the “Shared With Me” tab. The data you have most recently transferred from JGI will be first on the list. With your JGI data in KBase, you can perform other analyses on it and save your results and commentary to share with collaborators.

JGI-GIF

Completing these steps pushes data from the JGI Genome Portal into a KBase Narrative for further analysis. In the future, users will be able to push more data types from JGI to KBase.

If you have any questions about pushing data to KBase from the JGI Genome Portal, please contact KBase support.