How to transfer JGI sequencing data to KBase

With a few clicks, DOE’s Joint Genome Institute (JGI) users can transfer genome reads, assemblies, and annotated genomes to KBase’s Narrative Interface for assembly, annotation, metabolic modeling, and more!

Before transferring data from JGI to KBase, you must:


The beta Search tool on the KBase Dashboard is a quick and easy way to copy public genome reads and assemblies from JGI to your KBase staging area. This page describes how to transfer those data types as well as private data and annotated genomes.

 

JGI’s Globus service lets users select data from the JGI Genome Portal’s sequencing projects and make them transferable to KBase, all in a few simple steps. There are some similarities to the KBase-initiated Globus service, but there are also some differences.

Starting at the JGI Genome Portal, search for the genome or project of interest and click on the download link. Click on the green ‘Agree’ button on the next page to accept the data usage policy.

Near the top of the download page is a link to JGI’s Globus v.2 service. This version works by 1) transferring the entire project to a JGI/Globus staging area and 2) sending you an email when it is ready. Previous versions gave users the option to transfer immediately.

Click on the blue Download button and one of two windows will pop up:
1. A window to enter a Globus ID or register for a Globus ID:

2. Or a window saying “There is not enough disk space in the staging area for this data.”

If you get the message about disk space, the problem lies between JGI and Globus and has nothing to do with KBase. The JGI staging area is shared, so the available space depends on how much data other users are staging for transfer. The issue should resolve itself if you try again later.

Once you successfully enter your Globus account name, your data will be transferred to the JGI staging area. You will receive an email from no-reply@genome.jgi.doe.gov with a link for transferring data with Globus. The email also states that JGI keeps this link active for only 14 days. After that time, JGI removes the data from its staging area to make room for other users.

Clicking on the link in the email will take you to a Globus file transfer window:

The newly created JGI endpoint is filled in on the left.

Set the “To” endpoint (right) to “KBase Bulk Share”, which is the KBase staging area. If you don’t see a directory with your name, add /username (e.g., /psdehal for the user psdehal) to the Path for the KBase endpoint.
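The path convention above can be illustrated with a minimal Python sketch. The function name kbase_staging_path is hypothetical and is not part of any KBase or Globus API; it only shows how the /username prefix combines with the relative path of a file, assuming the staging directory is simply named after the KBase username.

```python
import posixpath


def kbase_staging_path(username: str, relative_path: str = "") -> str:
    """Build the Path value for the KBase endpoint (hypothetical helper).

    Assumes the staging directory is named after the KBase username,
    as in the /psdehal example.
    """
    user = username.strip().strip("/")
    if not user:
        raise ValueError("username must not be empty")
    # Join with POSIX separators and drop any trailing slash.
    return posixpath.join("/", user, relative_path.lstrip("/")).rstrip("/")


print(kbase_staging_path("psdehal"))
print(kbase_staging_path("psdehal", "reads/sample.fastq.gz"))
```

The first call gives the directory to type into the Path field; the second shows where a file dragged into that directory would land.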

You should then be able to drag data files or folders from the left endpoint to the right one, inside the Globus interface, and the files will be copied to your KBase staging area.

If the connection to the server is broken during transfer, Globus will start up again where it left off once the connection is reestablished. The link in the JGI email may be used to reestablish the connection.

While it may be possible to drag ALL the project data from the JGI endpoint to the KBase endpoint, not all files can be imported into KBase. Currently, only reads files (.fastq or .fastq.gz), assemblies (.fna or .fasta), and annotated genomes (.gbk or .gff) have a corresponding KBase object.
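To decide which project files are worth transferring, the extension list above can be turned into a small check. The following Python sketch is only an illustration: the mapping and the function name kbase_data_type are assumptions made for this example, not a KBase API, and KBase itself may apply further validation at import time.

```python
from pathlib import PurePosixPath

# File extensions listed above, mapped to the kind of KBase object they yield.
IMPORTABLE_EXTENSIONS = {
    ".fastq": "reads",
    ".fastq.gz": "reads",
    ".fna": "assembly",
    ".fasta": "assembly",
    ".gbk": "annotated genome",
    ".gff": "annotated genome",
}


def kbase_data_type(filename):
    """Return the data type for an importable file, or None otherwise."""
    name = PurePosixPath(filename).name.lower()
    for extension, data_type in IMPORTABLE_EXTENSIONS.items():
        if name.endswith(extension):
            return data_type
    return None


project_files = [
    "project/reads/sample.fastq.gz",
    "project/assembly/contigs.fasta",
    "project/README.txt",
]
importable = [f for f in project_files if kbase_data_type(f) is not None]
print(importable)
```

Files for which the function returns None can still be copied to the staging area, but they have no corresponding KBase object to import into.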

You can import data from your staging area to any of your KBase Narratives (see this page for more information). The imported objects are permanent in your KBase account, but the files in the staging area will be removed after 90 days.

Important differences between JGI and KBase data staging:
1. The JGI staging area keeps file links for 14 days. The KBase staging area keeps files for 90 days.
2. The JGI staging area has a limited amount of space that is shared with other users. The KBase staging area has 2.8 terabytes available to the user.
3. JGI notifies users by email when files have been added to the staging area. The KBase Import tab shows files in your staging area and has an update button to refresh the list.
4. JGI stages an entire project at one time. KBase stages specific files selected by the user.
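The two retention periods are easy to confuse, so it can help to compute both deadlines up front. The sketch below is a rough illustration in Python. It assumes the 14-day window is counted from the date of the JGI email and the 90-day window from the date the files reach the KBase staging area; neither starting point is stated precisely above, so treat the results as estimates. The function name staging_deadlines is hypothetical.

```python
from datetime import date, timedelta

JGI_LINK_DAYS = 14       # JGI keeps the transfer link active for 14 days
KBASE_STAGING_DAYS = 90  # KBase keeps staged files for 90 days


def staging_deadlines(jgi_email_date: date, kbase_transfer_date: date) -> dict:
    """Estimate when each staging area stops holding the data.

    Assumption: each window starts on the date passed in.
    """
    return {
        "jgi_link_expires": jgi_email_date + timedelta(days=JGI_LINK_DAYS),
        "kbase_files_removed": kbase_transfer_date + timedelta(days=KBASE_STAGING_DAYS),
    }


deadlines = staging_deadlines(date(2024, 1, 1), date(2024, 1, 3))
for name, day in deadlines.items():
    print(name, day.isoformat())
```

Objects already imported into a Narrative are not affected by the KBase deadline; only the files left in the staging area are.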

This page has more information about how to add data to your KBase account so you can analyze it.