Data Policy & Sources

Data Policies

KBase conforms to the Information and Data Sharing Policy of the Genomic Science Program of the Office of Biological and Environmental Research within the Office of Science. This policy requires that all publishable data, metadata, and software resulting from research funded by the Genomic Science Program conform to community-recognized standard formats when they exist, be clearly attributable, and be deposited within a community-recognized public database(s) appropriate for the research.

Data publicly available in KBase comes from the sources listed on this page. Additionally, users can upload their own data to KBase to analyze it, and can choose how widely their data should be shared. (All data uploaded by users is private to them unless they choose to share it.)


KBase does not guarantee long-term retention of user-uploaded data. Please take appropriate precautions in storing and backing up your data locally.


Improper use of KBase, including uploading human data, may result in the termination of KBase access privileges. Please see the Terms and Conditions page for more information.

Data Sources

Source | License | Download
Genomic Data
NCBI RefSeq | Public Domain - US Government | FTP
PATRIC | Public Domain - US Government | FTP
SEED | Public Domain | FTP
Phytozome | Free (we don’t include embargoed/early release genomes) | Download (login required)
Gramene | Public Domain | FTP
JGI | Public Domain - US Government | HTTP
Ribosomal Data
Greengenes | Creative Commons Attribution-ShareAlike 3.0 Unported License | HTTP
SILVA | Academic/non-commercial | HTTP
RDP | Creative Commons Attribution-ShareAlike 3.0 Unported License | HTTP
GO | No license | HTTP
GSC | Unknown | HTTP
Pathway Data
KEGG maps | Academic license required | HTTP
ModelSEED | Public Domain - US Government | HTTP
Protein Annotations
UniProt | Public Domain | HTTP
RAST | Public Domain | FTP