Create a text info file based on a GenomeSet object.
The intended purpose of this App is to produce a downloadable text file about a GenomeSet object.
Most KBase data objects already have some type of viewable HTML table, either as output from an App or by dragging the object onto the Narrative. This App serves a different purpose: it creates downloadable data files for use with local tools such as Excel or bioinformatics scripts.
This App creates a file for each Genome in a GenomeSet. The Summary section shows a preview of the downloadable file. If the file is tab- or comma-delimited, the preview may appear misaligned on the screen, but the file itself remains readable by a script. The HTML link opens a new tab with the full output. The link for downloading the files is in the Files section of the output.
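Since the delimited files are meant for scripts rather than for on-screen reading, a minimal Python sketch of loading one locally may be useful. The column names and sample rows here are invented for illustration and are not the App's actual output columns; only the standard-library `csv` module is used.

```python
import csv
import io


def read_delimited(handle, delimiter="\t"):
    """Return the rows of a tab- or comma-delimited file as a list of dicts."""
    return list(csv.DictReader(handle, delimiter=delimiter))


# Hypothetical sample content; the real column names depend on the App output.
sample_tsv = "genome_name\tcontig_count\nGenome_A\t3\nGenome_B\t12\n"

rows = read_delimited(io.StringIO(sample_tsv))
for row in rows:
    print(row["genome_name"], row["contig_count"])
```

For a downloaded file, the `io.StringIO` object would be replaced by `open(path, newline="")`, and `delimiter=","` would be passed for the comma-delimited version.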
Inputs:
- The KBase GenomeSet object.
- A download option:
  - A list of genomes. Both tab-delimited and comma-delimited files will be created.
  - Protein-coding features, tab-delimited and comma-delimited. A file in which each row is a gene in the Genomes and each column holds information about that gene. Columns are separated by tabs or commas.
  - Features in GFF3 format. A GFF3 (General Feature Format) file of the protein-coding features in the Genome.
  - Genomes in GenBank (gbk) format. A .gbk-formatted file for each Genome.
  - FASTA, translated CDSs. A FASTA-formatted file of the amino acid sequences of the protein-coding features.
  - FASTA, mRNAs. A FASTA-formatted file of the mRNA sequences of the protein-coding features.
  - DNA FASTA. A FASTA-formatted file of the DNA sequence of each of the Genomes. NOTE: this output can be very long.
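To illustrate how a GFF3 download could be consumed by a local script, the sketch below parses a single feature line into a dictionary. It relies only on the standard GFF3 layout of nine tab-separated columns; the sample line, including its source and attribute values, is invented and does not come from the App.

```python
from urllib.parse import unquote

GFF3_COLUMNS = [
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
]


def parse_gff3_line(line):
    """Parse one GFF3 feature line; return None for comments and blank lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Column 9 holds semicolon-separated key=value pairs, percent-encoded.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[unquote(key)] = unquote(value)
    record["attributes"] = attributes
    return record


# Hypothetical feature line for illustration only.
sample = "contig_1\tKBase\tCDS\t100\t400\t.\t+\t0\tID=gene_0001;product=hypothetical protein"
feature = parse_gff3_line(sample)
print(feature["type"], feature["start"], feature["end"], feature["attributes"]["ID"])
```

GFF3 coordinates are 1-based and inclusive, so the length of this feature would be `feature["end"] - feature["start"] + 1`.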
Output:
- The Summary or Link section has a text version of the requested file.
- One or more downloadable files in the Files section. The name and content of the file will depend on the requested file format.
  - The list of genomes is provided in tab-delimited (.tsv) and comma-delimited (.csv) format.
  - FASTA files end in .fna for nucleotide sequences and .faa for amino acid sequences.
  - The other files contain summary statistics about the genomes in the set and differ only in the format of the information.
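As a rough sketch of working with the FASTA downloads, the code below infers the sequence type from the file extension described above and reads records with a small hand-written parser. The function names and the sample sequences are hypothetical and chosen only for illustration.

```python
import io
from pathlib import Path


def sequence_kind(filename):
    """Guess the sequence type from the extension (.fna or .faa)."""
    suffix = Path(filename).suffix.lower()
    return {".fna": "nucleotide", ".faa": "amino acid"}.get(suffix, "unknown")


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical amino acid records for illustration only.
sample_faa = ">gene_0001 hypothetical protein\nMKTAYIAK\nQRQISFVK\n>gene_0002\nMSLN\n"
records = list(read_fasta(io.StringIO(sample_faa)))
for header, sequence in records:
    print(header, len(sequence))
```

A generator is used so that large files, such as the DNA FASTA output, can be processed one record at a time rather than loaded into memory at once.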
Related Publications
- Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163, https://www.nature.com/articles/nbt.4163
App Specification:
https://github.com/kbaseapps/kb_ObjectInfo/tree/f41cd0b3c9767eecc436a8474806d8c639ad3f8a/ui/narrative/methods/genomeset_report
Module Commit: f41cd0b3c9767eecc436a8474806d8c639ad3f8a