.. _typedobjects:

Workspace typed objects
=======================

The Workspace Service (WSS) provides storage, sharing, versioning, validation
and provenance tracking of typed object (TO) data. This document describes
basic information for developers who need to define and register TOs for use
with the WSS.

Typed object basics
-------------------

TOs in the WSS are hierarchical data objects that conform to type definitions
specified in the KBase Interface Description Language (KIDL). Just as KIDL is
used to specify the structure of data exchanged between KBase clients and
servers (generated by the Type Compiler), KIDL is used to define the structure
of data stored in the WSS. Any structure defined in a KIDL formatted file
(e.g. ``typedef structure { … } StructureName;``) can be registered with the
WSS (see :ref:`typedobjectregandver`). Instances of these objects can then be
saved to the WSS by any user. The WSS does not support storage of primitive or
basic container types directly (i.e. string, int, float, list, mapping).

KIDL-defined modules provide namespacing for typed objects. Thus, both the
module name and the type name are required to uniquely identify a type in the
WSS, generally in the format ``ModuleName.TypeName``.

.. _typedobjectregandver:

Typed object registration & versioning
--------------------------------------

TO definitions must be registered with the WSS before instances of the TOs can
be saved. The basic process for registering a TO is:

* Developer requests ownership of a module name via the Workspace API

  * see API method ``request_module_ownership(...)``

* WSS admin approves the request
* Developer uploads (i.e. registers) a type specification file (typespec) in
  KIDL format where the module name is identical to the just approved module
  name in the WSS and indicates the names of the TOs which the developer wants
  the WSS to support

  * see API method ``register_typespec(...)``

* Developer releases the module, which releases the latest version of all TO
  definitions in the module

  * see API method ``release_module(...)``

TO definitions marked for WSS usage are versioned with a major and minor
version. Every time a new typespec is uploaded and registered, the TO
definitions defined in the module automatically receive a new version number
if changed. Minor versions are incremented if the change is backwards
compatible (e.g. addition of a new optional field). Major versions are
incremented if the change is not backwards compatible.

All versions of all registered TO definitions are available to WSS users, but
to save an object instance of an old version, or an unreleased version, the
exact version number must be provided by the user. If a WSS user saves an
object instance without providing version numbers for the type, the latest
released version of the TO definition is assumed. The process of releasing a
module therefore indicates that the latest versions of all typed object
definitions in the module are ready for public use, but does not limit users'
or developers' ability to work with old or pre-released versions of TOs.

Before the first release of a module, repeated uploads of a module result in
version numbers of TO definitions of 0.x and are assumed to be backwards
incompatible. On first release of a module, all version numbers of TO
definitions are updated to 1.0.

Users and developers can use the ``ws-typespec-list`` script or the API to
list registered modules, type definitions, and versions of type definitions,
and to retrieve the actual KIDL or JSON schema encoding of the typed object
definition. End users will only be able to view the versions of TOs that are
released.
Owners of a module can list all versions of TOs in modules that they own.

Typed object validation
-----------------------

Instances of TOs can be validated against type definitions registered with the
WSS. Instances of TOs must pass this validation process to be stored in the
WSS, thereby guaranteeing that WSS data is structurally valid.

.. todo::
   Update this document to use the kb-sdk tools.

The WSS validates the TO instance against a JSON Schema V4 encoding of the TO
definition. The JSON Schema encoding can be generated by the KBase Type
Compiler (currently in branch ``dev-prototypes``). In addition to matching the
structure and type of data, additional constraints can be placed on TO
validation through the use of Annotations (see :ref:`typedobjectannotations`).

To generate JSON encodings of your TOs for review, check out the
``dev-prototypes`` branch of the typecompiler and compile your typespec file
with the ``--jsonschema`` option of the ``compile_typespec`` command. The JSON
Schema encoding of each object definition is generated in the output location
in a directory called ``jsonschema``. The JSON Schema encoding is also
available for all registered TO definitions via the WSS API or the
``ws-typespec-list`` command.

All TO instances pulled from the WSS are guaranteed to be valid instances of a
TO definition as registered with the WSS. Therefore it is recommended that
KBase services which require rigorous validation of complex data operate on
data stored in the WSS (as opposed to passing the object by value and writing
the validation code yourself). Note that full validation is not built into
generated KBase client/server code, so it is not safe to assume that input
data received directly from a type compiler generated client conforms to the
specified type definitions in your API.

Additional technical details: The TO validation code is written in Java and is
available in the workspace_deluxe KBase repo.

.. _typedobjectannotations:

Typed object annotations
------------------------

Annotations provide an infrastructure for attaching structured metadata to
type definitions (and eventually to functions and modules). Such metadata is
useful for specifying additional constraints on data types, interpreting data
types within a particular context, and declaring structured information that
can later be automatically indexed or searched, such as authorship of a
function implementation.

Annotations are declared in the comment immediately preceding the definition
of the TO. Thus, all annotations are always attached and viewable within the
API documentation. Each annotation must be specified on its own line in the
following format::

    @[ANNOTATION] [INFO]

where ``[ANNOTATION]`` is the name of the annotation and ``[INFO]`` is the
additional information, if any, required by the annotation. As a simple
example, the following associates authorship information with a TO using the
``@author`` annotation::

    /*
    Data type for my experimental data.
    @author John Scientist
    */
    typedef structure {
        string name;
        list results;
    } MyExperimentData;

Currently supported type definition annotations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Optional annotation
"""""""""""""""""""

Mark a specific field of a structure as an optional field. The optional
annotation can only be declared where a structure is first defined. On
validation of TO instances by the WSS, missing optional fields are permitted.
If an optional field is present, however, the value of the field will be
validated normally. Optional fields are defined as::

    @optional [FIELD_NAME_1] [FIELD_NAME_2] ...

For example, the following annotation declares that two fields of the
structure are optional::

    /*
    @optional alias functional_assignments
    */
    typedef structure {
        string name;
        string alias;
        string sequence;
        list functional_assignments;
    } Feature;

.. _idannotations:

ID annotations
""""""""""""""

Mark a string as an ID that references another object or entity. ID
annotations can only be associated with type definitions which resolve to a
string. ID annotations are declared in the general form::

    @id [ID_TYPE] [PARAMETERS]

where ``[ID_TYPE]`` specifies the type of ID and is required, and
``[PARAMETERS]`` provides additional information or constraints.
``[PARAMETERS]`` are always optional. ID annotations are inherited when
declaring a new ``typedef`` of a string that was already marked as an ID. If a
new ID annotation is declared in a ``typedef``, it overrides any previous ID
declaration.

Note that although ``@id`` annotations may be specified as any ``ID_TYPE`` and
associated with any ``typedef``, applications that consume type specifications
(primarily the workspace at the time of writing) may only recognize specific
``@id`` ``ID_TYPE`` / ``typedef`` combinations.

The ID types currently supported are described below.

**Workspace ID**

::

    @id ws [TYPEDEF_NAME] ...

The ID must reference a TO instance stored in the WSS. There are multiple
valid ways to specify a workspace object, and all are acceptable. A reference
path into the object graph may be given as a string consisting of a list of
references separated by semicolons. Optionally, one or more type definition
names can be specified indicating that the ID must point to a TO instance that
is one of the specified types. The typedef with which the ``@id`` annotation
is associated must be a string.

Example::

    /*
    A reference to a genome.
    @id ws KB.MicrobialGenome KB.PlantGenome
    */
    typedef string genome_id;

**KBase ID**

::

    @id kb

This annotation originally specified that the string must be a KBase ID which
was typically registered in the ID service in a format such as
``kb|type.XXX``. The ID server is no longer used in KBase and this field
doesn't have any particular meaning at this point.
No type checking on this field is performed, but the annotation may be used in
the future so that users can automatically extract KBase IDs from typed
objects.

**Handle ID**

::

    @id handle

The ID must reference a Handle ID from the Handle Service. This is typically
in the format ``KBH_XXX``. When saving an object containing one or more
handles to the WSS, the WSS checks that the handles are owned by the user
before completing the save. Furthermore, the handle data is shared as the
workspace object is shared. See :ref:`shockintegration` for more details.

**Shock ID**

::

    @id bytestream

The ID must reference a Shock node that exists in the Shock instance
configured for linking Shock nodes to WSS objects. When saving an object
containing one or more Shock nodes to the WSS, the WSS checks that the nodes
are owned by the user, or owned by the workspace and readable by the user, and
(if necessary) takes ownership of the nodes. Furthermore, the nodes are shared
as the workspace object is shared. See :ref:`shockintegration` for more
details.

**Sample ID**

::

    @id sample

The ID must reference a Sample service sample. When saving an object
containing one or more sample IDs to the WSS, the WSS checks that the samples
are administrated by the user. Furthermore, the samples are shared as the
workspace object is shared. See :ref:`sampleserviceintegration` for more
details.

**External ID**

::

    @id external [SOURCE] ...

The ID must reference an entity in an external (i.e. outside of KBase) data
store. The IDs are not verified or validated, but may be used in the future to
allow users to automatically extract external IDs from typed objects.
``[SOURCE]`` provides an optional way to specify the external source.
Currently there is no standard dictionary of sources.
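
As a rough illustration of the ``@id [ID_TYPE] [PARAMETERS]`` form described
above, the following Python sketch pulls ``@id`` lines out of a type's comment
text. It is not the parser used by the Type Compiler or the WSS;
``parse_id_annotations`` and ``KNOWN_ID_TYPES`` are names invented for this
example, and the set of recognized ID types is simply copied from the list in
this document.

.. code-block:: python

    import re

    # ID types listed in this document; hard-coded for illustration only.
    KNOWN_ID_TYPES = {"ws", "kb", "handle", "bytestream", "sample", "external"}

    # One annotation per line: "@id", an ID type, then optional parameters.
    _ID_LINE = re.compile(r"^\s*@id\s+(\S+)(.*)$")


    def parse_id_annotations(comment):
        """Return (id_type, parameters, recognized) tuples for each @id line.

        Hypothetical helper. Unknown ID types are reported rather than
        rejected, since any ID_TYPE may be declared even if a consuming
        application does not recognize it.
        """
        found = []
        for line in comment.splitlines():
            match = _ID_LINE.match(line)
            if match is None:
                continue
            id_type = match.group(1)
            parameters = match.group(2).split()  # whitespace separated, optional
            found.append((id_type, parameters, id_type in KNOWN_ID_TYPES))
        return found


    comment = """
    A reference to a genome.
    @id ws KB.MicrobialGenome KB.PlantGenome
    """
    print(parse_id_annotations(comment))

The sketch only shows how the annotation line is structured. Resolving the
references themselves (for example, checking handle ownership) is the job of
the WSS and is not modelled here.
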

Deprecated annotation
"""""""""""""""""""""

::

    @deprecated [REPLACEMENT_TYPE]

The deprecated annotation is used to mark a type definition as deprecated, and
provides a structured mechanism for indicating a replacement type if one
exists. So far the deprecated annotation is only for documentation purposes,
but it may be used by the Workspace in the future to better display, list, or
query workspace objects (e.g. list all objects of a type that is not
deprecated).

Range annotation
""""""""""""""""

::

    @range [RANGE SPECIFICATION]

The range annotation is associated with either a float or int typedef and
specifies the minimum and / or maximum value of the int or float. The
``[RANGE SPECIFICATION]`` is a tuple of the minimum and maximum numbers,
separated by a comma. Omit the minimum or maximum to specify an infinite
negative or positive range, respectively. Bracketing the
``[RANGE SPECIFICATION]`` with parentheses indicates the range extents are
exclusive; square brackets or no brackets indicate an inclusive range.

Examples:

======= =============================================
Range   Explanation
======= =============================================
0, 30   Range from 0 - 30, inclusive
[0, 30] Range from 0 - 30, inclusive
[0, 30) Range from 0 - 30, including 0, excluding 30
(0,     Range from 0 - +inf, excluding 0
,30]    Range from -inf - 30, including 30
======= =============================================

Example specification::

    /*
    @range -4.5,7.6)
    */
    typedef float my_float;

    /*
    @range [2,10]
    */
    typedef int my_int;

Metadata annotation
"""""""""""""""""""

::

    @metadata [CONTEXT] [ACTION] [as NAME]

The metadata annotation specifies data that an application should extract from
a TO as metadata about the TO. Typically this metadata is very small compared
to the TO and is therefore suitable for use when only a summary of the TO is
necessary for an operation. As of this writing, the WSS uses the annotation to
automatically generate user metadata for a TO.
The metadata annotation may only be associated with ``structure``
``typedef``\ s. Metadata annotations on nested ``structure``\ s are ignored.

``[CONTEXT]`` specifies where the metadata annotation is applicable. In the
case of the WSS, the ``[CONTEXT]`` is ``ws``. ``[CONTEXT]`` is always
required.

``[ACTION]`` specifies what metadata should be extracted and any operations to
perform on said metadata. At minimum, the ``[ACTION]`` must provide the path
(dot separated) to the item of interest. Note that the path may only proceed
through ``structure`` ``typedef``\ s, not ``mapping``\ s or ``list``\ s. A
bare path must terminate at a primitive type: a ``string``, ``int``, or
``float``.

``[ACTION]``\ s may also specify a function to apply to the item specified by
the path. Currently, the only available function is ``length()``, which may be
applied to ``list``\ s, ``mapping``\ s, ``tuple``\ s, and ``string``\ s.
``length()`` returns the number of items in a ``list``, ``mapping``, or
``tuple``, or the length of a ``string``.

``[as NAME]`` allows specifying an optional ``NAME`` for the extracted
metadata. If a ``NAME`` is not provided, the application will use the
``[ACTION]`` string as the metadata name. The ``NAME`` is the entirety of the
remainder of the line after ``as``.

Example::

    /*
    Nested structure, metadata annotations have no effect here
    Cannot provide a path into the mapping in a metadata annotation
    */
    typedef structure {
        mapping strmap;
        int an_int;
    } InnerStruct;

    /*
    Specifies the metadata ("str" -> value of str in TO)
    @metadata ws str

    Specifies the metadata ("my rad string" -> value of str in TO)
    @metadata ws str as my rad string

    Specifies the metadata ("inner.an_int" -> value of inner.an_int in TO)
    @metadata ws inner.an_int

    Specifies the metadata ("length(str)" -> length of str in TO)
    @metadata ws length(str)

    Specifies the metadata ("num strings" -> # of items in inner.strmap)
    @metadata ws length(inner.strmap) as num strings

    Note that metadata paths cannot enter outerstrmap.
    */
    typedef structure {
        InnerStruct inner;
        string str;
        mapping outerstrmap;
    } MyStruct;
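
To make the ``[ACTION]`` and ``[as NAME]`` rules more concrete, the Python
sketch below applies the five ``@metadata ws`` annotations from the example
above to a plain dictionary standing in for a ``MyStruct`` instance. It is an
illustration only, not the WSS implementation: ``extract_metadata`` and
``_follow_path`` are hypothetical names, the sample values are made up, and
type checks (such as requiring a bare path to end at a primitive) are left
out.

.. code-block:: python

    import re

    # Matches the only function described in this document: length(<path>).
    _LENGTH = re.compile(r"^length\((.+)\)$")


    def _follow_path(obj, path):
        """Walk a dot-separated path through nested dicts (structures)."""
        for key in path.split("."):
            obj = obj[key]
        return obj


    def extract_metadata(instance, annotations):
        """Build a metadata dict from '@metadata ws ...' annotation lines."""
        metadata = {}
        for line in annotations:
            tokens = line.split(None, 2)  # '@metadata', context, remainder
            if len(tokens) < 3 or tokens[0] != "@metadata" or tokens[1] != "ws":
                continue  # not a workspace metadata annotation
            # Everything after "as" is the name; default to the action string.
            action, sep, name = tokens[2].partition(" as ")
            action = action.strip()
            name = name.strip() if sep else action
            match = _LENGTH.match(action)
            if match:
                value = len(_follow_path(instance, match.group(1)))
            else:
                value = _follow_path(instance, action)
            metadata[name] = value
        return metadata


    # Made-up instance data shaped like MyStruct.
    instance = {
        "inner": {"strmap": {"a": "x", "b": "y"}, "an_int": 42},
        "str": "hello",
        "outerstrmap": {},
    }
    annotations = [
        "@metadata ws str",
        "@metadata ws str as my rad string",
        "@metadata ws inner.an_int",
        "@metadata ws length(str)",
        "@metadata ws length(inner.strmap) as num strings",
    ]
    print(extract_metadata(instance, annotations))

Each annotation yields one metadata entry, keyed either by the ``NAME`` given
after ``as`` or by the action string itself when no name is supplied.
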