Workspace resource requirements

Several configuration variables define the resources assigned to the WSS.

Configuration variables

Variable        Default      Notes
temp-dir        ws_temp_dir  Change to an appropriate location.
server-threads  20
min-memory      10000        In MB.
max-memory      15000        In MB.

temp-dir

temp-dir determines where the workspace writes temporary files. By default, the workspace is configured to need no more than 80GB of temporary disk space at any one time (see Disk usage below). The faster the drive that holds the temporary files directory, the faster the workspace processes large TOs.

server-threads

server-threads sets the number of threads the server runs, and therefore the maximum number of connections serviced concurrently. If more connections than this arrive at the same time, the excess connections are processed in the order received. server-threads also dictates how much memory and disk space the server as a whole needs; see Memory usage and Disk usage below.
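The queueing behaviour described above can be illustrated with a small Python sketch. This is only an analogy built on Python's standard thread pool, not the Glassfish implementation; the function run_requests and its parameters are hypothetical names chosen for the illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_requests(server_threads, n_requests, work_seconds=0.01):
    """Simulate n_requests arriving at once at a pool of server_threads workers.

    Returns (peak concurrency, order in which requests started, results).
    """
    lock = threading.Lock()
    active = 0
    peak = 0
    started = []

    def handle(request_id):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
            started.append(request_id)
        time.sleep(work_seconds)  # stand-in for servicing the connection
        with lock:
            active -= 1
        return request_id

    # Requests beyond max_workers wait in the pool's FIFO work queue.
    with ThreadPoolExecutor(max_workers=server_threads) as pool:
        futures = [pool.submit(handle, i) for i in range(n_requests)]
        results = [f.result() for f in futures]
    return peak, started, results


if __name__ == "__main__":
    peak, started, _ = run_requests(server_threads=2, n_requests=6)
    print("peak concurrency:", peak)
    print("start order:", started)
```

With a single worker the start order matches the submission order exactly; with more workers the peak concurrency never exceeds the configured thread count.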

min-memory and max-memory

min-memory and max-memory set the minimum and maximum memory that the Glassfish server, as a whole, will use (i.e. they are JVM parameters). It is assumed that no other services run on the Glassfish server.

Memory usage

The workspace currently uses up to 400MB per call for saving data:

Amount  Use                                    If exceeded
100MB   Storage of the raw RPC data as bytes.  RPC call is dumped to disk.
100MB   Storage of sorted, relabeled TOs.      All TO data is dumped to disk.
200MB   Memory for sorting and intermediate    Intermediate data is dumped to disk, or an
        data per TO (processed serially).      error is returned if sorting takes too
        See Notes on sorting.                  much memory.

Returning data is simpler: 300MB is allocated for all TO data, and any TO data exceeding this limit is dumped to disk.

Provenance and user-provided metadata are not included in these limits but are expected to be small.

Thus, to be safe, the minimum memory for the server should be set to 500MB per thread, which leaves headroom above the 400MB per-call budget for saving data (hence the default of 10GB for a 20-thread server).
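The memory arithmetic can be checked with a short Python sketch. The budget figures are copied from this section; the constant and function names are hypothetical, and the sketch assumes each server thread services one call at a time.

```python
# Per-call memory budgets in MB, copied from the Memory usage table.
SAVE_BUDGET_MB = {
    "raw_rpc_bytes": 100,
    "sorted_relabeled_tos": 100,
    "sort_and_intermediate": 200,
}
RETURN_BUDGET_MB = 300          # allocation for returning TO data
PER_THREAD_ALLOWANCE_MB = 500   # recommended allowance per server thread


def worst_case_per_call_mb() -> int:
    """Largest budget a single call can claim: a save or a return."""
    return max(sum(SAVE_BUDGET_MB.values()), RETURN_BUDGET_MB)


def recommended_min_memory_mb(server_threads: int) -> int:
    """Suggested min-memory value (in MB) for a given server-threads value."""
    return server_threads * PER_THREAD_ALLOWANCE_MB


if __name__ == "__main__":
    print(worst_case_per_call_mb())         # 400
    print(recommended_min_memory_mb(20))    # 10000
```

If server-threads is changed from its default, the same multiplication gives a starting point for min-memory.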

Note

In the future, we hope to add a thread queue that detects free memory so that more threads can run when the memory load is not high (which is expected to be the case most of the time).

Disk usage

Disk usage is currently configured to use up to 3GB per call for saving data.

Amount  Use                                    If exceeded
1GB     Storage of the raw RPC data as bytes.  The server throws an error.
1GB     Storage of sorted, relabeled TOs.      The server throws an error.
1GB     Storage of intermediate sort files.    The server throws an error.

Returning data is configured to use no more than 2GB. Thus, to be perfectly safe, 4GB of temporary disk space should be allocated per server thread (hence 80GB for a 20-thread server).
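The same kind of calculation applies to sizing the temp-dir volume. The sketch below uses the disk limits stated in this section; the names are hypothetical, and it again assumes one call per server thread at a time.

```python
# Per-call disk limits in GB, copied from the Disk usage table.
SAVE_DISK_GB = {
    "raw_rpc_bytes": 1,
    "sorted_relabeled_tos": 1,
    "intermediate_sort_files": 1,
}
RETURN_DISK_GB = 2        # limit for returning data
PER_THREAD_DISK_GB = 4    # recommended allowance per server thread


def worst_case_disk_per_call_gb() -> int:
    """Largest amount of temporary disk a single call can use."""
    return max(sum(SAVE_DISK_GB.values()), RETURN_DISK_GB)


def recommended_temp_disk_gb(server_threads: int) -> int:
    """Suggested temp-dir capacity (in GB) for a given server-threads value."""
    return server_threads * PER_THREAD_DISK_GB


if __name__ == "__main__":
    print(worst_case_disk_per_call_gb())    # 3
    print(recommended_temp_disk_gb(20))     # 80
```

The 4GB allowance sits 1GB above the 3GB worst case for a save, so the result is a conservative figure rather than a hard minimum.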