Overview
The various module services each serve different kinds of "users":
- `ocrd-logview` is an administrator tool showing live Docker logs for all containers.
- `ocrd-controller` is (currently) a dedicated SSH server allowing a user `ocrd` to log in and run shell scripts which involve OCR-D workflows composed of various OCR-D processor calls. (An external instance will behave the same, without being controlled as a service here.)
- `ocrd-manager` is (currently) a generic SSH server allowing a user `ocrd` to log in and run shell scripts which involve OCR-D tasks (that will usually delegate to `ocrd-controller`).
- `kitodo-app` is an instance of Kitodo.Production with some OCR-D specific, optional extensions, and some example data which aids in our kick-start demonstration. (An external instance will behave the same, without being controlled as a service here, but with your actual data, and probably without the extensions.)
- `ocrd-monitor` provides a webserver for monitoring jobs and logs, for inspecting results and workflows, and for customising and re-running workflows.
Thus, only the latter two (`kitodo-app` and `ocrd-monitor`) could be considered tools for "end users".
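
Since both ocrd-controller and ocrd-manager are described above as SSH servers for the user `ocrd`, a client-side call can be sketched roughly as follows. This is only an illustration, not the project's own tooling: the helper `build_ssh_command`, the host name, the port and the script name are all hypothetical placeholders; only the user name `ocrd` is taken from the text above.

```python
import shlex


def build_ssh_command(host, port, script, *args, user="ocrd"):
    """Return an argv list that runs a remote shell script via the OpenSSH client."""
    # Quote the remote command line so that arguments containing spaces
    # survive word splitting by the remote shell.
    remote = shlex.join([script, *args])
    return ["ssh", "-p", str(port), f"{user}@{host}", remote]


# Hypothetical host, port and script name, for illustration only.
cmd = build_ssh_command("manager.example.org", 9022, "my_ocr_task.sh", "my process dir")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` as is; which scripts actually exist on the server depends on your installation.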
Data
Depending on how exactly you set up your module services, they will be configured to use a number of specific filesystem paths from your host system as persistent volumes inside the service containers. What follows is a description of the configuration variables, along with their respective default values and an explanation of their role and lifetime.
(Again, if you have disabled said modules, then the same applies, but to their respective remote hosts.)
- `CONTROLLER_MODELS=./ocrd/controller/models`:
  - `$CONTROLLER_MODELS/ocrd-resources`: persistent storage directory for processor resources
- `CONTROLLER_CONFIG=./ocrd/controller/config`:
  - `$CONTROLLER_CONFIG/ocrd/resources.yml`: persistent database for processor resources
- `CONTROLLER_DATA=./ocrd/controller/data`:
  - `$CONTROLLER_DATA/KitodoJob*`: temporary storage for OCR-D workspaces during OCR processing
    (all images and METS will be copied here; gets filled with OCR results; to be removed after the job is done)
- `MANAGER_DATA=./kitodo/data/metadata`:
  - `$MANAGER_DATA/ocr-d/*`: transient storage for OCR-D workspaces between first OCR request and final success
    (all images will be copied here, cloned/CoW/reflink if possible; METS is created here; OCR results will be copied here; can be re-used if the OCR job failed, e.g. by re-entering with a different workflow; to be removed sometime after the OCR job was successful and no user interaction followed)
  - `$MANAGER_DATA/*`: directories for input (images or METS) and output (ALTO) files
    (shared with Kitodo.Production or Kitodo.Presentation; to be removed by caller)
- `MONITOR_DATA=./kitodo/data/metadata`: same as `MANAGER_DATA`, shared for data browsing
- `APP_DATA=./kitodo/data`:
  - `$APP_DATA/metadata/*`: same as `MANAGER_DATA`, shared for i/o
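
The defaults and derived paths listed above can be summarised in a small sketch. It is illustrative only and not part of the project: the helper name `resolve_paths` is hypothetical, and it assumes nothing more than that each variable falls back to the default given above when it is not set in the environment.

```python
import os
from pathlib import PurePosixPath

# Default values exactly as listed above.
DEFAULTS = {
    "CONTROLLER_MODELS": "./ocrd/controller/models",
    "CONTROLLER_CONFIG": "./ocrd/controller/config",
    "CONTROLLER_DATA": "./ocrd/controller/data",
    "MANAGER_DATA": "./kitodo/data/metadata",
    "MONITOR_DATA": "./kitodo/data/metadata",
    "APP_DATA": "./kitodo/data",
}

# Sub-paths (relative to each variable) described above; "*" is kept as a glob.
SUBPATHS = {
    "CONTROLLER_MODELS": ["ocrd-resources"],
    "CONTROLLER_CONFIG": ["ocrd/resources.yml"],
    "CONTROLLER_DATA": ["KitodoJob*"],
    "MANAGER_DATA": ["ocr-d/*", "*"],
    "MONITOR_DATA": [],
    "APP_DATA": ["metadata/*"],
}


def resolve_paths(env=None):
    """Map each variable to its path patterns, honouring overrides in `env`."""
    env = os.environ if env is None else env
    resolved = {}
    for name, default in DEFAULTS.items():
        # PurePosixPath normalises the value, e.g. it drops a leading "./".
        base = PurePosixPath(env.get(name, default))
        # Variables without listed sub-paths resolve to the base directory itself.
        resolved[name] = [str(base / sub) for sub in SUBPATHS[name]] or [str(base)]
    return resolved


for name, patterns in resolve_paths().items():
    print(name, patterns)
```

With no overrides, the `APP_DATA` entry and the second `MANAGER_DATA` entry both resolve to `kitodo/data/metadata/*`, which reflects the sharing between the Kitodo application and the manager described above.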