Metrics¶
The services of PLOSSYS 5 expose several custom metrics that provide insight into their operation and performance and can help identify potential issues. These metrics are exposed at the /metrics endpoint of each service and can be scraped by Prometheus. You can visualize them with tools like Grafana by creating dashboards that provide a real-time view of the system's performance.
The following is a brief overview of the custom metrics for each service. For in-depth information, refer to Custom Metrics JSON below.
For more information about the overall health of your servers, refer to Default Metrics.
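As a rough illustration of what a scrape of the /metrics endpoint returns, the following minimal sketch parses a small sample in the Prometheus text exposition format. The sample values, the label values, and the helper name parse_metrics are assumptions made for this example. The parser is deliberately simplified: it ignores timestamps and does not unescape label values.

```python
import re

# Sample scrape output in the Prometheus text exposition format.
# The label values and numbers are invented for illustration.
SAMPLE = """\
# HELP p5_job_state Gauge for job states per service
# TYPE p5_job_state gauge
p5_job_state{state="processing"} 3
p5_job_state{state="waiting"} 12
# HELP p5_job_arrived Counter for arrived P5 jobs
# TYPE p5_job_arrived counter
p5_job_arrived 1520
"""

# metric name, optional {label="value",...} block, then the sample value
LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


samples = parse_metrics(SAMPLE)
for name, labels, value in samples:
    print(name, labels, value)
```

For a live service, the text could be fetched from the service's /metrics endpoint (for example with urllib.request.urlopen) and passed to parse_metrics. TLS and access settings depend on your installation.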
Custom Service Metrics¶
seal-ipp-checkin¶
The seal-ipp-checkin service is the entry point for IPP jobs. It exposes custom metrics for file size and check-in time that help to spot irregularities in the processing of new jobs.
- p5_job_state: This gauge metric represents the states of jobs per service. It has a label state which indicates the current state of the job.
- p5_job_arrived: This counter metric represents the number of arrived PLOSSYS 5 jobs.
- p5_job_accepted: This counter metric represents the number of accepted PLOSSYS 5 jobs.
- p5_job_received: This counter metric represents the number of received PLOSSYS 5 jobs.
- p5_job_filesize_bytes: This is a histogram metric for the file size of incoming jobs. It has predefined buckets for different file sizes.
- p5_job_received_accepted_milliseconds: This is a histogram metric for the time between a job being received and being accepted. It has predefined buckets for different time durations.
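Prometheus histograms are cumulative: each bucket counts all observations less than or equal to its upper bound, and an additional +Inf bucket counts every observation. The following minimal sketch shows how a few job sizes would be distributed across the p5_job_filesize_bytes bucket bounds listed in Custom Metrics JSON. The job sizes and the helper name cumulative_buckets are hypothetical and only serve the illustration.

```python
# Upper bounds in bytes, taken from the p5_job_filesize_bytes entry
# in Custom Metrics JSON.
BUCKETS = [500_000, 1_000_000, 5_000_000, 25_000_000, 50_000_000, 100_000_000]


def cumulative_buckets(observations, bounds=BUCKETS):
    """Count observations per bucket the way a Prometheus histogram reports them."""
    counts = {str(bound): 0 for bound in bounds}
    counts["+Inf"] = 0
    for value in observations:
        for bound in bounds:
            if value <= bound:
                counts[str(bound)] += 1  # cumulative: counted in every larger bucket too
        counts["+Inf"] += 1  # every observation falls into the +Inf bucket
    return counts


# Hypothetical job sizes in bytes.
job_sizes = [120_000, 750_000, 3_000_000, 60_000_000, 250_000_000]
counts = cumulative_buckets(job_sizes)
for bound, count in counts.items():
    print(f"le={bound}: {count}")
```

A growing gap between the largest finite bucket and +Inf would indicate jobs larger than 100 MB, which the finite buckets cannot resolve any further.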
seal-convert-dispatcher¶
The seal-convert-dispatcher dispatches document conversions to other convert services. It exposes the job state to allow monitoring of the preprocessing rate, and the p5_printer_locks metric to monitor which services currently own the lock for a specific printer.
- p5_job_state: This gauge metric represents the states of jobs per service. It has a label state which indicates the current state of the job.
- p5_printer_locks: This gauge metric represents the number of printer locks per service.
seal-convert-controller¶
The seal-convert-controller handles job and printer statuses and dispatches print jobs via streaming to follow-up services. In addition to p5_job_state and p5_printer_locks, it exposes p5_printer_queue_length to allow monitoring of the current state of the queues in the system.
- p5_job_state: This gauge metric represents the states of jobs per service. It has a label state which indicates the current state of the job.
- p5_printer_locks: This gauge metric represents the number of printer locks per service.
- p5_printer_queue_length: This gauge metric represents the length of the printer queue. It has a label printername which indicates the name of the printer.
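Because p5_printer_queue_length carries one sample per printername label, a simple check can single out printers whose queues are building up. The following is a minimal sketch under stated assumptions: the printer names, queue lengths, threshold, and the helper name long_queues are all hypothetical.

```python
def long_queues(samples, threshold):
    """Return {printername: length} for queues longer than threshold, longest first."""
    over = {
        labels["printername"]: value
        for labels, value in samples
        if value > threshold
    }
    return dict(sorted(over.items(), key=lambda item: item[1], reverse=True))


# Hypothetical p5_printer_queue_length samples as (labels, value) pairs.
queue_samples = [
    ({"printername": "printer-a"}, 2),
    ({"printername": "printer-b"}, 41),
    ({"printername": "printer-c"}, 17),
]

backlog = long_queues(queue_samples, threshold=10)
for printer, length in backlog.items():
    print(f"{printer}: {length} queued jobs")
```

In practice, a check like this would more likely be expressed as an alerting rule or a dashboard panel. The sketch only illustrates how the label separates the queues.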
seal-co-notifier¶
The seal-co-notifier sends job status notifications to SAP and IPP backend systems. It exposes the p5_sap_notifications_total metric to allow monitoring of successfully sent notifications to SAP systems.
- p5_sap_notifications_total: This counter metric represents the total number of successfully sent SAP notifications. It has a label destination which indicates the destination of the notification.
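A counter such as p5_sap_notifications_total only ever increases until the service restarts, so what is usually of interest is the increase between two scrapes. The following minimal sketch computes that increase per destination and treats a drop in value as a restart. The destination names, the values, and the helper names are hypothetical.

```python
def counter_increase(previous, current):
    """Increase of a counter between two scrapes, treating a drop as a restart."""
    if current < previous:
        # The counter was reset to 0 and has counted up to `current` since.
        return current
    return current - previous


def increase_per_destination(before, after):
    """Compute the increase for every destination seen in the later scrape."""
    return {
        destination: counter_increase(before.get(destination, 0), value)
        for destination, value in after.items()
    }


# Hypothetical counter values per destination label from two consecutive scrapes.
before = {"SAP1": 100, "SAP2": 40}
after = {"SAP1": 130, "SAP2": 5, "SAP3": 7}

increase = increase_per_destination(before, after)
print(increase)
```

A destination whose increase stays at zero while jobs are still being processed may be worth a closer look.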
seal-copier¶
The seal-copier is responsible for creating copies for printers that are not able to do so themselves. It exposes the p5_copies_total metric to allow monitoring of the total number of copies.
- p5_copies_total: This counter metric represents the total number of copies.
Custom Metrics JSON¶
[
  {
    "service": "seal-ipp-checkin",
    "metricsUrl": "https://<host>:632/metrics",
    "metrics": [
      {
        "name": "p5_job_state",
        "help": "Gauge for job states per service",
        "type": "gauge",
        "labels": ["state"]
      },
      {
        "name": "p5_job_arrived",
        "help": "Counter for arrived P5 jobs",
        "type": "counter"
      },
      {
        "name": "p5_job_accepted",
        "help": "Counter for accepted P5 jobs",
        "type": "counter"
      },
      {
        "name": "p5_job_received",
        "help": "Counter for received P5 jobs",
        "type": "counter"
      },
      {
        "name": "p5_job_filesize_bytes",
        "help": "Histogram for filesize of incoming jobs",
        "type": "histogram",
        "buckets": ["500_000", "1_000_000", "5_000_000", "25_000_000", "50_000_000", "100_000_000"]
      },
      {
        "name": "p5_job_received_accepted_milliseconds",
        "help": "Histogram for time between received and accepted",
        "type": "histogram",
        "buckets": ["100", "250", "500", "1000", "2500", "5000", "10000"]
      }
    ]
  },
  {
    "service": "seal-convert-dispatcher",
    "metricsUrl": "https://<host>:1958/metrics",
    "metrics": [
      {
        "name": "p5_job_state",
        "help": "Gauge for job states per service",
        "type": "gauge",
        "labels": ["state"]
      },
      {
        "name": "p5_printer_locks",
        "help": "Gauge for printer locks per service",
        "type": "gauge"
      }
    ]
  },
  {
    "service": "seal-controller",
    "metricsUrl": "https://<host>:1969/metrics",
    "metrics": [
      {
        "name": "p5_job_state",
        "help": "Gauge for job states per service",
        "type": "gauge",
        "labels": ["state"]
      },
      {
        "name": "p5_printer_locks",
        "help": "Gauge for printer locks per service",
        "type": "gauge"
      },
      {
        "name": "p5_printer_queue_length",
        "help": "Gauge for printer queue length",
        "type": "gauge",
        "labels": ["printername"]
      }
    ]
  },
  {
    "service": "seal-co-notifier",
    "metricsUrl": "https://<host>:2098/metrics",
    "metrics": [
      {
        "name": "p5_sap_notifications_total",
        "help": "Counter for successfully sent SAP notifications",
        "type": "counter",
        "labels": ["destination"]
      }
    ]
  },
  {
    "service": "seal-copier",
    "metricsUrl": "https://<host>:2028/metrics",
    "metrics": [
      {
        "name": "p5_copies_total",
        "help": "Total number of copies",
        "type": "counter"
      }
    ]
  }
]
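The JSON above can also be processed by a script, for example to derive scrape targets for a monitoring configuration. The following minimal sketch works on a shortened excerpt of the JSON. The host name plossys.example.com and the helper names scrape_targets and bucket_bounds are assumptions for this example. The bucket values are stored as strings with underscores, which Python's int() accepts directly.

```python
import json
from urllib.parse import urlsplit

# Shortened excerpt of the Custom Metrics JSON (two services, fewer buckets).
CUSTOM_METRICS = json.loads("""
[
  {
    "service": "seal-ipp-checkin",
    "metricsUrl": "https://<host>:632/metrics",
    "metrics": [
      {"name": "p5_job_filesize_bytes", "type": "histogram",
       "buckets": ["500_000", "1_000_000", "5_000_000"]}
    ]
  },
  {
    "service": "seal-copier",
    "metricsUrl": "https://<host>:2028/metrics",
    "metrics": [{"name": "p5_copies_total", "type": "counter"}]
  }
]
""")


def scrape_targets(definitions, host):
    """Map each service name to a 'host:port' scrape target."""
    targets = {}
    for entry in definitions:
        # Fill in the <host> placeholder before parsing the URL.
        url = urlsplit(entry["metricsUrl"].replace("<host>", host))
        targets[entry["service"]] = f"{url.hostname}:{url.port}"
    return targets


def bucket_bounds(metric):
    """Convert bucket strings such as '500_000' to integers."""
    return [int(bound) for bound in metric.get("buckets", [])]


# plossys.example.com is a placeholder host name.
targets = scrape_targets(CUSTOM_METRICS, "plossys.example.com")
bounds = bucket_bounds(CUSTOM_METRICS[0]["metrics"][0])
print(targets)
print(bounds)
```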
Default Metrics¶
In addition to the custom metrics provided by the services, you can use the standard metrics provided by the Node.js prom-client to measure the overall health of your servers.
[
  {
    "name": "process_cpu_user_seconds_total",
    "help": "Total user CPU time spent in seconds.",
    "type": "counter"
  },
  {
    "name": "process_cpu_system_seconds_total",
    "help": "Total system CPU time spent in seconds.",
    "type": "counter"
  },
  {
    "name": "process_cpu_seconds_total",
    "help": "Total user and system CPU time spent in seconds.",
    "type": "counter"
  },
  {
    "name": "process_start_time_seconds",
    "help": "Start time of the process since unix epoch in seconds.",
    "type": "gauge"
  },
  {
    "name": "process_resident_memory_bytes",
    "help": "Resident memory size in bytes.",
    "type": "gauge"
  },
  {
    "name": "process_virtual_memory_bytes",
    "help": "Virtual memory size in bytes.",
    "type": "gauge"
  },
  {
    "name": "process_heap_bytes",
    "help": "Process heap size in bytes.",
    "type": "gauge"
  },
  {
    "name": "process_open_fds",
    "help": "Number of open file descriptors.",
    "type": "gauge"
  },
  {
    "name": "process_max_fds",
    "help": "Maximum number of open file descriptors.",
    "type": "gauge"
  },
  {
    "name": "nodejs_eventloop_lag_seconds",
    "help": "Lag of event loop in seconds.",
    "type": "gauge"
  },
  {
    "name": "nodejs_eventloop_lag_min_seconds",
    "help": "The minimum recorded event loop delay.",
    "type": "gauge"
  },
  {
    "name": "nodejs_eventloop_lag_max_seconds",
    "help": "The maximum recorded event loop delay.",
    "type": "gauge"
  },
  {
    "name": "nodejs_eventloop_lag_mean_seconds",
    "help": "The mean of the recorded event loop delays.",
    "type": "gauge"
  },
  {
    "name": "nodejs_eventloop_lag_stddev_seconds",
    "help": "The standard deviation of the recorded event loop delays.",
    "type": "gauge"
  },
  {
    "name": "nodejs_eventloop_lag_p50_seconds",
    "help": "The 50th percentile of the recorded event loop delays.",
    "type": "gauge"
  },
  {
    "name": "nodejs_eventloop_lag_p90_seconds",
    "help": "The 90th percentile of the recorded event loop delays.",
    "type": "gauge"
  },
  {
    "name": "nodejs_eventloop_lag_p99_seconds",
    "help": "The 99th percentile of the recorded event loop delays.",
    "type": "gauge"
  },
  {
    "name": "nodejs_active_resources",
    "help": "Number of active resources that are currently keeping the event loop alive, grouped by async resource type.",
    "type": "gauge",
    "labels": ["type"]
  },
  {
    "name": "nodejs_active_resources_total",
    "help": "Total number of active resources.",
    "type": "gauge"
  },
  {
    "name": "nodejs_active_handles",
    "help": "Number of active libuv handles grouped by handle type. Every handle type is C++ class name.",
    "type": "gauge",
    "labels": ["type"]
  },
  {
    "name": "nodejs_active_handles_total",
    "help": "Total number of active handles.",
    "type": "gauge"
  },
  {
    "name": "nodejs_active_requests",
    "help": "Number of active libuv requests grouped by request type. Every request type is C++ class name.",
    "type": "gauge",
    "labels": ["type"]
  },
  {
    "name": "nodejs_active_requests_total",
    "help": "Total number of active requests.",
    "type": "gauge"
  },
  {
    "name": "nodejs_heap_size_total_bytes",
    "help": "Process heap size from Node.js in bytes.",
    "type": "gauge"
  },
  {
    "name": "nodejs_heap_size_used_bytes",
    "help": "Process heap size used from Node.js in bytes.",
    "type": "gauge"
  },
  {
    "name": "nodejs_external_memory_bytes",
    "help": "Nodejs external memory size in bytes.",
    "type": "gauge"
  },
  {
    "name": "nodejs_heap_space_size_total_bytes",
    "help": "Process heap space size total from Node.js in bytes.",
    "type": "gauge",
    "labels": ["space"]
  },
  {
    "name": "nodejs_heap_space_size_used_bytes",
    "help": "Process heap space size used from Node.js in bytes.",
    "type": "gauge",
    "labels": ["space"]
  },
  {
    "name": "nodejs_heap_space_size_available_bytes",
    "help": "Process heap space size available from Node.js in bytes.",
    "type": "gauge",
    "labels": ["space"]
  },
  {
    "name": "nodejs_version_info",
    "help": "Node.js version info.",
    "type": "gauge",
    "labels": ["version", "major", "minor", "patch", "release", "lts"]
  },
  {
    "name": "nodejs_gc_duration_seconds",
    "help": "Garbage collection duration by kind, one of major, minor, incremental or weakcb.",
    "type": "histogram",
    "labels": ["kind"]
  },
  {
    "name": "http_request_duration_seconds",
    "help": "duration histogram of http responses labeled with: status_code, method, path",
    "type": "histogram",
    "labels": ["status_code", "method", "path"]
  },
  {
    "name": "up",
    "help": "1 = up, 0 = not up",
    "type": "gauge"
  }
]
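Several of the default metrics are most useful in combination. The following minimal sketch derives heap usage, file descriptor usage, and process uptime from a single scrape. The sample values and the helper names are hypothetical, and the sketch assumes the metric values have already been collected into a dictionary keyed by metric name.

```python
import time


def heap_usage_ratio(metrics):
    """Fraction of the Node.js heap that is currently in use."""
    return metrics["nodejs_heap_size_used_bytes"] / metrics["nodejs_heap_size_total_bytes"]


def fd_usage_ratio(metrics):
    """Fraction of the allowed file descriptors that are currently open."""
    return metrics["process_open_fds"] / metrics["process_max_fds"]


def uptime_seconds(metrics, now=None):
    """Seconds since the process started, based on process_start_time_seconds."""
    now = time.time() if now is None else now
    return now - metrics["process_start_time_seconds"]


# Hypothetical values from one scrape of a service.
scrape = {
    "nodejs_heap_size_used_bytes": 60_000_000,
    "nodejs_heap_size_total_bytes": 80_000_000,
    "process_open_fds": 32,
    "process_max_fds": 1024,
    "process_start_time_seconds": 1_700_000_000,
}

print(f"heap usage: {heap_usage_ratio(scrape):.0%}")
print(f"fd usage:   {fd_usage_ratio(scrape):.2%}")
print(f"uptime:     {uptime_seconds(scrape, now=1_700_003_600):.0f} s")
```

A frequently changing process_start_time_seconds points to service restarts, which is also the situation in which counters start again from zero.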