Metrics

Auth-O-Tron exposes Prometheus-compatible metrics on a dedicated port for monitoring and observability, following the ECMWF Codex Observability guidelines.

Endpoint

Metrics are served at:

GET /metrics

By default, the endpoint listens on port 9090 and responds in the Prometheus text exposition format.
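As a rough illustration of the text exposition format, the Python sketch below parses sample lines into a name, a label set, and a value. The sample payload, its values, and the host in `fetch_metrics` are assumptions for illustration, not captured Auth-O-Tron output. The parser is deliberately minimal: it ignores timestamps and does not handle every escaping rule, so a maintained Prometheus client library would be the safer choice in production.

```python
import re
from urllib.request import urlopen

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces: name="value" (backslash escapes tolerated).
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

# Hypothetical payload; the realm name and counts are invented.
SAMPLE = """\
# HELP auth_requests_total Total authentication requests
# TYPE auth_requests_total counter
auth_requests_total{result="success",realm="default"} 42
auth_requests_total{result="all_failed",realm="default"} 3
"""


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and # HELP / # TYPE comments.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url="http://localhost:9090/metrics"):
    """Fetch the raw metrics text. The host is an assumption; 9090 is the default port."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

Against a running instance, `parse_metrics(fetch_metrics())` would apply the same parsing to live output.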

Metric Families

Metric Name	Type	Labels	Description
`auth_requests_total`	Counter	result, realm	Total authentication requests
`auth_duration_seconds`	Histogram	result, realm	Authentication latency distribution
`auth_provider_attempts_total`	Counter	provider_name, provider_type, realm, result	Attempts per authentication provider
`auth_provider_duration_seconds`	Histogram	provider_name, provider_type, realm	Provider-specific latency
`augmenter_attempts_total`	Counter	augmenter_name, augmenter_type, realm, result	Token augmentation attempts
`augmenter_duration_seconds`	Histogram	augmenter_type, realm	Augmentation latency

Label Values

result (auth_requests_total): success, no_auth_header, invalid_header, all_failed

result (auth_provider_attempts_total): success, error, timeout

result (augmenter_attempts_total): success, error

realm: The configured authentication realm name, or unknown when the request fails before realm resolution

provider_name: Identifier for the authentication provider

provider_type: Type of provider (plain, jwt, ecmwf-api, efas-api, openid-offline, ecmwf-token-generator)

augmenter_name: Identifier for the token augmenter

augmenter_type: Type of augmenter

PromQL Examples

Request Rate

rate(auth_requests_total[5m])
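The value returned by rate() is a per-second average over the window, so any threshold compared against it is expressed in requests per second. The sketch below shows the idea on two hypothetical counter samples. It is a simplification: it ignores the extrapolation Prometheus applies at the window edges and accounts for at most one counter reset.

```python
def per_second_rate(first, last):
    """Approximate rate() from two (timestamp_seconds, counter_value) samples."""
    (t0, v0), (t1, v1) = first, last
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    # A counter that went down was reset (for example by a restart), so the
    # increase since the reset is simply the new value.
    increase = v1 - v0 if v1 >= v0 else v1
    return increase / (t1 - t0)


# Hypothetical scrapes of auth_requests_total taken 5 minutes apart.
print(per_second_rate((0, 1200), (300, 1500)))  # 1.0
```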

Authentication Latency (99th Percentile)

histogram_quantile(0.99, rate(auth_duration_seconds_bucket[5m]))
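Because histogram_quantile() works from cumulative bucket counts rather than raw observations, its result is an estimate interpolated within a bucket, and its accuracy depends on where the bucket boundaries lie. The Python sketch below mirrors that idea in simplified form. The function name `estimate_quantile` is hypothetical, and the bucket bounds and counts are made up for illustration; they are not Auth-O-Tron's actual bucket layout.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate a quantile from (upper_bound, cumulative_count) buckets.

    Simplified version of the linear interpolation used by histogram
    quantile estimation; the bucket list must include the +Inf bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile lies beyond the last finite bucket, so the
                # best available answer is the highest finite bound.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count


# Hypothetical latency buckets in seconds with cumulative counts.
BUCKETS = [(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)]
print(round(estimate_quantile(0.95, BUCKETS), 3))  # 0.778
```

With these invented buckets, the 95th percentile falls between the 0.5 s and 1.0 s bounds, so the estimate can never be more precise than that bucket's width.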

Error Rate by Realm

sum by (realm) (rate(auth_requests_total{result!="success"}[5m]))

Slow Providers (95th Percentile)

histogram_quantile(0.95, rate(auth_provider_duration_seconds_bucket[5m]))

Provider Success Rate

rate(auth_provider_attempts_total{result="success"}[5m])

Alerting Recommendations

Consider alerting on:

High error rates (more than 0.1 failed requests per second): rate(auth_requests_total{result!="success"}[5m]) > 0.1
Elevated latency (95th percentile above 1 second): histogram_quantile(0.95, rate(auth_duration_seconds_bucket[5m])) > 1.0
Provider failures (any provider error or timeout): rate(auth_provider_attempts_total{result=~"error|timeout"}[5m]) > 0