Temporal Cloud Observability and Metrics

Temporal offers two distinct sources of metrics: Cloud/Server Metrics and SDK Metrics. Each source provides options for levels of granularity and filtering, monitoring-tool integrations, and configuration. Before implementing Temporal Cloud observability, decide what you need to measure for your use case. There are two primary use cases for metrics:

  • To measure the health and performance of Temporal-backed applications and key business processes.
  • To measure the health and performance of Temporal infrastructure and user-provided infrastructure in the form of Temporal Workers and Temporal Clients.

When measuring the performance of Temporal-backed applications and key business processes, you should rely on Temporal SDK metrics as a source of truth. This is because Temporal SDKs provide visibility from the perspective of your application, not from the perspective of the Temporal Service.

SDK metrics monitor individual Workers and your code's behavior. Cloud metrics monitor the behavior of the Temporal Cloud Service. Used together, Cloud and SDK metrics measure the health and performance of your full Temporal infrastructure, including the Temporal Cloud Service and user-supplied Temporal Workers.

Use the following rule of thumb when deciding which signal to rely on:

| Question | Primary signal |
| --- | --- |
| Is Temporal Cloud accepting and serving work normally? | Cloud metrics |
| Are Tasks backing up in a Task Queue? | Cloud metrics plus SDK Schedule-To-Start metrics |
| Are my Workers saturated, under-provisioned, or misconfigured? | SDK metrics |
| Is my application logic, downstream dependency, or Activity behavior unhealthy? | SDK metrics and traces |

For a Worker-focused view of how to combine these signals, see Monitor worker health.
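
As a rough illustration of the rule of thumb above, the following Python sketch maps a snapshot of readings to the signal to investigate first. It is not part of any Temporal API: the `Snapshot` fields, the `primary_signal` function, and every threshold are hypothetical names and values chosen for illustration, and the readings are assumed to have been collected already from your Cloud and SDK metrics.

```python
from dataclasses import dataclass

# Hypothetical thresholds, for illustration only; tune them to your workload.
CLOUD_ERROR_LIMIT = 0.01         # fraction of failed requests (Cloud metrics)
BACKLOG_LIMIT = 100              # Tasks waiting in a Task Queue (Cloud metrics)
SCHEDULE_TO_START_LIMIT_S = 1.0  # p95 latency in seconds (SDK metrics)
ACTIVITY_FAILURE_LIMIT = 0.05    # fraction of failed Activities (SDK metrics)


@dataclass
class Snapshot:
    """Readings assumed to be gathered from your monitoring tool."""
    cloud_error_rate: float         # from Cloud metrics
    cloud_backlog: int              # from Cloud metrics
    schedule_to_start_p95_s: float  # from SDK metrics
    activity_failure_rate: float    # from SDK metrics


def primary_signal(s: Snapshot) -> str:
    """Return the signal to look at first, following the rule of thumb."""
    # Is Temporal Cloud accepting and serving work normally?
    if s.cloud_error_rate > CLOUD_ERROR_LIMIT:
        return "Cloud metrics"
    # Are Tasks backing up in a Task Queue?
    if s.cloud_backlog > BACKLOG_LIMIT:
        return "Cloud metrics plus SDK Schedule-To-Start metrics"
    # Are Workers saturated, under-provisioned, or misconfigured?
    if s.schedule_to_start_p95_s > SCHEDULE_TO_START_LIMIT_S:
        return "SDK metrics"
    # Is application logic or Activity behavior unhealthy?
    if s.activity_failure_rate > ACTIVITY_FAILURE_LIMIT:
        return "SDK metrics and traces"
    return "no action needed"


if __name__ == "__main__":
    reading = Snapshot(0.0, 500, 0.2, 0.0)
    print(primary_signal(reading))
```

The checks run in the same order as the questions in the rule of thumb, so a service-level problem is ruled out before Worker or application causes are considered.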

Cloud Metrics for all Namespaces in your account are available from two sources:

Note: OpenMetrics is the recommended option for most users.

For setting up SDK metrics emitted by your Workers and Clients, see SDK metrics setup.