Azure Logic Apps Monitoring with OpenTelemetry

Why Scout for Azure Logic Apps observability?

Microsoft now emits OTel-shaped telemetry by default through the Azure Monitor OpenTelemetry Distro and SDK 3.x. Scout consumes those exact signals via OTLP and stores them alongside your AWS, GCP, on-prem, and application telemetry in one OTel-native query surface.

Azure Monitor remains the data source for the metrics in this guide - the receiver reads from it. What changes is the destination: Scout instead of Application Insights / Log Analytics for visualization, alerting, and long-term query.

Overview

Logic Apps Consumption is a single Microsoft-managed workflow resource (Microsoft.Logic/workflows) billed per execution. End-to-end observability means the run / action / trigger lifecycle and the billable-execution count from the control plane, plus the per-action detail that aggregate metrics cannot reach.

This guide configures the azure_monitor receiver for control-plane metrics and the azure_event_hub receiver for the WorkflowRuntime Diagnostic Settings log category.

Instrumentation paths for Logic Apps

Consumption exposes exactly two telemetry paths. Pick one or both based on the table below.

| Path | What it covers | What it costs | Setup |
|---|---|---|---|
| Platform metrics - Azure Monitor (this guide) | Run / action / trigger lifecycle counts, run / action / trigger latency, and billable executions, aggregated per workflow at minute grain. Answers most "is my workflow healthy and what is it costing me?" questions on its own. Does not see which action inside a run failed. | Azure Monitor query cost: one query per metric per scrape (or one per resource with use_batch_api: true). At a 60s interval the daily cost runs in cents per workflow. | One Service Principal with Monitoring Reader on the resource group; one receiver block; one resource processor. |
| Resource logs - Diagnostic Settings → Event Hubs (this guide, §Logs) | Per-action drill-down: which action ran, its status and error code, per-trigger payload attribution, per-run forensic ordering. Optionally the full trigger / action inputs and outputs. The detail surface metrics structurally cannot provide. | One Event Hubs Basic namespace (~$11/mo at 1 TU; 1 MB/s ingress absorbs far more than a typical workflow's record rate). The Diagnostic Setting itself is free. | One Diagnostic Setting on the workflow with the WorkflowRuntime category; one Event Hubs namespace + hub + Send/Listen SAS rules; one azure_event_hub receiver fragment. |

There is no third (in-system scrape) path. Logic Apps Consumption's runtime is fully Microsoft-managed with no data-plane endpoint to scrape. The in-database or in-broker pull pattern used for PostgreSQL or Redis does not apply here. If you arrived expecting a "scrape the engine" option, that is why it is absent. Standard hosting (Microsoft.Web/sites with kind=workflowapp) exposes App Service Plan signals instead; see Azure App Service.

Which path to pick

Four decision criteria, in order of usual weight:

  1. Consumption or Standard hosting? This guide is Consumption (Microsoft.Logic/workflows). Standard emits Microsoft.Web/sites-shaped metrics plus App Service Plan saturation signals - a different namespace and a different receiver scope. Confirm which hosting model your workflow uses before configuring the receiver; mixing them silently yields an empty metric set.
  2. Is cost attribution your priority? TotalBillableExecutions is the single most actionable Consumption metric - Consumption bills per billable execution. If cost control is the whole reason you are here, the platform-metrics path alone is sufficient; you do not need logs.
  3. Do you need to know which action failed? Surface metrics aggregate. ActionsFailed tells you an action in some run failed, not which action, in which run, with what error. Debugging failed runs needs the WorkflowRuntime logs path.
  4. Are your trigger or action payloads sensitive? IncludeContent on the Diagnostic Setting captures trigger and action inputs and outputs verbatim. Leave it off for PII-bearing workflows; enable it only when you need payload-level forensics and have reviewed the privacy trade-off.

If you are starting from zero, platform metrics are the lowest-effort win and cover run health plus cost. Add the logs path when you need per-action attribution for failed or slow runs.
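
Criterion 1 can be checked from the CLI before touching any receiver config. A minimal sketch, assuming the Azure CLI is logged in to the right subscription; the resource group and workflow names are placeholders:

confirm the hosting model
# Consumption reports type Microsoft.Logic/workflows;
# Standard reports Microsoft.Web/sites with a kind containing "workflowapp"
az resource list \
  --resource-group <resource-group> \
  --query "[?name=='<workflow-name>'].{name:name, type:type, kind:kind}" \
  --output table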

What you'll monitor

The receiver scrapes the Microsoft.Logic/workflows namespace and emits per-workflow metrics under cloud.platform: azure_logic_apps. These metrics carry no Azure Monitor metadata dimensions - each is a single series per workflow, so cardinality stays minimal and there is no dimensional fan-out to budget for.

| Metric | Aggregation | What it tells you |
|---|---|---|
| RunsStarted | Total | Runs initiated. Pairs with RunsCompleted to expose in-flight backlog. |
| RunsCompleted | Total | Runs that reached a terminal state. |
| RunsSucceeded | Total | Runs that completed successfully. |
| RunsFailed | Total | Failed runs. Primary run-health SLI. |
| RunsCancelled | Total | Cancelled runs. |
| RunLatency | Average, Maximum | End-to-end run duration. Average for trend, Maximum for tail. Traffic-driven. |
| ActionsStarted | Total | Actions begun across all runs. |
| ActionsCompleted | Total | Actions that reached a terminal state. |
| ActionsSucceeded | Total | Actions that succeeded. |
| ActionsFailed | Total | Failed actions. Tells you an action failed, not which - see §Logs. |
| ActionsSkipped | Total | Actions skipped by runAfter conditions. |
| ActionLatency | Average, Maximum | Per-action duration, aggregated. Traffic-driven. |
| TriggersStarted | Total | Trigger evaluations begun. |
| TriggersCompleted | Total | Trigger evaluations completed. |
| TriggersSucceeded | Total | Successful trigger evaluations. |
| TriggersFailed | Total | Failed / rejected trigger evaluations. |
| TriggersFired | Total | Triggers that fired a run. The inbound-request counter for an HTTP Request trigger. |
| TriggersSkipped | Total | Trigger evaluations that did not fire. |
| TriggerLatency | Average, Maximum | Trigger evaluation duration. Traffic-driven. |
| TotalBillableExecutions | Total | Billable executions. The canonical Consumption cost SLI. Traffic-driven. |
| BillableActionExecutions | Total | Action-side billing split. Traffic-driven. |
| BillableTriggerExecutions | Total | Trigger-side billing split. Traffic-driven. |

Traffic-driven vs continuous. The count metrics (Runs*, Actions*, Triggers*) publish an explicit 0 every minute even at idle. The latency series (RunLatency, ActionLatency, TriggerLatency) and the three billing metrics publish data points only for minutes that contain completed runs. On a low-traffic workflow these are sparse by design - an empty RunLatency minute means no runs that minute, not a broken pipeline. When verifying the pipeline, drive traffic and read the metrics within the same active window.

Prerequisites

| Requirement | Detail |
|---|---|
| Hosting model | Consumption (Microsoft.Logic/workflows). Standard is out of scope - see Azure App Service. |
| OTel Collector Contrib | v0.151.0+ (the azure_monitor and azure_event_hub receiver names are snake_case from v0.148.0). |
| OpenTelemetry semconv | v1.41.0 (latest cloud attributes). |
| Azure CLI | 2.85+ for the az monitor diagnostic-settings flags used here. |
| Azure providers registered | Microsoft.Logic, plus Microsoft.EventHub and Microsoft.Insights for the logs path. |
| Collector runtime | See Docker Compose Setup or Kubernetes / Helm Setup; this guide adds the Logic Apps receiver + processor blocks on top. |
| Scout exporter | See Scout exporter wiring for the oauth2client extension + otlp_http/b14 exporter. This guide does not re-derive that block. |

Access setup

The metrics receiver authenticates via a Service Principal scoped to the resource group:

| Role | Scope | Reason |
|---|---|---|
| Monitoring Reader | Resource group containing the workflow | Lets the receiver list metric definitions and read metric values for Microsoft.Logic/workflows. |

The logs path does not need a second role assignment. The azure_event_hub receiver consumes the diagnostic hub through the namespace-scoped SAS rule's Listen permission carried in its connection string; no Azure Event Hubs Data Receiver role on the namespace is required unless you switch the receiver to Azure AD auth.

If you reuse one Service Principal across many surfaces, the assignment is idempotent - re-running it on an already-granted SP is a no-op. The grant typically propagates in under 30 seconds; the first scrape after a fresh assignment may return 403 AuthorizationFailed, which self-clears on the next 60s cycle.
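
The grant itself is one command. A sketch, assuming an existing Service Principal; the IDs are placeholders:

grant Monitoring Reader
az role assignment create \
  --assignee "<sp-app-id>" \
  --role "Monitoring Reader" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"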

Receiver configuration

Add the following alongside whatever already wires azure_auth and the Scout exporter. You do not need to duplicate the oauth2client extension or the otlp_http/b14 exporter; those live in the shared base config per Scout exporter wiring.

otel-collector.yaml (excerpt)
receivers:
  azure_monitor/logicapps:
    subscription_ids:
      - ${env:AZURE_SUBSCRIPTION_ID}
    resource_groups:
      - ${env:LOGICAPPS_RESOURCE_GROUP}
    services:
      - Microsoft.Logic/workflows
    auth:
      authenticator: azure_auth
    collection_interval: 60s
    initial_delay: 1s
    use_batch_api: false
    cache_resources: 86400
    dimensions:
      enabled: true
    metrics:
      "Microsoft.Logic/workflows":
        RunsStarted: [Total]
        RunsCompleted: [Total]
        RunsSucceeded: [Total]
        RunsFailed: [Total]
        RunsCancelled: [Total]
        RunLatency: [Average, Maximum]
        ActionsStarted: [Total]
        ActionsCompleted: [Total]
        ActionsSucceeded: [Total]
        ActionsFailed: [Total]
        ActionsSkipped: [Total]
        ActionLatency: [Average, Maximum]
        TriggersStarted: [Total]
        TriggersCompleted: [Total]
        TriggersSucceeded: [Total]
        TriggersFailed: [Total]
        TriggersFired: [Total]
        TriggersSkipped: [Total]
        TriggerLatency: [Average, Maximum]
        TotalBillableExecutions: [Total]
        BillableActionExecutions: [Total]
        BillableTriggerExecutions: [Total]

processors:
  resource/logicapps:
    attributes:
      - {key: cloud.provider, value: azure, action: insert}
      - {key: cloud.platform, value: azure_logic_apps, action: insert}
      - {key: cloud.account.id, value: "${env:AZURE_SUBSCRIPTION_ID}", action: insert}
      - {key: cloud.region, value: "${env:LOGICAPPS_REGION}", action: insert}
      - {key: cloud.resource_id, value: "${env:LOGICAPPS_RESOURCE_ID}", action: insert}
      - {key: deployment.environment.name, value: "${env:ENVIRONMENT}", action: insert}
      - {key: service.name, value: "${env:LOGICAPPS_SERVICE_NAME}", action: insert}

service:
  pipelines:
    metrics/logicapps:
      receivers: [azure_monitor/logicapps]
      processors: [memory_limiter, resource/logicapps, batch]
      exporters: [otlp_http/b14]

use_batch_api: false issues one query per metric per scrape. A single Consumption workflow is a small query budget; switch to true only if you scale this receiver across many workflows in one subscription.
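
After merging the fragment into the shared config, a parse check catches indentation slips before you restart anything. A sketch, assuming a recent otelcol-contrib binary, which ships a validate subcommand:

validate the merged config
# exits non-zero on unknown keys or malformed YAML, without starting any pipelines
otelcol-contrib validate --config=otel-collector.yaml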

Environment variables

.env
AZURE_SUBSCRIPTION_ID=...
LOGICAPPS_RESOURCE_GROUP=... # RG containing the workflow
LOGICAPPS_REGION=... # for cloud.region; typically the workflow's RG region
LOGICAPPS_RESOURCE_ID=... # full ARM ID of the workflow
LOGICAPPS_SERVICE_NAME=logic-apps-monitor
ENVIRONMENT=production

Service Principal credentials (AZURE_TENANT_ID, AZURE_CLIENT_ID, AZURE_CLIENT_SECRET) and Scout exporter credentials come from the shared base config and are not listed here. See Scout exporter wiring.

Operations

RBAC propagation lag

Monitoring Reader on the resource group typically propagates in under 30 seconds, occasionally up to 120 seconds. The first scrape after a fresh assignment may return 403 AuthorizationFailed; the receiver retries on the next 60s cycle and the noise clears within two polls.

Latency and billing metrics are traffic-driven

RunLatency, ActionLatency, TriggerLatency, and the three billing metrics publish data points only for minutes that contain completed runs. The count metrics publish a continuous 0. Do not alert on the absence of latency data points - alert on the values when present. When smoke-testing, drive a run and read the metric inside the same minute window; a read several minutes after the last run shows only the continuous count series.
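
To cross-check what the receiver sees against the source during a smoke test, read the same series straight from Azure Monitor at minute grain. A sketch with placeholder values; PT1M matches the minute grain discussed above:

read a count metric at minute grain
az monitor metrics list \
  --resource "<workflow-resource-id>" \
  --metric RunsCompleted \
  --interval PT1M \
  --aggregation Total \
  --output table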

Metrics aggregate per workflow only

Microsoft.Logic/workflows metrics carry no metadata dimensions. There is no per-action or per-trigger breakdown in the metric stream - ActionsFailed is a workflow-wide counter. Per-action attribution is the job of the WorkflowRuntime logs path (§Logs). This also means the metric cardinality is one series per metric per workflow, so the receiver's query and storage footprint scales linearly with workflow count and nothing else.

Consumption concurrency throttling

An HTTP Request trigger enforces a per-workflow concurrency limit. Aggressive callers receive HTTP 429 with a Retry-After header rather than queuing unbounded. This surfaces as TriggersFailed / TriggersSkipped in the metric stream; if you see them climb under load, the caller is exceeding the workflow's concurrency budget, not the pipeline failing.

WorkflowRuntime logs warm-up

On a freshly started collector the azure_event_hub receiver spends roughly 15-20 minutes establishing its Event Hubs consumer before the first record is delivered downstream, even when records are already in the hub. The Diagnostic Settings → Event Hubs leg is fast for Logic Apps Consumption (records reach the hub within minutes of a run). Budget 20 minutes before treating an empty logs pipeline as broken, and do not restart the collector during that window - a restart resets the warm-up.

Key alerts to configure

Once metrics are flowing, set up alerts on these thresholds. The "Why" column gives the reasoning so you can adjust for your traffic shape.

| Signal | Warning | Critical | Why |
|---|---|---|---|
| RunsFailed / RunsCompleted ratio (5 min) | > 1% | > 5% | Run failure rate; the primary Logic Apps SLI. Sustained failure usually means a downstream dependency or a bad workflow revision. |
| ActionsFailed rate (5 min) | > 1% of ActionsCompleted | > 5% of ActionsCompleted | A single action failing repeatedly. Pair with the logs path to identify which action. |
| RunLatency Average (5 min) | > 1.5× rolling 24h mean | > 3× rolling 24h mean | Run-duration regression. Relative thresholds track normal workflow behaviour better than absolute seconds. |
| TriggersFailed (5 min) | > 0 sustained | > 1% of TriggersFired | Rejected or throttled trigger calls; co-fires with caller HTTP 429. |
| TotalBillableExecutions (1 h) | > 1.5× rolling 7d hourly mean | > 3× rolling 7d hourly mean | Cost-runaway detector. A workflow change that adds actions or loops shows here first. |
| RunsStarted − RunsCompleted backlog (10 min) | > 10 sustained | > 50 sustained | In-flight runs not completing; indicates a stuck action or a downstream timeout. |

Configure the Scout-side alert rules through your dashboarding / alerting stack once thresholds are decided; the receiver pipeline above emits the underlying signals continuously.

Logs

The WorkflowRuntime Diagnostic Settings category fills the gaps the metric whitelist structurally cannot cover. The §Logs path uses the azure_event_hub receiver against a Diagnostic Settings → Event Hubs sink.

What logs uniquely fill

Platform metrics aggregate per workflow. Logs disaggregate. The gaps logs uniquely cover for Logic Apps:

  • Which action failed. ActionsFailed is a workflow-wide counter. The log stream names the action, its status, its error code, and the run it belonged to.
  • Per-trigger attribution. Which caller / source fired which run, with the client tracking id, in what order across a burst.
  • Per-run forensic ordering. The sequence of trigger fired → actions started → actions ended → run completed for a specific run, which metrics flatten into per-minute counts.
  • Trigger and action payloads (opt-in). With IncludeContent enabled, the verbatim inputs and outputs of the trigger and each action - the only way to see what data a specific run actually processed.

Architecture

Logic Apps workflow (Consumption)
   │  Diagnostic Setting (resource scope)
   │  category: WorkflowRuntime
   ▼
Event Hubs namespace (Basic 1 TU)
   │  • diagsend SAS rule (Send) writes records
   │  • collectorlisten SAS rule (Listen) reads records
   ▼
azure_event_hub receiver
   │  • format: azure
   │  • apply_semantic_conventions: true
   │  • multi-record envelopes decoded into individual log records
   ▼
otlp_http/b14 → Scout

The Diagnostic Setting targets the workflow resource directly.

Log category to enable

| Category | What it covers |
|---|---|
| WorkflowRuntime | Workflow run events: trigger fired (source, client tracking id), action started / succeeded / failed (per-action name, status, error code), run completed (final status, duration). With IncludeContent: true, also the verbatim trigger and action inputs / outputs. The only Logic Apps log category; AllMetrics is the metrics surface, covered by the metrics path above. |

IncludeContent is an opt-in knob, not a default. Leave it off unless you need per-run inputs and outputs. When enabled, trigger and action payloads are captured verbatim - review against your PII policy before turning it on, and budget for Event Hubs Standard rather than Basic if your workflows move large JSON payloads (records approach the 256 KB per-event Basic limit and can throttle 1 TU on bursty traffic).
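
For scale reference, the Basic namespace and hub from the path table can be stood up in two commands. A sketch with placeholder names; the partition count is illustrative:

create the namespace and hub
az eventhubs namespace create \
  --resource-group <resource-group> \
  --name <namespace> \
  --location <region> \
  --sku Basic \
  --capacity 1

az eventhubs eventhub create \
  --resource-group <resource-group> \
  --namespace-name <namespace> \
  --name <hub-name> \
  --partition-count 2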

Receiver configuration (logs)

otel-collector.yaml (excerpt)
receivers:
  azure_event_hub/logicappslogs:
    connection: ${env:LOGICAPPSLOGS_CONNECTION_STRING}
    partition: ""   # consume all partitions
    offset: ""      # resume from last checkpoint
    format: azure   # decode the Azure resource-log envelope
    apply_semantic_conventions: true

processors:
  resource/logicappslogs:
    attributes:
      - {key: cloud.provider, value: azure, action: insert}
      - {key: cloud.platform, value: azure_logic_apps, action: insert}
      - {key: cloud.account.id, value: "${env:AZURE_SUBSCRIPTION_ID}", action: insert}
      - {key: cloud.region, value: "${env:LOGICAPPSLOGS_SOURCE_REGION}", action: insert}
      - {key: cloud.resource_id, value: "${env:LOGICAPPSLOGS_SOURCE_RESOURCE_ID}", action: insert}
      - {key: deployment.environment.name, value: "${env:LOGICAPPSLOGS_ENVIRONMENT}", action: insert}
      - {key: service.name, value: "${env:LOGICAPPSLOGS_SERVICE_NAME}", action: insert}

service:
  pipelines:
    logs/logicappslogs:
      receivers: [azure_event_hub/logicappslogs]
      processors: [memory_limiter, resource/logicappslogs, batch]
      exporters: [otlp_http/b14]

The receiver decodes multi-record Azure envelopes into individual OTel log records, so the downstream record count is higher than the raw envelope count the receiver reports - that fan-out is expected, not duplication.

Environment variables (logs)

.env (logs path)
LOGICAPPSLOGS_CONNECTION_STRING=... # Listen SAS with ;EntityPath=<hub>
LOGICAPPSLOGS_SOURCE_REGION=... # for cloud.region on log records
LOGICAPPSLOGS_SOURCE_RESOURCE_ID=... # workflow ARM ID for cloud.resource_id
LOGICAPPSLOGS_SERVICE_NAME=logic-apps-logs
LOGICAPPSLOGS_ENVIRONMENT=production

Quote the connection string in your env file. It carries ;-separated segments - Endpoint=..., SharedAccessKeyName=..., SharedAccessKey=..., EntityPath=.... Sourced unquoted into a shell, the ; truncates the value at the first segment. Most env-file loaders handle this, but a shell-sourced .env needs single quotes.
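
What that looks like in practice - a sketch with placeholder key material:

.env (quoted connection string)
# single quotes keep the ;-separated segments intact when the file is shell-sourced
LOGICAPPSLOGS_CONNECTION_STRING='Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=collectorlisten;SharedAccessKey=<key>;EntityPath=<hub-name>'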

Wiring the Diagnostic Setting

attach the Diagnostic Setting
az monitor diagnostic-settings create \
  --resource "<workflow-resource-id>" \
  --name logicapps-runtime \
  --event-hub <hub-name> \
  --event-hub-rule "<diagsend-rule-id>" \
  --logs '[{"category":"WorkflowRuntime","enabled":true}]'

The --event-hub-rule value is the resource ID of the namespace-scoped SAS rule with Send permission. The receiver uses a separate Listen rule; one Send rule and one Listen rule on the namespace is the canonical two-rule topology. To capture payloads, add "includeContent":true to the log setting (review the PII trade-off first).
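
A sketch of the two-rule topology on an existing namespace, with placeholder names matching the architecture diagram:

create the Send and Listen rules
az eventhubs namespace authorization-rule create \
  --resource-group <resource-group> \
  --namespace-name <namespace> \
  --name diagsend --rights Send

az eventhubs namespace authorization-rule create \
  --resource-group <resource-group> \
  --namespace-name <namespace> \
  --name collectorlisten --rights Listen

# the --event-hub-rule value for the Diagnostic Setting
az eventhubs namespace authorization-rule show \
  --resource-group <resource-group> \
  --namespace-name <namespace> \
  --name diagsend --query id --output tsv

# the receiver's connection string; append ;EntityPath=<hub-name> yourself -
# namespace-scoped rules do not carry it
az eventhubs namespace authorization-rule keys list \
  --resource-group <resource-group> \
  --namespace-name <namespace> \
  --name collectorlisten --query primaryConnectionString --output tsv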

Verifying the logs path

After the Diagnostic Setting is attached and the workflow has completed at least one run:

  1. Wait up to 20 minutes. The Diagnostic Settings → Event Hubs leg is fast (records reach the hub within minutes), but the azure_event_hub receiver takes 15-20 minutes to establish its consumer on a freshly started collector. Do not restart the collector during this window - a restart resets the warm-up.
  2. Tail the collector debug exporter: docker compose logs -f otel-collector | grep "otelcol.signal.*logs".
  3. Expect batches of log records every 60-90 seconds while runs are flowing; the receiver-reported envelope count is lower than the downstream record count because envelopes carry multiple records.
  4. In Scout, filter service.name = 'logic-apps-logs' and cloud.platform = 'azure_logic_apps' to confirm records land.

Troubleshooting

AuthorizationFailed on the first scrape

Cause: The Monitoring Reader assignment on the resource group has not propagated yet. Fix: Wait two polling cycles (~2 minutes). The receiver retries automatically; the error self-clears.

Latency or billing metrics show no data

Cause: These metrics are traffic-driven - Azure publishes data points only for minutes containing completed runs. Fix: Drive a run and read the metric inside the same minute. An idle workflow shows only the continuous count series; that is correct behaviour, not a gap.

Logs pipeline empty for ~20 minutes after wiring

Cause: The azure_event_hub receiver's initial consumer establishment takes 15-20 minutes on a freshly started collector. Fix: Wait it out. Confirm records are arriving in the hub (az monitor metrics list --metric IncomingMessages on the Event Hubs namespace shows non-zero) to prove the Diagnostic Setting is delivering; the gap is downstream warm-up, not the Diagnostic Setting. Do not restart the collector to "kick" it - that resets the warm-up clock.
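
Spelled out, the hub-side delivery check from the fix above - placeholders as before:

confirm records are reaching the hub
az monitor metrics list \
  --resource "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.EventHub/namespaces/<namespace>" \
  --metric IncomingMessages \
  --aggregation Total \
  --output table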

Caller receives HTTP 429 from the trigger

Cause: The workflow's per-instance concurrency limit. Fix: Space the calls out or raise the workflow's concurrency control. This is Logic Apps protecting the workflow, not a telemetry fault; it shows as TriggersFailed / TriggersSkipped in the metric stream.
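
For reference, the knob the fix points at lives on the trigger in the workflow definition. A minimal excerpt; the trigger name and the runs value are illustrative, not recommendations:

workflow definition (excerpt)
"triggers": {
  "manual": {
    "type": "Request",
    "kind": "Http",
    "runtimeConfiguration": {
      "concurrency": {
        "runs": 25
      }
    }
  }
}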

azure_event_hub receiver logs MessagingGatewayBadRequest

Cause: The receiver is requesting a user-defined consumer group that does not exist on Event Hubs Basic. Fix: Basic rejects user-defined consumer groups - the receiver must consume from $Default, the implicit group. Remove any consumer_group: key, or move the namespace to Standard if you need multiple consumers.

Metric set is empty although the workflow exists

Cause: The workflow is Standard hosting (Microsoft.Web/sites kind=workflowapp), not Consumption. Fix: This guide's receiver scopes Microsoft.Logic/workflows. For Standard, follow Azure App Service.

Frequently Asked Questions

How do I monitor Azure Logic Apps with OpenTelemetry?

Logic Apps Consumption has two telemetry paths. Platform metrics use the azure_monitor receiver against Microsoft.Logic/workflows for run, action, and trigger lifecycle counts, latency, and billable executions. Resource logs use the azure_event_hub receiver consuming the WorkflowRuntime Diagnostic Settings category for per-action drill-down. There is no in-system scrape path: the Consumption runtime is fully Microsoft-managed with no data-plane endpoint.

Why are RunLatency and billable-execution metrics empty on my Logic App?

Logic Apps latency and billable-execution metrics are traffic-driven: Azure publishes data points only for minutes that contain completed runs. The run, action, and trigger count metrics publish an explicit zero every minute even at idle, but latency and billing series are sparse by design on a low-traffic workflow. An empty minute means no runs that minute, not a broken pipeline.

Which Logic Apps cost metric should I alert on?

TotalBillableExecutions is the canonical Consumption cost SLI - Consumption bills per billable execution. BillableActionExecutions and BillableTriggerExecutions split that total between action and trigger executions so you can attribute cost growth to a specific workflow change.

Why does the WorkflowRuntime logs pipeline take ~20 minutes?

On a freshly started collector the azure_event_hub receiver spends roughly 15 to 20 minutes establishing its Event Hubs consumer before it delivers the first record downstream, even when records are already in the hub. The Diagnostic Settings to Event Hubs leg itself is fast for Logic Apps Consumption - records reach the hub within minutes of a run. Budget 20 minutes before treating an empty logs pipeline as broken, and do not restart the collector during that window because a restart resets the warm-up.

Can surface metrics tell me which action in my workflow failed?

No. Microsoft.Logic/workflows metrics carry no metadata dimensions - they aggregate per workflow only. ActionsFailed tells you an action failed, not which one. The WorkflowRuntime logs path gives per-action name, status, and error code, so debugging a failed run needs the logs path.

Does this guide cover Logic Apps Standard?

No. Standard hosting runs on Microsoft.Web/sites with kind=workflowapp and emits App Service-shaped metrics plus App Service Plan signals. Follow Azure App Service for Standard. This guide is Consumption only (Microsoft.Logic/workflows).

Shared collector + Scout wiring

Adjacent Azure surfaces

  • Azure App Service - where Logic Apps Standard runs; the path for workflow apps not on Consumption.
  • Azure Event Hubs - the streaming layer this guide's logs path runs through; monitor the hub itself when log volume grows.
  • Azure Service Bus - the messaging backbone many workflows trigger from or publish to.
  • Azure API Management - the gateway that commonly fronts HTTP-triggered workflows.
  • Azure Key Vault - the secrets store workflows reference for connection credentials.

Migrating from Application Insights

  • Application Insights migration - moving app-side telemetry off Application Insights while keeping Azure Monitor as the metric source for this guide.