Resubmit failed events

When EDOT Cloud Forwarder for Azure cannot forward a batch of events to Elasticsearch, it saves the batch as a blob in the error store — an Azure Blob Storage container in the same storage account as the deployment. You can use the ecf-replay-cli tool to resubmit individual blobs back to Event Hub so they are picked up by EDOT Cloud Forwarder and forwarded to Elasticsearch again.

When a function invocation fails to deliver events to Elasticsearch, EDOT Cloud Forwarder writes the raw event batch to one of the following containers in the storage account:

  • ecf-errors-logs — failed log batches
  • ecf-errors-metrics — failed metric batches

Each blob is named using a time-partitioned path (YYYY/MM/DD/HH/MM/<uuid>.json, that is, year, month, day, hour, and minute, followed by a UUID) and carries the following metadata and index tags:

| Key | Description |
| --- | --- |
| ecf_failed_at | ISO 8601 UTC timestamp when the failure was recorded, for example 2026-05-29T12:00:00Z |
| ecf_error_class | permanent or transient; see Error classes |
| ecf_function | Azure Function name that processed the batch (for example, logs) |
| ecf_signal | Signal type: logs, metrics, or traces. May be absent for pre-signal dispatcher errors |
| ecf_status | Processing status: new (set on write), retried (set after successful resubmission). Index tag only. |
| ecf_error_message | Error message (metadata only; not a tag) |
| ecf_event_hub_namespace | Fully qualified Event Hub namespace host |
| ecf_event_hub_name | Event Hub name to resubmit to |

ecf-replay-cli reads the ecf_function, ecf_event_hub_namespace, and ecf_event_hub_name tags to resolve the resubmission destination automatically, so you don't need to pass the --function, --event-hub-namespace, and --event-hub-name flags manually. If you do pass a flag, it takes precedence over the corresponding tag.

| Class | Meaning | Action |
| --- | --- | --- |
| transient | A temporary condition such as a network hiccup or a brief Elasticsearch unavailability. | Usually safe to resubmit. Review ecf_error_message to confirm the root cause is resolved before resubmitting. |
| permanent | A non-retriable failure such as an authentication error, a malformed payload, or a persistent endpoint misconfiguration. | Investigate ecf_error_message and resolve the underlying cause before resubmitting. |

Every blob starts with ecf_status = "new". After a successful resubmission, ecf-replay-cli updates the ecf_status index tag to "retried". This lets you filter for blobs that still need to be resubmitted:

az storage blob filter \
  --account-name <storage-account-name> \
  --container-name ecf-errors-logs \
  --tag-filter "ecf_status = 'new'" \
  --auth-mode login \
  --query "blobs[].name" -o tsv
		
Note

Blobs written before ecf_status was introduced have no such tag. They are resubmitted normally and the tag is set to "retried" on success.

If the tag update fails (for example, due to insufficient permissions), the tool prints a warning to stderr and exits with code 0 — the events were already published to Event Hub.

  • Azure CLI: you must be logged in (az login) with an identity that has the required Azure roles (see below).
  • The ecf-replay-cli binary downloaded from the release assets.

| Role | Scope | Required | Notes |
| --- | --- | --- | --- |
| Storage Blob Data Reader | Error container | Yes | Read blob content and tags. |
| Azure Event Hubs Data Sender | Target Event Hub | Yes | Publish events back to Event Hub. |
| Storage Blob Index Contributor | Error container | No | Update the ecf_status tag after resubmission. If absent, the tag update is skipped with a warning. |
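
As a rough sketch of how the two required roles might be granted with the Azure CLI, the helpers below build the resource scopes and call az role assignment create. The function names (container_scope, event_hub_scope, grant_replay_roles) are hypothetical and not part of the tool. The scope strings follow the usual Azure resource ID layout, the Event Hub scope takes the short namespace name rather than the fully qualified host, and the role names are copied from the table above, so verify them against your tenant before relying on this.

```shell
# Hypothetical helpers; none of these names come from ecf-replay-cli.

# Build the resource ID of an error container.
# $1: subscription ID, $2: resource group, $3: storage account, $4: container
container_scope() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s/blobServices/default/containers/%s\n' \
    "$1" "$2" "$3" "$4"
}

# Build the resource ID of the target Event Hub.
# $1: subscription ID, $2: resource group, $3: namespace (short name), $4: Event Hub name
event_hub_scope() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.EventHub/namespaces/%s/eventhubs/%s\n' \
    "$1" "$2" "$3" "$4"
}

# Assign the two required roles to an identity.
# $1: assignee (user, group, or service principal), $2: container scope, $3: Event Hub scope
grant_replay_roles() {
  az role assignment create --assignee "$1" \
    --role "Storage Blob Data Reader" --scope "$2" &&
  az role assignment create --assignee "$1" \
    --role "Azure Event Hubs Data Sender" --scope "$3"
}

# Usage (placeholders must be replaced with real values):
#   grant_replay_roles "<assignee>" \
#     "$(container_scope "<subscription-id>" "<resource-group>" "<storage-account-name>" ecf-errors-logs)" \
#     "$(event_hub_scope "<subscription-id>" "<resource-group>" "<namespace>" logs)"
```

Scoping the assignments to the container and the Event Hub, rather than to the whole storage account or namespace, mirrors the Scope column of the table.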

ecf-replay-cli picks up your Azure credentials automatically from the environment — environment variables, workload identity, or an active Azure CLI session. Running az login is the simplest option for local use.

Use the Azure CLI to list blobs in the error container. You can filter by time prefix or by index tags.

List all unprocessed blobs in the logs error container:

az storage blob filter \
  --account-name <storage-account-name> \
  --container-name ecf-errors-logs \
  --tag-filter "ecf_status = 'new'" \
  --auth-mode login \
  --query "blobs[].name" -o tsv
		

List all blobs (including already resubmitted):

az storage blob list \
  --account-name <storage-account-name> \
  --container-name ecf-errors-logs \
  --auth-mode login \
  --query "[].name" -o tsv
		

Filter by date prefix:

az storage blob list \
  --account-name <storage-account-name> \
  --container-name ecf-errors-logs \
  --prefix "2026/05/29/" \
  --auth-mode login \
  --query "[].name" -o tsv
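
The prefix passed to --prefix can be derived from a timestamp instead of typed by hand. The sketch below assumes the blob path is built from a UTC timestamp in the same form as ecf_failed_at (for example 2026-05-29T12:00:00Z); the function name ecf_blob_prefix and its granularity argument are hypothetical.

```shell
# Hypothetical helper: turn an ISO 8601 UTC timestamp into a blob name prefix.
# $1: timestamp such as 2026-05-29T12:00:00Z
# $2: granularity: day, hour (default), or minute
ecf_blob_prefix() {
  ts=$1
  granularity=${2:-hour}

  date_part=${ts%%T*}      # 2026-05-29
  time_part=${ts#*T}       # 12:00:00Z

  year=${date_part%%-*}
  rest=${date_part#*-}     # 05-29
  month=${rest%%-*}
  day=${rest#*-}

  hour=${time_part%%:*}
  rest=${time_part#*:}     # 00:00Z
  minute=${rest%%:*}

  case $granularity in
    day)    printf '%s/%s/%s/\n' "$year" "$month" "$day" ;;
    hour)   printf '%s/%s/%s/%s/\n' "$year" "$month" "$day" "$hour" ;;
    minute) printf '%s/%s/%s/%s/%s/\n' "$year" "$month" "$day" "$hour" "$minute" ;;
    *)      echo "unknown granularity: $granularity" >&2; return 1 ;;
  esac
}

# Usage with the listing command above:
#   az storage blob list ... --prefix "$(ecf_blob_prefix 2026-05-29T12:00:00Z hour)" ...
```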
		

Filter by error class and function:

az storage blob filter \
  --account-name <storage-account-name> \
  --container-name ecf-errors-logs \
  --tag-filter "ecf_error_class = 'permanent' AND ecf_function = 'logs' AND ecf_status = 'new'" \
  --auth-mode login \
  --query "blobs[].name" -o tsv
		

To inspect the metadata and tags on a specific blob:

az storage blob show \
  --account-name <storage-account-name> \
  --container-name ecf-errors-logs \
  --name "2026/05/29/12/00/<uuid>.json" \
  --auth-mode login \
  --query "{metadata:properties.metadata, tags:tags}"
		

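To resubmit a blob, run ecf-replay-cli with the storage account URL, the container, and the blob path. If the blob has the ecf_function, ecf_event_hub_namespace, and ecf_event_hub_name tags or metadata, you can omit the --function, --event-hub-namespace, and --event-hub-name flags and the tool resolves them automatically: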

ecf-replay-cli \
  --storage-account-url "https://<storage-account-name>.blob.core.windows.net/" \
  --container ecf-errors-logs \
  --blob "2026/05/29/12/00/<uuid>.json"
		

If the blob is missing metadata or you want to override the destination, pass all flags explicitly:

ecf-replay-cli \
  --storage-account-url "https://<storage-account-name>.blob.core.windows.net/" \
  --container ecf-errors-logs \
  --blob "2026/05/29/12/00/<uuid>.json" \
  --function logs \
  --event-hub-namespace "<namespace>.servicebus.windows.net" \
  --event-hub-name logs
		

On success, the tool prints the number of resubmitted messages:

replayed 3 message(s) to <namespace>.servicebus.windows.net/logs
		

The events are published to Event Hub in their original order. EDOT Cloud Forwarder picks them up and forwards them to Elasticsearch. After publishing, the tool updates the ecf_status tag on the blob to "retried".
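
Because the tool resubmits one blob per invocation, resubmitting everything still tagged "new" means looping over the output of the listing command. The following is a minimal sketch that assumes the blobs carry the destination tags, so only the three required flags are passed. The function name replay_blobs and the REPLAY_BIN variable are hypothetical; REPLAY_BIN exists only so the loop can be dry-run with echo.

```shell
# Hypothetical helper: resubmit every blob name read from stdin (one per line).
# $1: storage account URL, $2: error container name
# REPLAY_BIN overrides the binary, for example REPLAY_BIN=echo for a dry run.
replay_blobs() {
  account_url=$1
  container=$2
  failed=0

  while IFS= read -r blob; do
    [ -n "$blob" ] || continue   # skip empty lines

    # stdin is redirected so the tool cannot consume the remaining blob names.
    if ! "${REPLAY_BIN:-ecf-replay-cli}" \
        --storage-account-url "$account_url" \
        --container "$container" \
        --blob "$blob" < /dev/null; then
      echo "resubmission failed: $blob" >&2
      failed=$((failed + 1))
    fi
  done

  [ "$failed" -eq 0 ]   # non-zero status if any blob failed
}

# Usage, combining the listing command with the loop:
#   az storage blob filter \
#     --account-name <storage-account-name> \
#     --container-name ecf-errors-logs \
#     --tag-filter "ecf_status = 'new'" \
#     --auth-mode login \
#     --query "blobs[].name" -o tsv |
#   replay_blobs "https://<storage-account-name>.blob.core.windows.net/" ecf-errors-logs
```

The loop continues past individual failures and reports them on stderr, so a single bad blob does not block the rest. For permanent errors, resolve the underlying cause before running it.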

All flags can also be set via environment variables. Flags take precedence over environment variables and blob metadata.

| Flag | Environment variable | Required | Description |
| --- | --- | --- | --- |
| --storage-account-url | ECF_REPLAY_STORAGE_ACCOUNT_URL | Yes | Blob storage account URL, for example https://<account>.blob.core.windows.net/ |
| --container | ECF_REPLAY_CONTAINER | Yes | Blob container name, for example ecf-errors-logs |
| --blob | ECF_REPLAY_BLOB | Yes | Blob path within the container |
| --function | ECF_REPLAY_FUNCTION | No* | Function name used to decode the blob payload. Falls back to the ecf_function blob tag or metadata. |
| --event-hub-namespace | ECF_REPLAY_EVENT_HUB_NAMESPACE | No* | Fully qualified Event Hub namespace host. Falls back to the ecf_event_hub_namespace blob tag or metadata. |
| --event-hub-name | ECF_REPLAY_EVENT_HUB_NAME | No* | Event Hub name. Falls back to the ecf_event_hub_name blob tag or metadata. |

*Required if not present in blob tags or metadata.
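
When several blobs from the same container are resubmitted, the shared settings can be exported once so that only the blob path changes per run. This is a sketch built from the environment variable names in the table above; the replay_one wrapper is a hypothetical convenience, not something the tool provides.

```shell
# Shared settings, read by every ecf-replay-cli invocation in this shell.
# Replace the placeholder with your storage account name.
export ECF_REPLAY_STORAGE_ACCOUNT_URL="https://<storage-account-name>.blob.core.windows.net/"
export ECF_REPLAY_CONTAINER="ecf-errors-logs"

# Hypothetical wrapper: only the blob path is passed as a flag.
# $1: blob path within the container
replay_one() {
  ecf-replay-cli --blob "$1"
}

# Usage:
#   replay_one "2026/05/29/12/00/<uuid>.json"
```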