Rate limiting

Rate limiting occurs when the Elastic Cloud Managed OTLP Endpoint receives data faster than it can be processed and indexed into Elasticsearch. The endpoint then responds with HTTP 429 errors until the data volume drops or enough capacity is available.

How rate limiting works

Rate limiting behavior differs by deployment type:

Elastic Cloud Hosted: Rate limits depend on your Elasticsearch cluster capacity. If your cluster can't keep up with incoming data, the endpoint starts rejecting requests with 429 errors.
Elastic Cloud Serverless: Elastic manages scaling automatically. Rate limiting is rare and typically indicates a temporary measure to protect the service.

Identifying rate limiting

When rate limiting occurs, the Elastic Cloud Managed OTLP Endpoint responds with an HTTP 429 Too Many Requests status code. A log message similar to this appears in the OpenTelemetry Collector's output:

		"error": "rpc error: code = ResourceExhausted desc = request exceeded available capacity"

For troubleshooting steps, refer to Error: too many requests.
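
As a rough illustration of how the log message shown earlier could be detected automatically, the following Python sketch counts log lines that look like rate-limit rejections. The pattern is based only on the sample message and the 429 status text in this section. Real Collector output may be formatted differently, and the function name and the second and third sample lines are hypothetical.

```python
import re
from typing import Iterable

# Assumption: rejections contain either the gRPC status text from the sample
# log line or the HTTP status text "429 Too Many Requests".
RATE_LIMIT_PATTERN = re.compile(r"code = ResourceExhausted|429 Too Many Requests")


def count_rate_limited(lines: Iterable[str]) -> int:
    """Count log lines that look like rate-limit rejections."""
    return sum(1 for line in lines if RATE_LIMIT_PATTERN.search(line))


# Sample input: the first line is the message from this page; the other two
# are made-up lines for illustration only.
sample = [
    '"error": "rpc error: code = ResourceExhausted desc = request exceeded available capacity"',
    '"msg": "export succeeded"',
    '"error": "HTTP 429 Too Many Requests"',
]

print(count_rate_limited(sample))
```

A count that keeps growing over time suggests sustained rate limiting rather than a short spike.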

Resolving rate limiting

Elastic Cloud Hosted deployments

For Elastic Cloud Hosted deployments, 429 errors typically indicate that your Elasticsearch cluster is undersized for the current data volume. If AutoOps is available in your region, use it to check CPU utilization, index queue depth, and node load to confirm whether your cluster is under-resourced. If AutoOps is not available in your region, contact Elastic Support.

If the metrics confirm that the cluster needs more capacity, scale your deployment.

Once your Elasticsearch cluster has been scaled up, or can otherwise keep up with the incoming data volume, the Elastic Cloud Managed OTLP Endpoint accepts requests again.
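
Because the endpoint resumes accepting requests once capacity is available, a sender that retries rejected requests with increasing delays can recover without manual intervention. The following Python sketch shows one way a custom sender might do this, using exponential backoff with jitter. It is a minimal illustration, not a description of how the endpoint or the OpenTelemetry Collector behaves: the function names, the delay values, and the `send` callable (assumed to return an HTTP status code) are all hypothetical.

```python
import random
import time
from typing import Callable


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Upper bound, in seconds, of the wait before retry number `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def send_with_retry(
    send: Callable[[], int],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `send` until it returns a status other than 429 or attempts run out."""
    status = send()
    for attempt in range(max_attempts - 1):
        if status != 429:
            break
        # Full jitter: wait a random time up to the exponential bound so that
        # many senders do not all retry at the same moment.
        sleep(random.uniform(0, backoff_delay(attempt)))
        status = send()
    return status


# Hypothetical sender: rejected twice, then accepted. Sleeping is disabled
# here so the example finishes immediately.
responses = iter([429, 429, 200])
print(send_with_retry(lambda: next(responses), sleep=lambda seconds: None))
```

Retrying only helps with temporary spikes. If requests are still rejected after the final attempt, the cluster most likely needs more capacity, as described above.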

Elastic Cloud Serverless deployments

For Elastic Cloud Serverless projects, Elastic manages backend scaling automatically. If you experience persistent 429 errors, contact Elastic Support.