APM Server version 7.0 introduces breaking changes that affect older versions of APM agents. Check the agent/server compatibility matrix for details.
If your Elasticsearch cluster is not ingesting the amount of data you expect, you can tweak a few APM Server settings:
- Adjust `output.elasticsearch.workers`. See "Tune for indexing speed" for an overview.
- Ensure that `output.elasticsearch.bulk_max_size` is set to a high value, for example 5120. The default of 50 is very conservative.
- Ensure that `queue.mem.events` is set to a reasonable value compared to your other settings. A good rule of thumb is that `queue.mem.events` should equal `output.elasticsearch.workers` multiplied by `output.elasticsearch.bulk_max_size`.
The output configuration section shows more details.
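As a rough illustration of how these three settings relate, the Python sketch below derives a queue size from the worker count and the bulk size. It assumes the rule of thumb that `queue.mem.events` equals the number of workers multiplied by `bulk_max_size`. The function name and the example values are hypothetical and are not tuning recommendations.

```python
def suggest_queue_events(workers: int, bulk_max_size: int) -> int:
    """Return a queue size large enough to fill one bulk request per worker."""
    if workers < 1 or bulk_max_size < 1:
        raise ValueError("workers and bulk_max_size must be positive")
    return workers * bulk_max_size


# Example values only: 2 workers, each sending bulks of up to 5120 events.
# Setting names are spelled as they appear in the text above.
settings = {
    "output.elasticsearch.workers": 2,
    "output.elasticsearch.bulk_max_size": 5120,
}
settings["queue.mem.events"] = suggest_queue_events(
    settings["output.elasticsearch.workers"],
    settings["output.elasticsearch.bulk_max_size"],
)

# Print the values in "key: value" form for easy comparison with a config file.
for key, value in settings.items():
    print(f"{key}: {value}")
```

Treat the result as a starting point and validate it against the memory available to APM Server before applying it.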
APM Server uses an internal queue to buffer incoming events.
A larger queue can retain more data if Elasticsearch is unavailable for longer periods,
and it alleviates problems that might result from sudden spikes of traffic.
You can adjust the queue size by overriding `queue.mem.events`.
Keep in mind that increasing `queue.mem.events` can significantly affect APM Server memory usage.
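To get a feel for the memory impact, a back-of-the-envelope estimate can help. The sketch below assumes that every queue slot holds one event of a hypothetical average size (`avg_event_bytes`, an illustrative parameter that you would need to measure for your own data) and ignores any per-event overhead inside APM Server, so the real footprint may differ.

```python
def estimate_queue_memory_mib(queue_events: int, avg_event_bytes: int) -> float:
    """Rough estimate of a full queue's size in MiB (payload only, no overhead)."""
    return queue_events * avg_event_bytes / (1024 * 1024)


# Compare a few candidate queue sizes, assuming 2 KiB per event (an assumption).
for queue_events in (4096, 10240, 40960):
    mib = estimate_queue_memory_mib(queue_events, avg_event_bytes=2048)
    print(f"queue.mem.events={queue_events}: about {mib:.1f} MiB when full")
```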
If the APM Server cannot process data quickly enough, you will see request timeouts.
One way to solve this problem is to increase processing power. This can be done by either migrating your APM Server to a more powerful machine or adding more APM Server instances. Having several instances will also increase availability.
Large payloads may result in request timeouts. You can reduce the payload size by decreasing the flush interval in the agents. This will cause agents to send smaller and more frequent requests.
Read more in the agents documentation.
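The trade-off between payload size and request frequency can be sketched with simple arithmetic. The example below assumes an agent that produces events at a steady rate and flushes on a fixed interval. Both function names and the numbers are hypothetical, and real agents may also flush on other triggers.

```python
def events_per_request(events_per_second: float, flush_interval_s: float) -> float:
    """Approximate number of events an agent accumulates between two flushes."""
    return events_per_second * flush_interval_s


def requests_per_minute(flush_interval_s: float) -> float:
    """Approximate number of requests an agent sends per minute."""
    return 60.0 / flush_interval_s


# Same event rate, two flush intervals: a shorter interval means smaller payloads
# but more requests.
for interval in (10.0, 1.0):
    print(
        f"flush every {interval:g}s:",
        f"{events_per_request(200, interval):g} events per request,",
        f"{requests_per_minute(interval):g} requests per minute",
    )
```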
Agents make use of long-running requests and flush as many events over a single request as possible. Thus, the rate limiter for RUM is bound to the number of events sent per second, per IP.
If the rate limit is hit while events on an established request are being sent, the request is not immediately terminated. Instead, the intake of events is throttled to `event_rate.limit`, which means that events are queued and processed more slowly. Only when the allowed buffer queue is also full is the request terminated with a `429 - rate limit exceeded` HTTP response. If an agent tries to establish a new request while the rate limit is already hit, a `429` is sent immediately.
Increasing the `event_rate.limit` default value will help avoid `rate limit exceeded` errors.
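The behavior described above can be modeled with a simplified token bucket per client IP. This is a minimal sketch for building intuition and is not APM Server's actual implementation: the class name, the `burst` buffer size, and the `max_wait` cutoff are all assumptions made for illustration. Timestamps are passed in explicitly so the example is deterministic.

```python
class IpRateLimiter:
    """Simplified per-IP event limiter (illustrative model only)."""

    def __init__(self, limit: float, burst: float):
        self.limit = limit    # events per second, the role played by event_rate.limit
        self.burst = burst    # hypothetical buffer capacity, in events
        self.tokens = burst   # available budget; a negative value means a backlog
        self.last = 0.0       # timestamp of the last refill, in seconds

    def _refill(self, now: float) -> None:
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.limit)
        self.last = max(self.last, now)

    def open_request(self, now: float) -> bool:
        """False models an immediate 429 for a new request."""
        self._refill(now)
        return self.tokens >= 1

    def send_events(self, now: float, count: int, max_wait: float):
        """Return the throttling delay in seconds, or None if the request is cut off (429)."""
        self._refill(now)
        deficit = count - self.tokens
        if deficit <= 0:
            self.tokens -= count
            return 0.0                 # under the limit: no delay
        wait = deficit / self.limit
        if wait > max_wait:
            return None                # buffer exhausted: terminate the request
        self.tokens -= count           # accept the events, but intake is slowed down
        return wait


limiter = IpRateLimiter(limit=10, burst=30)
print(limiter.open_request(now=0.0))                      # new request while budget remains
print(limiter.send_events(now=0.0, count=30, max_wait=1.0))  # uses up the whole budget
print(limiter.send_events(now=0.0, count=5, max_wait=1.0))   # throttled, not rejected
print(limiter.send_events(now=0.0, count=20, max_wait=1.0))  # backlog too large
```

In this model, raising `limit` shortens every computed delay, which mirrors why a higher `event_rate.limit` makes `rate limit exceeded` responses less likely.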