Token usage in Elastic Agent Builder

When working with Elastic Agent Builder, total token usage typically exceeds the visible conversation text. Because Elastic Agent Builder uses an agentic framework, a single user request often triggers multiple model calls to process reasoning steps, run tools, and interpret results.

Token counts include:

Input tokens: These are the tokens sent to the model, and they accumulate throughout the session. They include the user's current query, the conversation history from previous rounds, system prompts, and the results returned from any tools used during execution.
Output tokens: These are the tokens generated by the model. They include the final response visible to the user, as well as all internal reasoning steps, tool calls, and intermediate results the model generates.
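As a rough illustration of why a round's total exceeds the visible text, the sketch below sums tokens across the several model calls that one user request can trigger. The `ModelCall` structure, the call labels, and all numbers are hypothetical, and the sketch assumes a round's total is simply the sum over its model calls; it does not reflect how Elastic Agent Builder reports usage internally.

```python
from dataclasses import dataclass


@dataclass
class ModelCall:
    """One model call within a round (hypothetical structure, for illustration only)."""
    label: str
    input_tokens: int
    output_tokens: int


def round_usage(calls):
    """Sum input and output tokens over every model call in a round."""
    total_in = sum(c.input_tokens for c in calls)
    total_out = sum(c.output_tokens for c in calls)
    return total_in, total_out


# Made-up numbers: a single user request that triggers three model calls.
calls = [
    ModelCall("reason and choose a tool", input_tokens=1200, output_tokens=80),
    ModelCall("interpret the tool result", input_tokens=2100, output_tokens=60),
    ModelCall("write the final response", input_tokens=2300, output_tokens=250),
]

total_in, total_out = round_usage(calls)
print(total_in, total_out)  # 5600 390
```

In this made-up round, only the 250 output tokens of the last call correspond to text the user sees; the rest comes from intermediate steps.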

Note

Each conversation round includes all previous rounds as context. This means token usage at each step depends on the entire conversation size, not only the current message.
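The growth described in this note can be sketched with a simplified model. The function name and the token counts below are assumptions chosen for illustration, and the model ignores tool results and reasoning steps, which would add further tokens in practice.

```python
def input_tokens_per_round(system_tokens, rounds):
    """Estimate the input tokens sent in each round of a conversation.

    rounds is a list of (user_message_tokens, response_tokens) pairs.
    Simplifying assumption: each round sends the system prompt, every
    previous user message and response, and the current user message.
    """
    history = 0
    per_round = []
    for user_tokens, response_tokens in rounds:
        per_round.append(system_tokens + history + user_tokens)
        # The finished round becomes context for the next one.
        history += user_tokens + response_tokens
    return per_round


# Three rounds with identical message sizes (made-up numbers).
usage = input_tokens_per_round(500, [(50, 200), (50, 200), (50, 200)])
print(usage)  # [550, 800, 1050]
```

Although each user message is the same size, the input token count rises every round because the earlier rounds are sent again as context.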

For more information on billing and token costs, refer to Elastic pricing.

How to view token usage

At the end of each round, the total token usage is displayed after the agent response, with input tokens and output tokens shown as separate counts:

Screenshot of the token usage display, showing input and output token counts

To view the raw JSON response, which includes detailed token information, click the View JSON button. This opens a modal with the complete raw response data:

Screenshot of the JSON raw response modal