---
title: Optimize model context
description: Context is the information you provide to the LLM, to optimize the relevance of your query results. Without additional context, an LLM generates results...
url: https://www.elastic.co/docs/solutions/elasticsearch-solution-project/playground-context
products:
  - Kibana
applies_to:
  - Elastic Cloud Serverless: Beta
  - Elastic Stack: Beta since 9.1, Preview in 9.0
---

# Optimize model context
Context is the information you provide to the LLM to improve the relevance of your query results. Without additional context, an LLM generates results based solely on its training data. In Playground, this additional context is the information contained in your Elasticsearch indices.
There are a few ways to optimize this context for better results. Some adjustments can be made directly in the Playground UI. Others require refining your indexing strategy, and potentially reindexing your data.
<note applies-to="Elastic Stack: Preview since 9.0">
  Only **one field** can be selected as context for the LLM.
</note>


## Edit context in UI

<applies-to>
  - Elastic Stack: Removed in 9.1
  - Elastic Stack: Preview in 9.0
</applies-to>

Use the **Playground context** section in the Playground UI to adjust the number of documents and fields sent to the LLM.
If you’re hitting context length limits, try the following:
- Limit the number of documents retrieved
- Pick a field with fewer tokens to reduce the context length
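
To get a feel for how these two settings interact, the following minimal Python sketch estimates how many documents fit into a token budget for a given field. It is an illustration only: `estimate_tokens` and `fit_to_budget` are hypothetical helpers, the field names are made up, and the "about four characters per token" rule is a rough assumption rather than a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate only: assumes about 4 characters per token."""
    return 0 if not text else max(1, len(text) // 4)


def fit_to_budget(docs: list, field: str, max_tokens: int) -> tuple:
    """Keep documents in ranked order until the estimated budget is used up."""
    kept, used = [], 0
    for doc in docs:
        cost = estimate_tokens(doc.get(field, ""))
        if used + cost > max_tokens:
            break  # the next document would exceed the budget
        kept.append(doc)
        used += cost
    return kept, used


if __name__ == "__main__":
    # Hypothetical documents with a short "title" field and a long "body" field.
    docs = [{"title": "t" * 40, "body": "b" * 400} for _ in range(3)]
    for field in ("body", "title"):
        kept, used = fit_to_budget(docs, field, max_tokens=250)
        print(f"{field}: {len(kept)} documents, ~{used} tokens")
```

With the same budget, the shorter field leaves room for more documents, which is the trade-off the two options above describe.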


## Other context optimizations

This section covers additional context optimizations that you can’t make directly in the UI.

### Chunking large documents

If you're working with large fields, you might need to adjust your indexing strategy. Consider breaking your documents into smaller chunks, such as sentences or paragraphs.
If you don’t yet have a chunking strategy, start by chunking your documents into passages.
Otherwise, consider updating your chunking strategy, for example, from sentence-based to paragraph-based chunking.
Refer to the following Python notebooks for examples of how to chunk your documents:
- [JSON documents](https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/ingestion-and-chunking/json-chunking-ingest.ipynb)
- [PDF document](https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/ingestion-and-chunking/pdf-chunking-ingest.ipynb)
- [Website content](https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/ingestion-and-chunking/website-chunking-ingest.ipynb)
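
As a starting point before working through the notebooks, paragraph-based chunking can be sketched in a few lines of Python. This is a minimal illustration, not the approach used in the notebooks: `chunk_paragraphs` is a hypothetical helper, and the `max_chars` limit is an arbitrary assumption that you would tune for your data.

```python
import re


def chunk_paragraphs(text: str, max_chars: int = 1000) -> list[str]:
    """Split on blank lines, then pack paragraphs into passages of up to max_chars."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    passages, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if not current or len(candidate) <= max_chars:
            # Either the passage is empty (an oversized paragraph is kept whole)
            # or the paragraph still fits into the current passage.
            current = candidate
        else:
            passages.append(current)
            current = para
    if current:
        passages.append(current)
    return passages


if __name__ == "__main__":
    sample = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
    for i, passage in enumerate(chunk_paragraphs(sample, max_chars=40)):
        print(i, repr(passage))
```

Each passage can then be indexed as its own document or nested field, so that retrieval returns a passage rather than the entire source document.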


### Optimizing context for cost and performance

The following recommendations can help you balance cost, latency, and result quality when working with different context sizes:
<definitions>
  <definition term="Optimize context length">
    Determine the optimal context length through empirical testing. Start with a baseline and adjust incrementally to find a balance that optimizes both response quality and system performance.
  </definition>
  <definition term="Implement token pruning for ELSER model">
    If you’re using the ELSER model, consider implementing token pruning, which omits non-significant tokens from the query to improve search performance. Refer to these relevant blog posts:
    - [Optimizing retrieval with ELSER v2](https://www.elastic.co/search-labs/blog/introducing-elser-v2-part-2)
    - [Improving text expansion performance using token pruning](https://www.elastic.co/search-labs/blog/text-expansion-pruning)
  </definition>
  <definition term="Monitor and adjust">
    Continuously monitor the effects of context size changes on performance and adjust as necessary.
  </definition>
</definitions>
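
To make these recommendations more concrete, the following Python sketch builds a search request body that limits the number of retrieved documents with `size` and enables token pruning on a `sparse_vector` query. Treat it as a sketch under assumptions: `build_elser_query` is a hypothetical helper, the field name `content_embedding` and the inference endpoint `my-elser-endpoint` are placeholders, and the pruning thresholds are illustrative values to tune through testing.

```python
import json


def build_elser_query(question: str, size: int = 3, prune: bool = True) -> dict:
    """Build a search request body for a sparse_vector query with optional pruning."""
    sparse_vector = {
        "field": "content_embedding",         # placeholder sparse_vector field
        "inference_id": "my-elser-endpoint",  # placeholder inference endpoint
        "query": question,
    }
    if prune:
        sparse_vector["prune"] = True
        sparse_vector["pruning_config"] = {
            # Illustrative thresholds; adjust them based on your own measurements.
            "tokens_freq_ratio_threshold": 5,
            "tokens_weight_threshold": 0.4,
            "only_score_pruned_tokens": False,
        }
    # "size" caps how many documents are retrieved and passed on as context.
    return {"size": size, "query": {"sparse_vector": sparse_vector}}


if __name__ == "__main__":
    print(json.dumps(build_elser_query("How do I reset my password?", size=2), indent=2))
```

Building the request with different `size` and `prune` values makes it easier to compare response quality and latency across context sizes.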