To ensure that communication with Fleet Server is encrypted, Fleet Server requires Elastic Agents to present a signed certificate. In a self-managed cluster, if you don’t specify certificates when you set up Fleet Server, self-signed certificates are generated automatically.

If you attempt to enroll an Elastic Agent in a Fleet Server with a self-signed certificate, you will encounter the following error:

		Error: fail to enroll: fail to execute request to fleet-server: x509: certificate signed by unknown authority
		Error: enroll command failed with exit code: 1

To fix this problem, pass the --insecure flag along with the enroll or install command. For example:

		sudo ./elastic-agent install --url=https://<fleet-server-ip>:8220 --enrollment-token=<token> --insecure
		
	

Traffic between Elastic Agents and Fleet Server over HTTPS is encrypted. By adding this flag, you are acknowledging that you understand that the certificate chain cannot be verified.

Allowing Fleet Server to generate self-signed certificates is useful to get things running for development, but not recommended in a production environment.

For more information, refer to Configure SSL/TLS for self-managed Fleet Servers.

Elastic Agent enrollment fails on the host with `x509: cannot validate certificate for x.x.x.x because it doesn't contain any IP SANs` message

To ensure that communication with Elasticsearch is encrypted, Fleet Server requires Elasticsearch to present a signed certificate.

This error occurs when Elasticsearch uses a self-signed certificate whose Common Name (CN) is an IP address. When it connects by IP address, Fleet Server checks the certificate's subject alternative names (SANs), which are empty in this case. To work around this situation, use the --fleet-server-es-insecure flag to deactivate certificate verification.

You will also need to set ssl.verification_mode: none in the Output settings in Fleet and Integrations UI.
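The check behind this error can be modeled in a few lines. The following Python sketch is a simplified illustration, not Fleet Server's actual code. It assumes the SAN list uses the `(type, value)` tuple format that Python's `ssl.SSLSocket.getpeercert()` returns, and the function name `ip_covered_by_sans` is hypothetical. It shows why a certificate that carries the IP address only in its CN fails validation.

```python
import ipaddress


def ip_covered_by_sans(host, subject_alt_names):
    """Return True if the IP literal `host` appears among the certificate's IP SANs.

    `subject_alt_names` is a sequence of (type, value) tuples, for example
    (("DNS", "es.example.internal"), ("IP Address", "10.0.0.5")).
    The Common Name is deliberately not consulted.
    """
    target = ipaddress.ip_address(host)
    for san_type, value in subject_alt_names:
        if san_type == "IP Address" and ipaddress.ip_address(value) == target:
            return True
    return False


# A certificate with the IP only in the CN has no SAN entries at all.
print(ip_covered_by_sans("10.0.0.5", ()))
# A certificate that lists the IP as a SAN passes the check.
print(ip_covered_by_sans("10.0.0.5", (("IP Address", "10.0.0.5"),)))
```

Under this model, a certificate that lists the IP address as a SAN would pass the check without turning off verification.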

Elastic Agent enrollment fails on the host with `Client.Timeout exceeded` message

To enroll in Fleet, Elastic Agent must connect to the Fleet Server instance. If the agent cannot connect, you get failures similar to these:

		fail to enroll: fail to execute request to Fleet Server:Post http://fleet-server:8220/api/fleet/agents/enroll?: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
		
	

Here are several steps to help you troubleshoot the problem.

Check for networking problems. From the host, run the ping command to confirm that it can reach the Fleet Server instance.

Additionally, curl the /api/status endpoint of Fleet Server:

curl -f http://<fleet-server-url>:8220/api/status

Verify that you have specified the correct Kibana Fleet settings URL and port for your environment.

By default, Fleet Server expects Elastic Agents to connect over HTTPS on port 8220, unless you have explicitly configured it otherwise.

Check that you specified a valid enrollment token during enrollment. To do this:
1. In Fleet, select Enrollment tokens.
2. To view the secret, click the eyeball icon. The secret should match the string that you used to enroll Elastic Agent on your host.
3. If the secret doesn’t match, create a new enrollment token and use this token when you run the elastic-agent enroll command.
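The connectivity check described above can also be scripted, which helps when you need to test several hosts. This is a minimal sketch using only the Python standard library. The function name `probe_fleet_status` and the default timeout are assumptions made for illustration, and only the /api/status path comes from the curl example.

```python
import urllib.error
import urllib.request


def probe_fleet_status(base_url, timeout=5.0):
    """Request <base_url>/api/status and return (ok, detail)."""
    url = base_url.rstrip("/") + "/api/status"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # The server answered, so the network path works; the status code says more.
        return False, f"HTTP {err.code}"
    except (urllib.error.URLError, OSError) as err:
        # DNS failure, refused connection, timeout, or TLS verification error.
        return False, f"connection problem: {err}"
```

For example, `probe_fleet_status("https://<fleet-server-url>:8220")` returns a tuple that separates "the server answered with an error" from "the server could not be reached". With a self-signed Fleet Server certificate, expect the second case, because the standard library verifies certificates by default.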

Elastic Agent enrollment fails on the host with `Error while dialing: open \\\\.\\pipe\\elastic-agent-system: The system cannot find the file specified` message

Elastic Agent might fail to install in a Windows environment due to port conflicts and file locks, returning this error:

		Restart attempt 2 failed: 'rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: open \\\\.\\pipe\\elastic-agent-system: The system cannot find the file specified.
		
	

To resolve port conflicts:

Check for any processes that are using port 6789 or 6790:
```
netstat -ano | findstr :6789
netstat -ano | findstr :6790
		
```
This will return the process ID (PID) of the application that's using the specified port. You can then identify the application using its PID:
```
tasklist /fi "pid eq <APP-PID>"
		
```
In case of a port conflict, update agent.grpc.port in the elastic-agent.yml file to bind the agent to a different port (for example, 6790).
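Before you change agent.grpc.port, it can help to confirm that the replacement port is actually free. The following Python sketch tries to bind to each candidate port and reports the first one that succeeds. The function names and the candidate range in the usage note are hypothetical, and a port that is free now can still be taken by another process later.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission problem.
            return False
    return True


def first_free_port(candidates):
    """Return the first port in `candidates` that is free, or None."""
    for port in candidates:
        if port_is_free(port):
            return port
    return None
```

For example, `first_free_port(range(6791, 6800))` returns a port number you could try for agent.grpc.port, or `None` if every candidate is taken.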

If you’re testing Fleet Server locally on a macOS system using localhost (https://127.0.0.1:8220) as the Host URL, you might encounter this error:

		Error: fail to enroll: fail to execute request to fleet-server:
		lookup My-MacBook-Pro.local: no such host

This can occur on newer macOS software. To resolve the problem, ensure that file sharing is enabled on your local system.

Elastic Agent hangs while unenrolling

When unenrolling Elastic Agent, Fleet waits for acknowledgment from the agent before it completes the unenroll process. If Fleet doesn’t receive an acknowledgment, the status hangs at unenrolling.

You can force unenroll an agent to invalidate all API keys related to the agent and change its status to inactive, so that the agent no longer appears in Fleet:

1. In Fleet, select Agents.
2. Under Agents, choose Unenroll agent from the Actions menu next to the agent you want to unenroll.
3. Click Force unenroll.

Elastic Agent is automatically unenrolled after failed check-ins with 401 errors

In Elastic Agent versions prior to 8.19.0 and 9.1.0, if an agent receives a 401 (Unauthorized) error on more than seven consecutive check-ins with Fleet Server, the agent is automatically unenrolled.

To resolve the issue:

Re-enroll the agent
- If the agent is still installed on the host, re-enroll it in Fleet to keep the agent's existing state, including any previously ingested data:
  1. Open the Agents tab, then click Add agent.
  2. In the Add agent flyout, select the agent policy in which to re-enroll the agent.
  3. In the Authentication settings section, select an enrollment token:
    - If one or more active enrollment tokens exist for your agent policy, select one from the dropdown.
    - If no active tokens exist, click Create enrollment token. For detailed instructions, refer to Create enrollment tokens.
  4. Make sure Enroll in Fleet is selected.
  5. Select the appropriate platform, then copy the elastic-agent install command from the UI, and replace install with enroll.
  6. Run the modified command with elevated privileges from the directory where the agent is installed. For example:
    sudo ./elastic-agent enroll --url=<fleet-server-url> --enrollment-token=<token>
    Refer to the command reference for details about the available options.
- If the agent is no longer installed on the host, reinstall and enroll it in Fleet. Refer to Install Fleet-managed Elastic Agents for detailed instructions.

Resolve the underlying issues

Investigate the cause of the 401 errors and resolve the underlying issues to ensure proper agent functionality.

401 errors during check-in typically indicate authentication or authorization problems. Common causes include:
- Expired or revoked API keys
- Incorrect Fleet Server configuration
- Issues with Elasticsearch authentication settings

Agents are no longer automatically unenrolled

The automatic unenrollment behavior is removed in Elastic Agent versions 8.19.0 and 9.1.0. Starting with these versions, Elastic Agents are no longer automatically unenrolled due to repeated 401 errors during check-in. When the issue causing the errors is resolved, the agents automatically reconnect to Fleet and resume ingesting data.

Elastic Agent upgrade fails on Windows with exit status `0xc0000142`

During an Elastic Agent upgrade on Windows, Elastic Agent spawns a "watcher" process that monitors the upgrade process. Windows attempts to create a temporary console for this process. If Windows can't create this console, the watcher process initialization fails with error code 0xc0000142 (STATUS_DLL_INIT_FAILED), resulting in an upgrade failure. Elastic Agent logs this error at the info level.

The error is caused by Windows desktop heap exhaustion. When Elastic Agent runs as a Windows service application, it uses the service desktop, and shares the desktop heap with other running services. If a service process is using windowing resources, but is failing to release them, this can exhaust the desktop heap and affect Elastic Agent.

Note

Interactively-run instances of elastic-agent.exe are not subject to this limitation. Only instances running as a service are potentially affected.

To resolve the issue, try these tips:

Update Elastic Agent immediately after a system reboot

A system reboot destroys and recreates the desktop heap, resolving any prior exhaustion. Because many memory leaks are gradual, updating Elastic Agent immediately after a system reboot might allow the upgrade to complete before the memory-leaking application exhausts the desktop heap.

Tip

A cold startup resets kernel memory, but a fast startup or a wake from hibernation does not. A regular reboot (for example, shutdown /r /t 0) results in a cold startup, and resets the desktop heap.

Update third-party service applications

Because standard Windows tools such as Task Manager and Process Explorer do not attribute desktop heap usage to individual applications, consider updating all third-party processes that run as a service. To list these applications, use the following PowerShell command:
```
PS C:\> Get-Process | Where {$_.SI -eq 0} | Where {$_.MainModule.FileVersionInfo.ProductName -and (-not (($_.MainModule.FileVersionInfo.CompanyName -eq "Microsoft Corporation") -and ($_.MainModule.FileVersionInfo.ProductName -like "*Windows*"))) } | ForEach-Object { $_.MainModule.FileVersionInfo.ProductName + ' - ' + $_.Path }
		
```
You can then install any updates from the listed applications' manufacturers.

Stop or uninstall third-party service applications

You can try terminating or uninstalling non-critical third-party service applications before updating Elastic Agent. Terminating a process releases its desktop heap resources.

Note that the Elastic Agent update process does not require a significant amount of desktop heap resources, so a successful Elastic Agent update following the termination or uninstallation of a service application does not necessarily mean that the application was exhausting the desktop heap.

Resize the desktop heap

As a short-term solution, follow the steps described in the Microsoft guide to increase the size of the desktop heap. If a service application is causing a memory leak, increasing the size of the desktop heap might only postpone the desktop heap exhaustion.

Elastic Agent unenroll fails

In Fleet, if you delete an Elastic Agent policy that is associated with one or more inactive enrolled agents, those agents cannot be unenrolled once they return to a Healthy or Offline state. Attempting to unenroll such an agent results in an Error unenrolling agent message, and the unenrollment fails.

To resolve this problem, you can use the Kibana Fleet APIs to force unenroll the agent.

To unenroll a single Elastic Agent:

		POST kbn:/api/fleet/agents/<agent_id>/unenroll
		{
		  "force": true,
		  "revoke": true
		}

To bulk unenroll a set of Elastic Agents:

		POST kbn:/api/fleet/agents/bulk_unenroll
		{
		  "agents": ["<agent_id1>", "<agent_id2>"],
		  "force": true,
		  "revoke": true
		}

We are also updating the Fleet UI to prevent removal of an Elastic Agent policy that is currently associated with any inactive agents.
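The single-agent request above uses Kibana console syntax. When you call the same endpoint from a script, the request needs a full Kibana URL and a `kbn-xsrf` header. The following Python sketch only builds the request object so you can inspect it before sending. The function name is hypothetical, and authenticating with an API key in the Authorization header is an assumption, so adapt it to however your deployment authenticates.

```python
import json
import urllib.request


def build_force_unenroll_request(kibana_url, agent_id, api_key):
    """Build (but do not send) a POST request that force unenrolls one agent."""
    body = json.dumps({"force": True, "revoke": True}).encode("utf-8")
    return urllib.request.Request(
        url=f"{kibana_url.rstrip('/')}/api/fleet/agents/{agent_id}/unenroll",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "kbn-xsrf": "true",  # Kibana requires this header on write requests
            "Authorization": f"ApiKey {api_key}",  # assumption: API key authentication
        },
    )
```

To send the request, pass the returned object to `urllib.request.urlopen()`.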

Uninstalling Elastic Endpoint fails

When you uninstall Elastic Agent, all the programs managed by Elastic Agent, such as Elastic Endpoint, are also removed. If uninstalling fails, Elastic Endpoint might remain on your system.

To remove Elastic Endpoint, run the following commands:

macOS

		cd /tmp
		cp /Library/Elastic/Endpoint/elastic-endpoint elastic-endpoint
		sudo ./elastic-endpoint uninstall
		rm elastic-endpoint

Linux

		cd /tmp
		cp /opt/Elastic/Endpoint/elastic-endpoint elastic-endpoint
		sudo ./elastic-endpoint uninstall
		rm elastic-endpoint

Windows

		cd %TEMP%
		copy "c:\Program Files\Elastic\Endpoint\elastic-endpoint.exe" elastic-endpoint.exe
		.\elastic-endpoint.exe uninstall
		del .\elastic-endpoint.exe

Elastic Agent status

Retrieve the Elastic Agent version

If you installed Elastic Agent, run the following command (the example is for POSIX-based systems):
```
elastic-agent version
		
```
If you have not installed the Elastic Agent and you are running it as a temporary process, you can run:
```
./elastic-agent version
		
```
Note

Both commands are also available on Windows and macOS, with slight OS-specific variations in how you call them. If needed, refer to Install Elastic Agents for examples of how to adjust them.

Check the Elastic Agent status

Run the following command to view the current status of the Elastic Agent.

elastic-agent status

Based on the information returned, you can take further action.

If Elastic Agent is running, but you do not get what you expect, here are some items to review:

- In Fleet, click Agents. Check which policy is associated with the running Elastic Agent. If it is not the policy you expected, you can change it.
- In Fleet, click Agents, and then select the Elastic Agent policy. Check for the integrations that should be included.
  
  For example, if you want to include system data, make sure the System integration is included in the policy.
- Confirm that the Collect agent logs and Collect agent metrics options are selected:
  1. In Fleet, click Agents, and then select the Elastic Agent policy.
  2. Select the Settings tab. If you want to collect agent logs or metrics, select these options.
    
    Important
    
    The Elastic Cloud agent policy is created only in Elastic Cloud deployments and, by default, does not include the collection of logs or metrics.

Some problems occur so early that insufficient logging is available

If some problems occur early and insufficient logging is available, run the following command:

./elastic-agent install -f

The stand-alone install command installs the Elastic Agent, and all of the service configuration is set up. You can now run the enrollment command. For example:

		elastic-agent enroll --fleet-server-es=https://<es-url>:443 --fleet-server-service-token=<token> --fleet-server-policy=<policy-id>
		
	

Note: Port 443 is commonly used in Elastic Cloud. However, with self-managed deployments, your Elasticsearch might run on port 9200 or something entirely different.

For information on where to find agent logs, refer to our FAQ.

Elastic Agent is reported as `Healthy` but still has problems sending data to Elasticsearch

To confirm that the Elastic Agent is running and its status is Healthy, select the Agents tab.

If you previously selected the Collect agent logs option, you can now look at the agent logs.
Click the agent name and then select the Logs tab.

If no logs are displayed, it suggests a communication problem between your host and Elasticsearch. One possible reason is that the port is already in use.
You can check the port usage using tools like Wireshark or netstat. On a POSIX system, you can run the following command:
```
netstat -nat | grep :8220
		
```
Any output indicates that the port is in use. This is expected if Fleet Server is running on the host, but unexpected if you intended to uninstall Fleet Server. In that case, re-check the uninstallation and continue.

Elastic Agent is stuck in status `Updating`

A stuck Elastic Agent upgrade should be detected automatically, and you can restart the upgrade from Fleet.

Authentication and access

Elasticsearch authentication service fails with `Authentication using apikey failed` message

To save API keys and encrypt them in Elasticsearch, Fleet requires an encryption key.

To provide an encryption key, set the xpack.encryptedSavedObjects.encryptionKey property in the kibana.yml configuration file.

xpack.encryptedSavedObjects.encryptionKey: "something_at_least_32_characters"
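Rather than typing a key by hand, you can generate a random one. This is a minimal sketch that uses Python's `secrets` module. The function name is hypothetical, and the only requirement taken from the text above is that the value has at least 32 characters.

```python
import secrets


def generate_encryption_key(num_bytes=16):
    """Return a random hex string; 16 random bytes give 32 characters."""
    if num_bytes < 16:
        raise ValueError("need at least 16 bytes to produce 32 hex characters")
    return secrets.token_hex(num_bytes)


# Print a line that can be pasted into kibana.yml.
key = generate_encryption_key()
print(f'xpack.encryptedSavedObjects.encryptionKey: "{key}"')
```

Store the generated value securely, because Kibana needs the same key to read the saved objects it has already encrypted.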

Elastic Agent fails with `Agent process is not root/admin or validation failed` message

Ensure the user running Elastic Agent has root privileges as some integrations require root privileges to collect sensitive data.

If you’re running Elastic Agent in the foreground (and not as a service) on Linux or macOS, run the agent under the root user: sudo or su.

If you’re using the Elastic Defend integration, make sure you’re running Elastic Agent under the SYSTEM account.

Tip

If you install Elastic Agent as a service as described in Install Elastic Agents, Elastic Agent runs under the SYSTEM account by default.

To run Elastic Agent under the SYSTEM account, you can do the following:

1. Download PsExec and extract the contents to a folder. For example, d:\tools.
2. Open a command prompt as an Administrator (right-click the command prompt icon and select Run As Administrator).
3. From the command prompt, run Elastic Agent under the SYSTEM account:
  
  d:\tools\psexec.exe -sid "C:\Program Files\Elastic-Agent\elastic-agent.exe" run

API key is unauthorized to send telemetry to `.logs-endpoint.diagnostic.collection-*` indices

By default, telemetry is turned on in the Elastic Stack to help us learn about the features that our users are most interested in. This helps us to focus our efforts on making features even better.

If you’ve recently upgraded from version 7.10 to 7.11, you might see the following message when you view Elastic Defend logs:

		action [indices:admin/auto_create] is unauthorized for API key id [KbvCi3YB96EBa6C9k2Cm]
		of user [fleet_enroll] on indices [.logs-endpoint.diagnostic.collection-default]

The above message indicates that Elastic Endpoint does not have the correct permissions to send telemetry. This is a known problem in 7.11 that will be fixed in an upcoming patch release.

To remove this message from your logs, you can turn off telemetry for the Elastic Defend integration until the next patch release is available.

1. In Kibana, click Integrations, and then select the Manage tab.
2. Click Elastic Defend, and then select the Policies tab to view all the installed integrations.
3. Click the integration to edit it.
4. Under advanced settings, set windows.advanced.diagnostic.enabled to false, and then save the integration.

Error when running Elastic Agent commands with `sudo`

On Linux systems, when you install Elastic Agent without administrative privileges, that is, using the --unprivileged flag, Elastic Agent commands should not be run with sudo. Doing so can result in an error due to the agent not having the required privileges.

For example, when Elastic Agent is installed with the --unprivileged flag, running the elastic-agent inspect command with sudo results in an error like the following:

		Error: error loading agent config: error loading raw config: fail to read configuration /Library/Elastic/Agent/fleet.enc for the elastic-agent: fail to decode bytes: cipher: message authentication failed
		
	

To resolve this, either install Elastic Agent without the --unprivileged flag so that it has administrative access, or run the Elastic Agent commands without the sudo prefix.

Fleet Server and Elastic Agent

On Fleet Server startup, ERROR seen with `State changed to CRASHED: exited with code: 1`

You might get this error message for a number of different reasons. A common reason is when attempting production-like usage and the ca.crt file passed in cannot be found. To verify if this is the problem, bootstrap Fleet Server without passing a ca.crt file. This implies you would test any subsequent Elastic Agent installs temporarily with Fleet Server's own self-signed cert.

Tip

Be sure to pass the full path to the ca.crt file. A relative path does not work.

You can tell that Fleet Server is using its testing-oriented self-signed certificate when you see the following error during Elastic Agent installs:

		Error: fail to enroll: fail to execute request to fleet-server: x509: certificate signed by unknown authority
		Error: enroll command failed with exit code: 1

To install or enroll an Elastic Agent against a Fleet Server that uses a self-signed certificate, add the --insecure option to the command:

		sudo ./elastic-agent install --url=https://<fleet-server-ip>:8220 --enrollment-token=<token> --insecure
		
	

For more information, refer to Elastic Agent enrollment fails on the host with x509: certificate signed by unknown authority message.

Fleet Server is running and healthy with data, but other Agents cannot use it to connect to Elasticsearch

Some settings are only used when you have multiple Elastic Agents. If this is the case, check to be sure that the hosts can communicate with the Fleet Server.

From the non-Fleet Server host, run the following command:

curl -f http://<fleet-server-ip>:8220/api/status

The response might yield errors that you can debug further, or it might work and show that communication ports and networking are not the problems.

One common problem is that the default Fleet Server port of 8220 isn’t open on the Fleet Server host to communicate. You can review and correct this using common tools in alignment with any networking and security concerns that you have.

Elastic Agent and integrations

Integration policy upgrade has too many conflicts

If you try to upgrade an integration policy that is several versions old, there might be substantial conflicts or configuration issues. You might save time by creating a new policy, testing it, and rolling out the integration upgrade to additional hosts rather than trying to fix these problems.

After upgrading the integration:

1. Create a new policy.
2. Add the integration to the policy. The later version is automatically used.
3. Apply the policy to an Elastic Agent.
  
  Tip
  
  In larger deployments, you should test integration upgrades on a sample Elastic Agent before rolling out a larger upgrade initiative. Only after a small trial is deemed successful should the updated policy be rolled out to all hosts.
4. Roll out the integration update to additional hosts:
  1. In Fleet, click Agent policies. Click the name of the policy you want to edit.
  2. Search or scroll to a specific integration. Open the Actions menu and select Delete integration.
  3. Click Add integration and re-add the freshly deleted integration. The updated version will be used and applied to all Elastic Agents.
  4. Repeat this process for each policy with the out-of-date integration.
    
    Note
    
    In some instances, for example, when there are hundreds or thousands of different Elastic Agents and policies that need to be updated, this upgrade path is not feasible. In this case, update one policy and use the Copy a policy action to apply the updated policy versions to additional policies. The downside of this method is that you lose the ability to assess integration version changes individually across policies.

Elastic Agents are unable to connect after removing the Fleet Server integration

When you use Fleet-managed Elastic Agent, at least one Elastic Agent needs to be running the Fleet Server integration. If the policy containing this integration is accidentally removed from that Elastic Agent, none of the other agents can be managed. However, the Elastic Agents will continue to send data to their configured output.

There are two approaches to fixing this issue, depending on whether or not the Elastic Agent that was running the Fleet Server integration is still installed and healthy (but is now running another policy).

To recover the Elastic Agent:

1. In Fleet, open the Agents tab and click Add agent.
2. In the Add agent flyout, select an agent policy that contains the Fleet Server integration. On Elastic Cloud you can use the Elastic Cloud agent policy which includes the integration.
3. Follow the instructions in the flyout, and stop before running the CLI commands.
4. Depending on the state of the original Fleet Server Elastic Agent, do one of the following:
- The original Fleet Server Elastic Agent is still running and healthy
  
  In this case, you only need to re-enroll the agent with Fleet:
  1. Copy the elastic-agent install command from the Kibana UI.
  2. In the command, replace install with enroll.
  3. In the directory where Elastic Agent is running (for example /opt/Elastic/Agent/ on Linux), run the command as root.
    
    For example, if Kibana gives you the command:
    sudo ./elastic-agent install --url=https://fleet-server:8220 --enrollment-token=bXktc3VwZXItc2VjcmV0LWVucm9sbWVudC10b2tlbg==
    Instead run:
    sudo ./elastic-agent enroll --url=https://fleet-server:8220 --enrollment-token=bXktc3VwZXItc2VjcmV0LWVucm9sbWVudC10b2tlbg==
- The original Fleet Server Elastic Agent is no longer installed
  
  In this case, you need to install the agent again:
  1. Copy the commands from the Kibana UI. The commands don’t need to be changed.
  2. Run the commands in order. The first three commands will download a new Elastic Agent install package, expand the archive, and change directories.
    
    The final command will install Elastic Agent. For example:
    sudo ./elastic-agent install --url=https://fleet-server:8220 --enrollment-token=bXktc3VwZXItc2VjcmV0LWVucm9sbWVudC10b2tlbg==

After running these steps your Elastic Agents should be able to connect with Fleet again.

illegal_argument_exception when TSDB is enabled

When you use an Elastic Agent integration in which TSDB (Time Series Database) is enabled, you might encounter an illegal_argument_exception error in the Fleet UI.

This can occur if you have a component template defined that includes a _source attribute, which conflicts with the _source: synthetic setting used when TSDB is enabled.

For details about the error and how to resolve it, refer to the section Runtime fields cannot be used in TSDB indices in the Innovation Hub article TSDB enabled integrations for Elastic Agent.

The `/api/fleet/setup` endpoint can’t reach the package registry to install Integrations

To install Integrations, the Fleet app requires a connection to an external service called the Elastic Package Registry.

For this to work, the Kibana server must connect to https://epr.elastic.co on port 443.

OpenTelemetry Collectors in Fleet

OTel Collector doesn't appear in the Fleet UI

Symptoms

Your OTel Collector is running but doesn't appear in the Fleet Agents list, or you see OpAMP-related errors in the collector logs.

Resolution

Verify the OpAMP configuration in your OTel Collector configuration file:
```
extensions:
  opamp:
    server:
      http:
        endpoint: https://fleet-server:8220/v1/opamp
        headers:
          Authorization: ApiKey <fleet-enrollment-api-key>
    instance_uid: <instance-uid>

service:
  extensions: [opamp]
		
```
Ensure:
- The endpoint URL includes the /v1/opamp path
- The instance_uid is a valid UUID v7
- The enrollment API key is correct

Note

On Elastic Cloud Hosted and Serverless Observability projects, the Fleet Server URL is provided by the platform. Find it in the Add collector flyout in Fleet, or in the Fleet Server hosts section on the Fleet → Settings page. For example, https://<fleet-server-host-url>/v1/opamp.

Check the collector logs for OpAMP errors:
```
error opampextension ... OpAMP server returned an error response
		
```
If you see enrollment errors, make sure you're using the Fleet enrollment API key provided in the UI when you start adding an OTel Collector.

Test network connectivity to Fleet Server:
```
curl -v https://fleet-server:8220/v1/opamp
		
```
You should receive a response from the server. If the connection fails, verify firewall rules and network access.

Restart the collector after making configuration changes to apply the new settings.
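The requirement that instance_uid is a valid UUID v7 is easy to get wrong, because many tools generate version 4 UUIDs by default. The following Python sketch checks the format of a candidate value and, for Python versions that lack a built-in generator, builds a version 7 UUID by hand following the RFC 9562 bit layout. Both function names are hypothetical, and the check covers only the format, not whether Fleet accepts a particular value.

```python
import os
import time
import uuid


def is_uuid7(value):
    """Return True if `value` parses as an RFC 9562 version 7 UUID."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    return parsed.variant == uuid.RFC_4122 and parsed.version == 7


def new_uuid7():
    """Build a version 7 UUID: 48-bit millisecond timestamp plus random bits."""
    timestamp_ms = (time.time_ns() // 1_000_000) & ((1 << 48) - 1)
    random_bits = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = random_bits >> 68                # top 12 bits
    rand_b = random_bits & ((1 << 62) - 1)    # low 62 bits
    value = (
        (timestamp_ms << 80)
        | (0x7 << 76)     # version field
        | (rand_a << 64)
        | (0b10 << 62)    # RFC variant bits
        | rand_b
    )
    return uuid.UUID(int=value)
```

For example, `is_uuid7(str(new_uuid7()))` is true, while a value produced by `uuid.uuid4()` is rejected.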

Internal telemetry metrics don't appear in Kibana

Symptoms

Your OTel Collector appears in Fleet, but CPU and memory usage aren't displayed, or the OTel Collector internal telemetry dashboards show no data.

Resolution

Internal telemetry uses a self-loop pattern: the collector emits its own metrics, logs, and traces to an OTLP receiver on the collector itself, and a pipeline forwards that data to a backend. To make internal telemetry appear in Kibana, extend your existing collector configuration with the following components:

Configure telemetry export in your service.telemetry section:

		service:
		  telemetry:
		    resource:
		      service.instance.id: "<your-instance-uid>"
		    metrics:
		      level: detailed
		      readers:
		        - periodic:
		            interval: 3000
		            exporter:
		              otlp:
		                protocol: grpc
		                endpoint: http://localhost:4317
		    logs:
		      processors:
		        - batch:
		            exporter:
		              otlp:
		                protocol: grpc
		                endpoint: http://localhost:4317

Important

Use http://localhost:4317 (with the http:// prefix) to ensure plaintext gRPC communication with your receiver.

Verify the OTLP receiver is configured:

		receivers:
		  otlp:
		    protocols:
		      grpc:
		        endpoint: 0.0.0.0:4317

Add an exporter that sends telemetry to your Elasticsearch backend. If your collector already exports telemetry, you can reuse the existing exporter and add the OTLP receiver to its pipelines. Otherwise, configure the elasticsearch/otel exporter:
```
exporters:
  elasticsearch/otel:
    endpoints: [https://elasticsearch:9200]
    api_key: "<your-api-key>"
    mapping:
      mode: otel

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [elasticsearch/otel]
    logs:
      receivers: [otlp]
      exporters: [elasticsearch/otel]
		
```
Note

On Serverless Observability projects and Elastic Cloud Hosted deployments, you can replace the elasticsearch/otel exporter with the Elastic Cloud Managed OTLP Endpoint, which accepts OTLP data directly and uses an APM-scoped API key. For more details, refer to Elastic Cloud Managed OTLP Endpoint (mOTLP).

Restart the collector and verify that internal telemetry appears in Kibana by searching for service.instance.id: "<your-instance-uid>" in Discover.

TLS handshake errors when exporting telemetry

Symptoms

You see errors like:

		tls: first record does not look like a TLS handshake
		connection refused

Resolution

This occurs when the gRPC client expects TLS but the receiver uses plaintext. To fix the issue:

For internal metrics/logs sent to the collector's own receiver (self-loop telemetry), use the http:// prefix:

		service:
		  telemetry:
		    metrics:
		      readers:
		        - periodic:
		            exporter:
		              otlp:
		                endpoint: http://localhost:4317  # http:// = plaintext

Alternatively, if not using the http:// prefix, explicitly turn off TLS:

		service:
		  telemetry:
		    metrics:
		      readers:
		        - periodic:
		            exporter:
		              otlp:
		                endpoint: localhost:4317
		                tls:
		                  insecure: true  # Turns off TLS

Important

For external endpoints (for example, sending to Elastic Cloud Hosted), always use https://. Never use tls.insecure: true for external endpoints. Using this setting turns off TLS entirely, transmitting data unencrypted and exposing your connection to interception. To skip only certificate verification (still discouraged for production), use tls.insecure_skip_verify: true instead.

Authentication errors when exporting to Elasticsearch

Symptoms

You see errors like:

		flush failed (401): unauthorized
		flush failed (403): security_exception

Resolution

The Elasticsearch exporter requires proper authentication and permissions:

Create an API key:
1. In Kibana, enter API keys in the global search field.
2. Click Create API key.
3. Use an API key with default privileges or, optionally, configure specific privileges for the key:
```
{
  "otel_writer": {
    "cluster": ["monitor", "manage_ilm"],
    "indices": [
      {
        "names": ["metrics-*", "logs-*", "traces-*"],
        "privileges": ["create_index", "write", "auto_configure"]
      }
    ]
  }
}
		
```
  Tip
  
  The privileges shown allow writing to all metrics-*, logs-*, and traces-* data streams. To scope access more narrowly, replace the names patterns with the specific data streams you ingest — for example, metrics-otel-<your-namespace>-*, logs-otel-<your-namespace>-*, and traces-otel-<your-namespace>-*.
4. Copy the encoded API key value.

Update your exporter configuration:

    exporters:
      elasticsearch/otel:
        endpoints: [https://elasticsearch:9200]
        api_key: "<your-base64-encoded-api-key>"

Restart the collector and verify that data flows without authentication errors.
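
If authentication errors persist, it can help to confirm that the value in api_key is the encoded form of the key rather than the raw ID or secret. The following is a minimal sketch, assuming that an encoded Elasticsearch API key is the Base64 encoding of `<id>:<api_key>`. The function name `looks_like_encoded_api_key` is hypothetical and not part of any Elastic tooling.

```shell
# Hypothetical helper: returns 0 if the argument decodes from Base64
# to a string of the form "<id>:<api_key>", and 1 otherwise.
looks_like_encoded_api_key() {
  local decoded
  # --decode is accepted by both GNU coreutils and macOS base64.
  decoded="$(printf '%s' "$1" | base64 --decode 2>/dev/null)" || return 1
  case "$decoded" in
    ?*:?*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Example usage (API_KEY holds the value copied from Kibana):
#   looks_like_encoded_api_key "$API_KEY" && echo "format looks right"
#
# To check the key against the cluster itself (requires network access;
# the URL is the one from the exporter configuration above):
#   curl -H "Authorization: ApiKey $API_KEY" https://elasticsearch:9200/_security/_authenticate
```

The format check only catches copy-and-paste mistakes. A 401 response from the authenticate request means that the cluster rejects the key itself, while a 403 during export points to missing privileges on the key.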

EDOT Collector fails to start with permission denied error

Symptoms

When running the EDOT Collector standalone, you see:

    error creating listener: listen unix /tmp/elastic-agent/...: bind: permission denied

Resolution

The elastic_diagnostics extension requires a writable directory. To fix this issue:

Create the required directory:

    sudo mkdir -p /tmp/elastic-agent
    sudo chmod 755 /tmp/elastic-agent

Configure the extension in your collector configuration:

    extensions:
      elastic_diagnostics:
        endpoint: localhost:8888
      opamp:
        # ... your OpAMP config

Restart the collector. The elastic_diagnostics extension should now start successfully.

TLS certificate verification errors when connecting to Fleet Server

Symptoms

Your OTel Collector can't connect to Fleet Server and you see TLS certificate verification errors in the logs:

    x509: certificate signed by unknown authority
    tls: failed to verify certificate

Resolution

When Fleet Server uses a self-signed certificate or a certificate from a non-public Certificate Authority (CA), you need to configure the OpAMP extension to trust it.

Option 1: Provide the CA certificate (recommended)

Obtain the CA certificate that signed your Fleet Server certificate, and save it to a file (for example, ca.crt).

Configure the OpAMP extension to use the CA certificate:

    extensions:
      opamp:
        server:
          http:
            endpoint: https://fleet-server:8220/v1/opamp
            tls:
              ca_file: /path/to/ca.crt
            headers:
              Authorization: ApiKey <fleet-enrollment-api-key>
        instance_uid: <instance-uid>

    service:
      extensions: [opamp]

Restart the collector to apply the changes.
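
If the error persists after the restart, it is worth checking that the file you saved is the CA that actually signed the Fleet Server certificate. The following is a minimal sketch using the openssl command-line tool, assuming it is installed on the host. The function name `ca_cert_summary` is hypothetical, and `ca.crt` and `fleet-server:8220` are the example values used above.

```shell
# Hypothetical helper: prints subject, issuer, and expiry date of a PEM
# certificate file. Fails if the file is missing or is not a certificate.
ca_cert_summary() {
  openssl x509 -in "$1" -noout -subject -issuer -enddate
}

# Example usage (ca.crt is the file saved in the first step):
#   ca_cert_summary ca.crt
#
# To check the chain that Fleet Server presents against that CA
# (requires network access to Fleet Server):
#   openssl s_client -connect fleet-server:8220 -CAfile ca.crt </dev/null
```

In the s_client output, a successful validation is reported as `Verify return code: 0 (ok)`. Any other return code suggests that the file is not the CA that signed the certificate presented by Fleet Server.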

Option 2: Skip certificate verification (testing only)

For rapid prototyping or testing purposes only, you can skip certificate verification. Do not use this in production environments.

    extensions:
      opamp:
        server:
          http:
            endpoint: https://fleet-server:8220/v1/opamp
            tls:
              insecure_skip_verify: true
            headers:
              Authorization: ApiKey <fleet-enrollment-api-key>
        instance_uid: <instance-uid>

    service:
      extensions: [opamp]

Warning

Using insecure_skip_verify: true skips TLS certificate verification and makes your connection vulnerable to man-in-the-middle attacks. Only use this for testing in isolated environments.

For more details on TLS configuration, refer to Configure TLS for Fleet Server connection.

Elastic Cloud and Kibana

Fleet in Kibana crashes

To investigate the error, open your browser’s developer console.
Select the Network tab, and refresh the page.

One of the requests to the Fleet API will most likely have returned an error. If the error message doesn’t give you enough information to fix the problem, contact us in the Discuss forum.

Hosted Elastic Agent is offline

To scale the Fleet Server deployment, Elastic Cloud starts new containers when more hosted Elastic Agents are required and shuts down old ones when they are no longer needed. The Elastic Agents from containers that were shut down appear as offline in the Agents list for 24 hours and then disappear automatically.

Elastic Agents hosted on Elastic Cloud are stuck in `Updating` or `Offline`

In Elastic Cloud, after upgrading Fleet Server and its integration policies, agents enrolled in the Elastic Cloud agent policy might experience issues updating. To resolve this problem:

To reset the Elastic Cloud agent policy, run this cURL request in a terminal window, providing your Kibana superuser credentials:

    curl -u <username>:<password> --request POST \
      --url <kibana_url>/internal/fleet/reset_preconfigured_agent_policies/policy-elastic-agent-on-cloud \
      --header 'content-type: application/json' \
      --header 'kbn-xsrf: xyz' \
      --header 'elastic-api-version: 2023-10-31'

Force unenroll the agent stuck in Updating:

To find the agent’s ID, go to Fleet > Agents and click the agent to see its details. Copy the Agent ID.

In a terminal window, run:

    curl -u <username>:<password> --request POST \
      --url <kibana_url>/api/fleet/agents/<agentID>/unenroll \
      --header 'content-type: application/json' \
      --header 'kbn-xsrf: xx' \
      --data-raw '{"force":true,"revoke":true}' \
      --compressed

Where <agentID> is the ID you copied in the previous step.

Restart the Integrations Server:

In the Elastic Cloud console under Integrations Server, click Force Restart.

When using Elastic Cloud, Fleet Server is not listed in Kibana

If Fleet Server does not appear in Kibana, make sure that it’s set up.

To set up Fleet Server on Elastic Cloud:

Go to your deployment on Elastic Cloud.
Follow the Elastic Cloud prompts to set up Integrations Server. Once complete, the Fleet Server Elastic Agent will show up in Fleet.

To enable Fleet and set up Fleet Server on a self-managed cluster:

In the Elasticsearch configuration file, config/elasticsearch.yml, set the following security settings to enable security and API keys:
```
xpack.security.enabled: true
xpack.security.authc.api_key.enabled: true
```
In the Kibana configuration file, config/kibana.yml, enable Fleet and specify your user credentials:
```
xpack.encryptedSavedObjects.encryptionKey: "something_at_least_32_characters"
elasticsearch.username: "my_username"
elasticsearch.password: "my_password"
```
For elasticsearch.username, specify a user who is authorized to use Fleet.
To set up passwords, you can use the documented Elasticsearch APIs or the elasticsearch-setup-passwords command. For example: ./bin/elasticsearch-setup-passwords auto

After running the command:
1. Copy the Elastic user name to the Kibana configuration file.
2. Restart Kibana.
3. Follow the documented steps for setting up a self-managed Fleet Server. For more information, refer to What is Fleet Server?.

Elastic Agent on Kubernetes

Elastic Agent Out of Memory errors on Kubernetes

In a Kubernetes environment, Elastic Agent might be stopped with reason OOMKilled due to inadequate available memory.

To detect the problem, run the kubectl describe pod command and check the results for the following content:

    Last State:   Terminated
    Reason:       OOMKilled
    Exit Code:    137

To resolve the problem, allocate additional memory to the agent and then restart it.
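
The check and the fix above can be scripted. The following is a minimal sketch, not an official procedure: the function name `was_oom_killed` is hypothetical, the Pod and DaemonSet names are placeholders, and the 1Gi limit is only an illustrative value that you should size for your own workload.

```shell
# Hypothetical helper: reads `kubectl describe pod` output on stdin and
# succeeds if a container was last terminated with reason OOMKilled.
was_oom_killed() {
  grep -Eq 'Reason:[[:space:]]+OOMKilled'
}

# Example usage (names in angle brackets are placeholders):
#   kubectl describe pod -n kube-system <name_of_elastic_agent_pod> | was_oom_killed \
#     && echo "agent was OOMKilled"
#
# Raise the memory limit on the DaemonSet that runs the agent. Changing
# the Pod template causes the Pods to be recreated with the new limit:
#   kubectl set resources daemonset <daemonset_name> -n kube-system --limits=memory=1Gi
```

If you manage the agent through a manifest kept in version control, changing the limits in the manifest and reapplying it keeps the deployed state and the manifest in sync.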

Troubleshoot Elastic Agent installation on Kubernetes, with Kustomize

Potential issues during Elastic Agent installation on Kubernetes can be categorized into two main areas:

Problems related to the creation of objects within the manifest.
Failures occurring within specific components after installation.

Problems related to the creation of objects within the manifest

When troubleshooting installations performed with Kustomize, it’s good practice to inspect the output of the rendered manifest. To do this, take the installation command provided by Kibana Onboarding and replace the final part, | kubectl apply -f-, with a redirection to a local file. This allows for easier analysis of the rendered output.

For example, the following command, originally provided by Kibana for an Elastic Agent Standalone installation, has been modified to redirect the output for troubleshooting purposes:

    kubectl kustomize https://github.com/elastic/elastic-agent/deploy/kubernetes/elastic-agent-kustomize/default/elastic-agent-standalone\?ref\=v8.15.3 | sed -e 's/JUFQSV9LRVkl/ZDAyNnZaSUJ3eWIwSUlCT0duRGs6Q1JfYmJoVFRUQktoN2dXTkd0FNMtdw==/g' -e "s/%ES_HOST%/https:\/\/7a912e8674a34086eacd0e3d615e6048.us-west2.gcp.elastic-cloud.com:443/g" -e "s/%ONBOARDING_ID%/db687358-2c1f-4ec9-86e0-8f1baa4912ed/g" -e "s/\(docker.elastic.co\/beats\/elastic-agent:\).*$/\18.15.3/g" -e "/{CA_TRUSTED}/c\ " > elastic_agent_installation_complete_manifest.yaml

The previous command generates a local file named elastic_agent_installation_complete_manifest.yaml, which you can use for further analysis. It contains the complete set of resources required for the Elastic Agent installation, including:

RBAC objects (ServiceAccounts, Roles, etc.)
ConfigMaps and Secrets for Elastic Agent configuration
Elastic Agent Standalone deployed as a DaemonSet
Kube-state-metrics deployed as a Deployment

The content of this file is equivalent to what you’d obtain by following the Run Elastic Agent Standalone on Kubernetes steps, with the exception that kube-state-metrics is not included in the standalone method.

Possible issues

If your user doesn’t have cluster-admin privileges, creation of the RBAC resources might fail.
Some Kubernetes security mechanisms (like Pod Security Standards) could cause part of the manifest to be rejected, as hostNetwork access and hostPath volumes are required.
If you already have an installation of kube-state-metrics, part of the manifest installation could fail, or your existing resources could be updated without notice.

Failures occurring within specific components after installation

If the installation is correct and all resources are deployed, but data is not flowing as expected (for example, you don’t see any data on the [Metrics Kubernetes] Cluster Overview dashboard), check the following items:

Check resources status and ensure they are all in a Running state:
```
kubectl get pods -n kube-system | grep elastic
kubectl get pods -n kube-system | grep kube-state-metrics
```
Note

The default configuration assumes that both kube-state-metrics and the Elastic Agent DaemonSet are deployed in the same namespace for communication purposes. If you change the namespace of any of the components, the agent configuration will need further policy updates.

Describe the Pods if they are in a Pending state:

    kubectl describe pod -n kube-system <name_of_elastic_agent_pod>

Check the logs of the Elastic Agent and kube-state-metrics Pods, and look for errors or warnings:

    kubectl logs -n kube-system <name_of_elastic_agent_pod>
    kubectl logs -n kube-system <name_of_elastic_agent_pod> | grep -i error
    kubectl logs -n kube-system <name_of_elastic_agent_pod> | grep -i warn

    kubectl logs -n kube-system <name_of_kube-state-metrics_pod>

Possible issues

Connectivity, authorization, or authentication issues when connecting to Elasticsearch:

Ensure that the API key and the Elasticsearch destination endpoint used during the installation are correct, and that the endpoint is reachable from within the Pods.

In an already installed system, the API key is stored in a Secret named elastic-agent-creds-<hash>, and the endpoint is configured in the ConfigMap elastic-agent-configs-<hash>.

Missing cluster-level metrics (provided by kube-state-metrics):

As described in Run Elastic Agent Standalone on Kubernetes, the Elastic Agent Pod acting as leader is responsible for retrieving cluster-level metrics from kube-state-metrics and delivering them to data streams prefixed as metrics-kubernetes.state_<resource>. To troubleshoot a situation where these metrics do not appear:
1. Determine which Pod owns the leadership lease in the cluster, with:
```
kubectl get lease -n kube-system elastic-agent-cluster-leader
```
2. Check the logs of that Pod to see if there are errors when connecting to kube-state-metrics and if the state_* metrics are being sent to Elasticsearch.
  
  One way to check if state_* metrics are being delivered to Elasticsearch is to inspect log lines with the "Non-zero metrics in the last 30s" message and check the values of the state_* metrics within the line, with something like:
```
kubectl logs -n kube-system elastic-agent-xxxx | grep "Non-zero metrics" | grep "state_"
```
  If the previous command returns "state_pod":{"events":213,"success":213} or similar for all state_* metrics, it means the metrics are being delivered.
3. As a last resort, if you believe none of the Pods is acting as a leader, you can try deleting the lease to generate a new one:
```
kubectl delete lease -n kube-system elastic-agent-cluster-leader
# wait a few seconds and check for the lease again
kubectl get lease -n kube-system elastic-agent-cluster-leader
```
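
To make the log check in step 2 easier to read, the state_* counters can be pulled out of the matching log lines. The following is a minimal sketch, not an official tool: the function name `state_metric_counters` is hypothetical, and it assumes that the counters appear in the log line in the `"state_<resource>":{...}` form shown above.

```shell
# Hypothetical helper: reads agent log lines on stdin and prints each
# state_* counter found in "Non-zero metrics" lines, one per line.
state_metric_counters() {
  grep 'Non-zero metrics' | grep -Eo '"state_[a-z_]+":[{][^}]*[}]'
}

# Example usage (the Pod name is a placeholder):
#   kubectl logs -n kube-system elastic-agent-xxxx | state_metric_counters
#
# To print only the identity of the current lease holder from step 1:
#   kubectl get lease -n kube-system elastic-agent-cluster-leader \
#     -o jsonpath='{.spec.holderIdentity}'
```

A state_* metric that never shows up in the output, or whose success count stays below its events count, is a reasonable place to start looking in the full logs.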
Performance problems:

Monitor the CPU and memory usage of the agent Pods and adjust the requests and limits in the manifest as needed. Refer to Scaling Elastic Agent on Kubernetes for more details about the needed resources.

Extra resources for Elastic Agent on Kubernetes troubleshooting and information:

Elastic Agent Out of Memory errors on Kubernetes.
Elastic Agent Kustomize Templates documentation and resources.
Other examples and manifests to deploy Elastic Agent on Kubernetes.

Troubleshoot Elastic Agent on Kubernetes seeing `invalid api key to authenticate with fleet` in logs

If an agent was unenrolled from a Kubernetes cluster, data might remain in /var/lib/elastic-agent-managed/kube-system/state on the nodes. Re-enrolling an agent later on the same nodes can then result in invalid api key to authenticate with fleet error messages.

To avoid these errors, delete this state folder before enrolling a new agent.

For more information, refer to issue #3586.
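
Deleting the state folder has to happen on each node where an agent was previously enrolled. The following is a minimal sketch, assuming you have root shell access to the node. The function name `remove_agent_state` is hypothetical, and the default path is the one mentioned above.

```shell
# Hypothetical helper: removes a leftover Elastic Agent state directory.
# Defaults to the path used by the managed agent on Kubernetes nodes and
# refuses to operate on the filesystem root as a basic safety check.
remove_agent_state() {
  local dir="${1:-/var/lib/elastic-agent-managed/kube-system/state}"
  if [ "$dir" = "/" ]; then
    echo "refusing to remove /" >&2
    return 1
  fi
  rm -rf -- "$dir"
}

# Example usage, run as root on each affected node before re-enrolling:
#   remove_agent_state
```

Run this only after the old agent Pods have stopped on the node, because a running agent would still be using the folder.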

Air-gapped environments

Kibana cannot connect to Elastic Package Registry in air-gapped environments

In air-gapped environments, you might encounter an error if you’re using a custom Certificate Authority (CA) that is not available to Kibana:

    {"type":"log","@timestamp":"2022-03-02T09:58:36-05:00","tags":["error","plugins","fleet"],"pid":58716,"message":"Error connecting to package registry: request to https://customer.server.name:8443/categories?experimental=true&include_policy_templates=true&kibana.version=7.17.0 failed, reason: self signed certificate in certificate chain"}

To fix this problem, add your CA certificate file path to the Kibana startup file by defining the NODE_EXTRA_CA_CERTS environment variable. For more information, refer to the TLS configuration of the Elastic Package Registry section.
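
As an illustration, the variable can be exported in the shell or startup script that launches Kibana. This is a minimal sketch, assuming a manual start from a shell. The certificate path is a placeholder, and where the variable must be defined depends on how Kibana is started in your environment (for example, a service unit or a container).

```shell
# Path to your CA certificate in PEM format (placeholder; adjust to your setup).
export NODE_EXTRA_CA_CERTS="/etc/kibana/certs/ca.crt"

# Start Kibana from the same shell so that it inherits the variable, for example:
#   ./bin/kibana
```

Node.js reads NODE_EXTRA_CA_CERTS only when the process starts, so Kibana must be restarted after the variable is set or changed.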