﻿---
title: Entity Analytics Input
description: The Entity Analytics input collects identity assets, such as users and devices, from external identity providers.
url: https://www.elastic.co/docs/reference/beats/filebeat/filebeat-input-entity-analytics
products:
  - Beats
  - Filebeat
applies_to:
  - Elastic Stack: Preview
---

# Entity Analytics Input
<warning>
  This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
</warning>

The Entity Analytics input collects identity assets, such as users and devices, from external identity providers.
The following identity providers are supported:
- [Active Directory (`activedirectory`)](#provider-activedirectory)
- [Azure Active Directory (`azure-ad`)](#provider-azure-ad)
- [Jamf Computer Management (`jamf`)](#provider-jamf)
- [Okta User Identities (`okta`)](#provider-okta)


## Configuration options

The `entity-analytics` input supports the following configuration options plus the [Common options](#filebeat-input-entity-analytics-common-options) described later.

### `provider`

The identity provider. Must be one of: `activedirectory`, `azure-ad`, `jamf` or `okta`.

## Common options

The following configuration options are supported by all inputs.

#### `enabled`

Use the `enabled` option to enable and disable inputs. By default, `enabled` is set to `true`.

#### `tags`

A list of tags that Filebeat includes in the `tags` field of each published event. Tags make it easy to select specific events in Kibana or apply conditional filtering in Logstash. These tags will be appended to the list of tags specified in the general configuration.
Example:
```yaml
filebeat.inputs:
- type: entity-analytics
  . . .
  tags: ["json"]
```


#### `fields`

Optional fields that you can specify to add additional information to the output. For example, you might add fields that you can use for filtering log data. Fields can be scalar values, arrays, dictionaries, or any nested combination of these. By default, the fields that you specify here will be grouped under a `fields` sub-dictionary in the output document. To store the custom fields as top-level fields, set the `fields_under_root` option to true. If a duplicate field is declared in the general configuration, then its value will be overwritten by the value declared here.
```yaml
filebeat.inputs:
- type: entity-analytics
  . . .
  fields:
    app_id: query_engine_12
```


#### `fields_under_root`

If this option is set to true, the custom [fields](#filebeat-input-entity-analytics-fields) are stored as top-level fields in the output document instead of being grouped under a `fields` sub-dictionary. If the custom field names conflict with other field names added by Filebeat, then the custom fields overwrite the other fields.
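
As a minimal sketch of the difference, the following configuration reuses the `app_id` field from the `fields` example and sets `fields_under_root`. With this setting, the value should appear in the output as a top-level `app_id` field rather than as `fields.app_id`.

```yaml
filebeat.inputs:
- type: entity-analytics
  . . .
  fields:
    app_id: query_engine_12
  # Store app_id at the top level of the event instead of under "fields".
  fields_under_root: true
```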

#### `processors`

A list of processors to apply to the input data.
See [Processors](https://www.elastic.co/docs/reference/beats/filebeat/filtering-enhancing-data) for information about specifying processors in your config.
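
For illustration, the sketch below attaches a single `add_fields` processor to the input. The `environment` target and the `name: staging` value are hypothetical placeholders, not defaults.

```yaml
filebeat.inputs:
- type: entity-analytics
  . . .
  processors:
    # Adds environment.name: staging to every event from this input.
    - add_fields:
        target: environment
        fields:
          name: staging
```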

#### `pipeline`

The ingest pipeline ID to set for the events generated by this input.
<note>
  The pipeline ID can also be configured in the Elasticsearch output, but this option usually results in simpler configuration files. If the pipeline is configured both in the input and output, the option from the input is used.
</note>

<important>
  The `pipeline` is always lowercased. If `pipeline: Foo-Bar`, then the pipeline name in Elasticsearch needs to be defined as `foo-bar`.
</important>
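
A minimal sketch of setting the pipeline on the input follows. The pipeline ID `entity-analytics-enrich` is a hypothetical name chosen for this example; it is written in lowercase so that it matches the name as it would need to be defined in Elasticsearch.

```yaml
filebeat.inputs:
- type: entity-analytics
  . . .
  # Hypothetical pipeline ID; it must exist in Elasticsearch under this lowercase name.
  pipeline: entity-analytics-enrich
```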


#### `keep_null`

If this option is set to true, fields with `null` values will be published in the output document. By default, `keep_null` is set to `false`.

#### `index`

If present, this formatted string overrides the index for events from this input (for Elasticsearch outputs), or sets the `raw_index` field of the event’s metadata (for other outputs). This string can only refer to the agent name and version and the event timestamp; for access to dynamic fields, use `output.elasticsearch.index` or a processor.
Example value: `"%{[agent.name]}-myindex-%{+yyyy.MM.dd}"` might expand to `"filebeat-myindex-2019.11.01"`.
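
Placed in an input definition, the example value above might look like the following sketch. The other input settings are elided.

```yaml
filebeat.inputs:
- type: entity-analytics
  . . .
  # Expands to something like "filebeat-myindex-2019.11.01".
  index: "%{[agent.name]}-myindex-%{+yyyy.MM.dd}"
```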

#### `publisher_pipeline.disable_host`

By default, all events contain `host.name`. This option can be set to `true` to disable the addition of this field to all events. The default value is `false`.
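
The documents produced by this input describe remote users and devices rather than the machine running Filebeat, so some deployments may prefer to omit `host.name`. A minimal sketch, assuming the dotted key form is used as written in the option name:

```yaml
filebeat.inputs:
- type: entity-analytics
  . . .
  # Do not add host.name to events from this input.
  publisher_pipeline.disable_host: true
```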

## Providers


## Active Directory (`activedirectory`)

The `activedirectory` provider allows the input to retrieve users and devices, with group memberships, from Active Directory.

### Setup

A user with appropriate permissions must be set up in the Active Directory Server Manager in order for the provider to function properly.

### How It Works


#### Overview

The Active Directory provider periodically:
- Queries the Active Directory server, retrieving updates for users, devices, and groups.
- Updates its internal cache of user, device, and group metadata and group membership information.
- Ships updated user and device metadata to Elasticsearch.

Fetching and shipping updates occurs in one of two processes: **full synchronizations** and **incremental updates**. Full synchronizations send the entire list of users, devices, and group memberships in state, along with write markers to indicate the start and end of the synchronization event. Incremental updates send data only for users and devices that changed during that event. A change can take several forms: user or device metadata was modified, a user or device was added, or group membership changed.

#### Sending User and Device Metadata to Elasticsearch

During a full synchronization, all users, devices, and groups stored in state will be sent to the output, while incremental updates will only send users, devices, and groups that have been updated. Full synchronizations will be bounded on either side by write marker documents, which will look something like this:
```json
{
    "@timestamp": "2022-11-04T09:57:19.786056-05:00",
    "event": {
        "action": "started",
        "start": "2022-11-04T09:57:19.786056-05:00"
    },
    "labels": {
        "identity_source": "activedirectory-1"
    }
}
```

User documents will show the current state of the user.
Example user document:
```json
{
    "@timestamp": "2024-02-05T06:37:40.876026-05:00",
    "event": {
        "action": "user-discovered"
    },
    "activedirectory": {
        "id": "CN=Guest,CN=Users,DC=testserver,DC=local",
        "user": {
            "accountExpires": "2185-07-21T23:34:33.709551516Z",
            "badPasswordTime": "0",
            "badPwdCount": "0",
            "cn": "Guest",
            "codePage": "0",
            "countryCode": "0",
            "dSCorePropagationData": [
                "2024-01-22T06:37:40Z",
                "1601-01-01T00:00:01Z"
            ],
            "description": "Built-in account for guest access to the computer/domain",
            "distinguishedName": "CN=Guest,CN=Users,DC=testserver,DC=local",
            "instanceType": "4",
            "isCriticalSystemObject": true,
            "lastLogoff": "0",
            "lastLogon": "2185-07-21T23:34:33.709551616Z",
            "logonCount": "0",
            "memberOf": "CN=Guests,CN=Builtin,DC=testserver,DC=local",
            "name": "Guest",
            "objectCategory": "CN=Person,CN=Schema,CN=Configuration,DC=testserver,DC=local",
            "objectClass": [
                "top",
                "person",
                "organizationalPerson",
                "user"
            ],
            "objectGUID": "hSt/40XJQU6cf+J2XoYMHw==",
            "objectSid": "AQUAAAAAAAUVAAAA0JU2Fq1k30YZ7UPx9QEAAA==",
            "primaryGroupID": "514",
            "pwdLastSet": "2185-07-21T23:34:33.709551616Z",
            "sAMAccountName": "Guest",
            "sAMAccountType": "805306368",
            "uSNChanged": "8197",
            "uSNCreated": "8197",
            "userAccountControl": "66082",
            "whenChanged": "2024-01-22T06:36:59Z",
            "whenCreated": "2024-01-22T06:36:59Z"
        },
        "whenChanged": "2024-01-22T06:36:59Z"
    },
    "user": {
        "id": "CN=Guest,CN=Users,DC=testserver,DC=local"
    },
    "labels": {
        "identity_source": "activedirectory-1"
    }
}
```

Device documents will show the current state of the device.
Example device document:
```json
{
    "@timestamp": "2024-02-05T06:37:40.876026-05:00",
    "event": {
        "action": "device-discovered"
    },
    "activedirectory": {
        "id": "CN=DESKTOP-ABC123,CN=Computers,DC=testserver,DC=local",
        "user": {
            "accountExpires": "2185-07-21T23:34:33.709551516Z",
            "badPasswordTime": "0",
            "badPwdCount": "0",
            "cn": "DESKTOP-ABC123",
            "codePage": "0",
            "countryCode": "0",
            "dSCorePropagationData": [
                "2024-01-22T06:37:40Z",
                "1601-01-01T00:00:01Z"
            ],
            "description": "Computer account",
            "distinguishedName": "CN=DESKTOP-ABC123,CN=Computers,DC=testserver,DC=local",
            "instanceType": "4",
            "isCriticalSystemObject": false,
            "lastLogoff": "0",
            "lastLogon": "2185-07-21T23:34:33.709551616Z",
            "logonCount": "0",
            "memberOf": "CN=Domain Computers,CN=Users,DC=testserver,DC=local",
            "name": "DESKTOP-ABC123",
            "objectCategory": "CN=Computer,CN=Schema,CN=Configuration,DC=testserver,DC=local",
            "objectClass": [
                "top",
                "person",
                "organizationalPerson",
                "user",
                "computer"
            ],
            "objectGUID": "hSt/40XJQU6cf+J2XoYMHw==",
            "objectSid": "AQUAAAAAAAUVAAAA0JU2Fq1k30YZ7UPx9QEAAA==",
            "operatingSystem": "Windows 10 Enterprise",
            "operatingSystemVersion": "10.0 (19041)",
            "primaryGroupID": "515",
            "pwdLastSet": "2185-07-21T23:34:33.709551616Z",
            "sAMAccountName": "DESKTOP-ABC123$",
            "sAMAccountType": "805306369",
            "uSNChanged": "8197",
            "uSNCreated": "8197",
            "userAccountControl": "4096",
            "whenChanged": "2024-01-22T06:36:59Z",
            "whenCreated": "2024-01-22T06:36:59Z"
        },
        "whenChanged": "2024-01-22T06:36:59Z"
    },
    "device": {
        "id": "CN=DESKTOP-ABC123,CN=Computers,DC=testserver,DC=local"
    },
    "labels": {
        "identity_source": "activedirectory-1"
    }
}
```


### Configuration

Example configuration:
```yaml
filebeat.inputs:
- type: entity-analytics
  enabled: true
  id: activedirectory-1
  provider: activedirectory
  dataset: "all"
  sync_interval: "12h"
  update_interval: "30m"
  ad_url: "ldaps://host.domain.tld"
  ad_base_dn: "CN=Users,DC=SERVER,DC=DOMAIN"
  ad_user: "USERNAME"
  ad_password: "PASSWORD"
```

The `activedirectory` provider supports the following configuration:

#### `ad_url`

The Active Directory server URL. Field is required.

#### `ad_base_dn`

The Active Directory Base Distinguished Name. Field is required.

#### `ad_user`

The client user name. Used for authentication. The user must have Active Directory read access. Field is required.

#### `user_attributes`

The set of directory attributes to request from the Active Directory server when collecting user data. If not set, all user attributes are requested. If set, only listed attributes are requested, including `distinguishedName` and `whenChanged`. Note that the Active Directory attribute names are used.

#### `group_attributes`

The set of directory attributes to request from the Active Directory server when collecting group data. If not set, all group attributes are requested. If set, only listed attributes are requested, including `distinguishedName` and `whenChanged`. Note that the Active Directory attribute names are used.
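
The sketch below restricts the requested attributes. It assumes that both options accept a YAML list of attribute names; the user attribute names are taken from the example user document above, and the group attribute selection is purely illustrative.

```yaml
filebeat.inputs:
- type: entity-analytics
  provider: activedirectory
  . . .
  # Request only these user attributes (names as used by Active Directory).
  user_attributes:
    - cn
    - sAMAccountName
    - memberOf
    - userAccountControl
  # Request only these group attributes (illustrative selection).
  group_attributes:
    - cn
    - description
```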

#### `ad_paging_size`

The number of records to request from the Active Directory server for each page, if set.

#### `ad_password`

The client’s password, used for authentication. Field is required.

#### `dataset`

The datasets to collect from Active Directory. This can be one of "all", "users" or "devices", or may be left empty for the default behavior, which is to collect all entities. When the `dataset` is set to "devices", some user entity data is collected in order to populate the registered users and registered owner fields for each device.
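
For example, a device-only collection might be configured as in the following sketch. The connection settings are elided.

```yaml
filebeat.inputs:
- type: entity-analytics
  provider: activedirectory
  . . .
  # Collect devices only; some user data is still fetched to populate
  # the registered users and registered owner fields.
  dataset: "devices"
```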

#### `sync_interval`

The interval in which full synchronizations should occur. The interval must be longer than the update interval (`update_interval`). Expressed as a duration string (e.g., 1m, 3h, 24h). Defaults to `24h` (24 hours).

#### `update_interval`

The interval in which incremental updates should occur. The interval must be shorter than the full synchronization interval (`sync_interval`). Expressed as a duration string (e.g., 1m, 3h, 24h). Defaults to `15m` (15 minutes).
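
The two intervals are constrained relative to each other. The values in this sketch are illustrative only; they satisfy the requirement that `sync_interval` is longer than `update_interval`.

```yaml
filebeat.inputs:
- type: entity-analytics
  provider: activedirectory
  . . .
  # Full synchronization every 6 hours.
  sync_interval: "6h"
  # Incremental updates every 10 minutes (must be shorter than sync_interval).
  update_interval: "10m"
```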

## Azure Active Directory (`azure-ad`)

The `azure-ad` provider allows the input to retrieve users and devices, with group memberships, from Azure Active Directory (AD).

### Setup

The necessary API permissions need to be granted in Azure in order for the provider to function properly:

| Permission           | Type        |
|----------------------|-------------|
| GroupMember.Read.All | Application |
| User.Read.All        | Application |
| Device.Read.All      | Application |

For a full guide on how to set up the necessary App Registration, permission granting, and secret configuration, follow this [guide](https://learn.microsoft.com/en-us/graph/auth-v2-service).

### How It Works


#### Overview

The Azure AD provider periodically:
- Contacts Azure Active Directory, retrieving updates for users, devices and groups.
- Updates its internal cache of user and device metadata and group membership information.
- Ships updated user and device metadata to Elasticsearch.

Fetching and shipping updates occurs in one of two processes: **full synchronizations** and **incremental updates**. Full synchronizations send the entire list of users and devices in state, along with write markers to indicate the start and end of the synchronization event. Incremental updates send data only for users and devices that changed during that event. A change can take several forms: user or device metadata was modified, a user or device was added or deleted, or group membership changed (either directly or transitively).

#### API Interactions

The provider periodically retrieves changes to user, device and group metadata from the Microsoft Graph API for Azure Active Directory. This is done through calls to three API endpoints:
- [/users/delta](https://learn.microsoft.com/en-us/graph/api/user-delta?view=graph-rest-1.0&tabs=http)
- [/devices/delta](https://learn.microsoft.com/en-us/graph/api/device-delta?view=graph-rest-1.0&tabs=http)
- [/groups/delta](https://learn.microsoft.com/en-us/graph/api/group-delta?view=graph-rest-1.0&tabs=http)

The `/delta` endpoint provides changes that have occurred since the last call, with state being tracked through a delta token. If the `/delta` endpoint is called without a delta token, it provides a full listing of users, devices or groups, similar to the non-delta endpoint. Because many results may be returned, responses are paged. Two fields may appear in the response body: `@odata.nextLink` and `@odata.deltaLink`.
- If a `@odata.nextLink` is returned, then there are more results to fetch, and the value of this field will contain the URL which should be immediately fetched.
- If a `@odata.deltaLink` is returned, then there are currently no more results, and the value of this field (a URL) should be saved for the next time updates need to be fetched (the delta token).

The group metadata will be used to enrich users and devices with group membership information. Direct memberships, along with transitive memberships, will be provided for users and devices.

#### Sending User and Device Metadata to Elasticsearch

During a full synchronization, all users and devices stored in state will be sent to the output, while incremental updates will only send users and devices that have been updated. Full synchronizations will be bounded on either side by write marker documents, which will look something like this:
```json
{
    "@timestamp": "2022-11-04T09:57:19.786056-05:00",
    "event": {
        "action": "started",
        "start": "2022-11-04T09:57:19.786056-05:00"
    },
    "labels": {
        "identity_source": "azure-1"
    }
}
```

User documents will show the current state of the user.
Example user document:
```json
{
    "@timestamp": "2022-11-04T09:57:19.786056-05:00",
    "event": {
        "action": "user-discovered"
    },
    "azure_ad": {
        "userPrincipalName": "example.user@example.com",
        "mail": "example.user@example.com",
        "displayName": "Example User",
        "givenName": "Example",
        "surname": "User",
        "jobTitle": "Software Engineer",
        "mobilePhone": "123-555-1000",
        "businessPhones": ["123-555-0122"]
    },
    "user": {
        "id": "5ebc6a0f-05b7-4f42-9c8a-682bbc75d0fc",
        "group": [
            {
                "id": "331676df-b8fd-4492-82ed-02b927f8dd80",
                "name": "group1"
            },
            {
                "id": "d140978f-d641-4f01-802f-4ecc1acf8935",
                "name": "group2"
            }
        ]
    },
    "labels": {
        "identity_source": "azure-1"
    }
}
```

Device documents will show the current state of the device.
Example device document:
```json
{
    "@timestamp": "2022-11-04T09:57:19.786056-05:00",
    "event": {
        "action": "device-discovered"
    },
    "azure_ad": {
        "accountEnabled": true,
        "deviceId": "2fbbb8f9-ff67-4a21-b867-a344d18a4198",
        "displayName": "DESKTOP-LETW452G",
        "operatingSystem": "Windows",
        "operatingSystemVersion": "10.0.19043.1337",
        "physicalIds": {
            "extensionAttributes": {
                "extensionAttribute1": "BYOD-Device"
            }
        },
        "alternativeSecurityIds": [
            {
                "type": 2,
                "identityProvider": null,
                "key": "DGFSGHSGGTH345A...35DSFH0A"
            }
        ]
    },
    "device": {
        "id": "adbbe40a-0627-4328-89f1-88cac84dbc7f",
        "group": [
            {
                "id": "331676df-b8fd-4492-82ed-02b927f8dd80",
                "name": "group1"
            }
        ],
        "registered_owners": [
            {
                "id": "5ebc6a0f-05b7-4f42-9c8a-682bbc75d0fc",
                "userPrincipalName": "example.user@example.com",
                "mail": "example.user@example.com",
                "displayName": "Example User",
                "givenName": "Example",
                "surname": "User",
                "jobTitle": "Software Engineer",
                "mobilePhone": "123-555-1000",
                "businessPhones": ["123-555-0122"]
            }
        ],
        "registered_users": [
            {
                "id": "5ebc6a0f-05b7-4f42-9c8a-682bbc75d0fc",
                "userPrincipalName": "example.user@example.com",
                "mail": "example.user@example.com",
                "displayName": "Example User",
                "givenName": "Example",
                "surname": "User",
                "jobTitle": "Software Engineer",
                "mobilePhone": "123-555-1000",
                "businessPhones": ["123-555-0122"]
            }
        ]
    },
    "labels": {
        "identity_source": "azure-1"
    }
}
```


### Configuration

Example configuration:
```yaml
filebeat.inputs:
- type: entity-analytics
  enabled: true
  id: azure-1
  provider: azure-ad
  dataset: "all"
  sync_interval: "12h"
  update_interval: "30m"
  client_id: "CLIENT_ID"
  tenant_id: "TENANT_ID"
  secret: "SECRET"
  expand: 
    users:
      manager:
        - displayName
        - id
      directReports:
        - id
```

The `azure-ad` provider supports the following configuration:

#### `tenant_id`

The Tenant ID. Field is required.

#### `client_id`

The client/application ID. Used for authentication. Field is required.

#### `secret`

The secret value, used for authentication. Field is required.

#### `dataset`

The datasets to collect from the API. This can be one of "all", "users" or "devices", or may be left empty for the default behavior which is to collect all entities. When the `dataset` is set to "devices", some user entity data is collected in order to populate the registered users and registered owner fields for each device.

#### `sync_interval`

The interval in which full synchronizations should occur. The interval must be longer than the update interval (`update_interval`). Expressed as a duration string (e.g., 1m, 3h, 24h). Defaults to `24h` (24 hours).

#### `update_interval`

The interval in which incremental updates should occur. The interval must be shorter than the full synchronization interval (`sync_interval`). Expressed as a duration string (e.g., 1m, 3h, 24h). Defaults to `15m` (15 minutes).

#### `login_endpoint`

Override the default authentication login endpoint. Only change if directed to do so. Altering this value will also require a change to `login_scopes`.

#### `login_scopes`

Override the default authentication scopes. Only change if directed to do so.

#### `select.users`

Override the default [user query selections](https://learn.microsoft.com/en-us/graph/api/user-get?view=graph-rest-1.0&tabs=http#optional-query-parameters). This is a list of optional query parameters. The default is `["accountEnabled", "userPrincipalName", "mail", "displayName", "givenName", "surname", "jobTitle", "officeLocation", "mobilePhone", "businessPhones"]`.

#### `select.groups`

Override the default [group query selections](https://learn.microsoft.com/en-us/graph/api/group-get?view=graph-rest-1.0&tabs=http#optional-query-parameters). This is a list of optional query parameters. The default is `["displayName", "members"]`.

#### `select.devices`

Override the default [device query selections](https://learn.microsoft.com/en-us/graph/api/device-get?view=graph-rest-1.0&tabs=http#optional-query-parameters). This is a list of optional query parameters. The default is `["accountEnabled", "deviceId", "displayName", "operatingSystem", "operatingSystemVersion", "physicalIds", "extensionAttributes", "alternativeSecurityIds"]`.
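
A sketch of overriding the selections follows. It assumes that `select` can be written in nested form, in the same way that `expand` is written in the example configuration above. The shortened user list is illustrative; trimming the selection may drop fields that downstream consumers expect.

```yaml
filebeat.inputs:
- type: entity-analytics
  provider: azure-ad
  . . .
  select:
    # Replaces the default user selection with a shorter list.
    users: ["accountEnabled", "userPrincipalName", "mail", "displayName"]
    # Same as the documented default for groups, shown for completeness.
    groups: ["displayName", "members"]
```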

#### `expand.users`

<applies-to>
  - Elastic Stack: Preview since 9.1
</applies-to>

Add [user query relationship expansions](https://learn.microsoft.com/en-us/graph/api/resources/user?view=graph-rest-1.0#relationships). This is a map of relationship names to attribute lists. By default this is not set. If an empty relationship list is given, the relationship expansion is the same as the users query.

#### `expand.groups`

<applies-to>
  - Elastic Stack: Preview since 9.1
</applies-to>

Add [group query relationship expansions](https://learn.microsoft.com/en-us/graph/api/resources/group?view=graph-rest-1.0#relationships). This is a map of relationship names to attribute lists. By default this is not set. If an empty relationship list is given, the relationship expansion is the same as the groups query.

#### `expand.devices`

<applies-to>
  - Elastic Stack: Preview since 9.1
</applies-to>

Add [device query relationship expansions](https://learn.microsoft.com/en-us/graph/api/resources/device?view=graph-rest-1.0#relationships). This is a map of relationship names to attribute lists. By default this is not set. If an empty relationship list is given, the relationship expansion is the same as the devices query.
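
The example configuration above shows an expansion for users. A comparable sketch for groups is given here, assuming the same map-of-lists layout; `owners` is a group relationship in Microsoft Graph, and the chosen attributes are illustrative.

```yaml
filebeat.inputs:
- type: entity-analytics
  provider: azure-ad
  . . .
  expand:
    groups:
      # Expand the owners relationship, requesting only these attributes.
      owners:
        - id
        - displayName
```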

#### `tracer.enabled`

HTTP requests and responses exchanged with the Entra ID API can be logged to the local file system for debugging configurations. This option is enabled by setting `tracer.enabled` to `true` and setting the `tracer.filename` value. Additional options are available to tune log rotation behavior. To delete existing logs, set `tracer.enabled` to `false` without unsetting the filename option.
Enabling this option compromises security and should only be used for debugging.

#### `tracer.filename`

To differentiate the trace files generated from different input instances, a placeholder `*` can be added to the filename and will be replaced with the input instance id. For example, `http-request-trace-*.ndjson`. The path must point to a target in the `azure-ad` directory in the [Filebeat logs directory](https://www.elastic.co/docs/reference/beats/filebeat/directory-layout).
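
Putting the two tracer options together, a debugging configuration might look like the following sketch. The absolute path is an assumption: it presumes the Filebeat logs directory is `/var/log/filebeat`, which depends on how Filebeat was installed, so adjust it to your layout.

```yaml
filebeat.inputs:
- type: entity-analytics
  id: azure-1
  provider: azure-ad
  . . .
  # For debugging only; traces may contain sensitive data.
  tracer.enabled: true
  # Assumed logs directory: /var/log/filebeat. The "*" is replaced with the input instance id.
  tracer.filename: "/var/log/filebeat/azure-ad/http-request-trace-*.ndjson"
```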

## Jamf Computer Management (`jamf`)

The `jamf` provider allows the input to retrieve computer records from the Jamf API.

### How It Works


#### Overview

The Jamf provider periodically:
- Contacts the Jamf API, retrieving updates for computers.
- Updates its internal cache of managed computer metadata.
- Ships updated metadata to Elasticsearch.

Fetching and shipping updates occurs in one of two processes: **full synchronizations** and **incremental updates**. Full synchronizations send the entire list of computers in state, along with write markers to indicate the start and end of the synchronization event. Incremental updates send data only for computer records that changed during that event. A change can take several forms: a record’s metadata was modified, or a record was added or deleted.

#### API Interactions

The provider periodically retrieves changes to computer metadata from the Jamf computers-preview API. This is done through calls to:
- [/api/preview/computers](https://developer.jamf.com/jamf-pro/reference/get_preview-computers)

The provider tracks updates by retaining a record of the time of the last noted update in the returned computer list. During updates, the Jamf provider uses the Jamf API’s query filtering to request only records updated at or since the provider’s recorded last update.

#### Sending Computer Metadata to Elasticsearch

During a full synchronization, all computer records stored in state will be sent to the output, while incremental updates will only send records that have been updated. Full synchronizations will be bounded on either side by write marker documents, which will look something like this:
```json
{
    "@timestamp": "2022-11-04T09:57:19.786056-05:00",
    "event": {
        "action": "started",
        "start": "2022-11-04T09:57:19.786056-05:00"
    },
    "labels": {
        "identity_source": "jamf-1"
    }
}
```

Documents will show the current state of the computer record.
Example document:
```json
{
    "device": {
        "id": "5982CE36-4526-580B-B4B9-ECC6782535BC"
    },
    "event": {
        "action": "device-discovered"
    },
    "jamf": {
        "location": {
            "username": "john.doe",
            "position": "Unknown Developer"
        },
        "site": null,
        "name": "acme-C07DM3AZQ6NV",
        "udid": "5982CE36-4526-580B-B4B9-ECC6782535BC",
        "serialNumber": "C07DM3AZQ6NV",
        "operatingSystemVersion": "14.0",
        "operatingSystemBuild": "23A344",
        "operatingSystemSupplementalBuildVersion": null,
        "operatingSystemRapidSecurityResponse": null,
        "macAddress": "64:0B:D7:AA:E4:B2",
        "assetTag": null,
        "modelIdentifier": "Macmini9,1",
        "mdmAccessRights": 0,
        "lastContactDate": "2024-04-18T14:26:51.514Z",
        "lastReportDate": "2024-06-19T15:54:37.692Z",
        "lastEnrolledDate": "2023-02-22T10:46:17.199Z",
        "ipAddress": null,
        "managementId": "1a59c510-b3a9-41cb-8afa-3d4187ac60d0",
        "isManaged": true
    },
    "labels": {
        "identity_source": "jamf-1"
    }
}
```


### Configuration

Example configuration:
```yaml
filebeat.inputs:
- type: entity-analytics
  enabled: true
  id: jamf-1
  provider: jamf
  dataset: "all"
  sync_interval: "12h"
  update_interval: "30m"
  jamf_tenant: "JAMF_TENANT"
  jamf_username: "JAMF_USERNAME"
  jamf_password: "JAMF_PASSWORD"
```

The `jamf` provider supports the following configuration:

#### `jamf_tenant`

The Jamf tenant host. Field is required.

#### `jamf_username`

The Jamf username, used for authentication. Field is required.

#### `jamf_password`

The Jamf user password, used for authentication. Field is required.

#### `page_size`

The number of computer records to collect with each API request. Defaults to [API default](https://developer.jamf.com/jamf-pro/reference/get_preview-computers).
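
As a minimal sketch, the page size could be set explicitly as shown here. The value `100` is an arbitrary illustration, not a recommended or default setting.

```yaml
filebeat.inputs:
- type: entity-analytics
  provider: jamf
  . . .
  # Request 100 computer records per API call (illustrative value).
  page_size: 100
```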

#### `sync_interval`

The interval in which full synchronizations should occur. The interval must be longer than the update interval (`update_interval`). Expressed as a duration string (e.g., 1m, 3h, 24h). Defaults to `24h` (24 hours).

#### `update_interval`

The interval in which incremental updates should occur. The interval must be shorter than the full synchronization interval (`sync_interval`). Expressed as a duration string (e.g., 1m, 3h, 24h). Defaults to `15m` (15 minutes).

#### `tracer.enabled`

HTTP requests and responses exchanged with the Jamf API can be logged to the local file system for debugging configurations. This option is enabled by setting `tracer.enabled` to `true` and setting the `tracer.filename` value. Additional options are available to tune log rotation behavior. To delete existing logs, set `tracer.enabled` to `false` without unsetting the filename option.
Enabling this option compromises security and should only be used for debugging.

#### `tracer.filename`

To differentiate the trace files generated from different input instances, a placeholder `*` can be added to the filename and will be replaced with the input instance id. For example, `http-request-trace-*.ndjson`. The path must point to a target in the `jamf` directory in the [Filebeat logs directory](https://www.elastic.co/docs/reference/beats/filebeat/directory-layout).

## Okta User Identities (`okta`)

The `okta` provider allows the input to retrieve users and devices from the Okta User and Device APIs.

### Setup

The Okta provider supports two authentication methods:

#### API Token Authentication (Traditional)

In the administration dashboard for your Okta account, navigate to Security > API and, in the Tokens tab, click the "Create token" button to create a new token. Copy the token value and retain it to configure the provider. The token will not be presented again, so it must be copied at this point. This value is given to the provider via the `okta_token` configuration field.
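
A minimal sketch of supplying the token follows. To keep the secret out of the configuration file, it reads the value from an environment variable; the variable name `OKTA_TOKEN` and the input id `okta-1` are hypothetical, and other Okta settings are elided.

```yaml
filebeat.inputs:
- type: entity-analytics
  enabled: true
  id: okta-1
  provider: okta
  . . .
  # Resolved from the OKTA_TOKEN environment variable (hypothetical name) when Filebeat starts.
  okta_token: "${OKTA_TOKEN}"
```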

#### OAuth2 Authentication (Recommended)

<applies-to>
  - Elastic Stack: Generally available since 9.2
</applies-to>

For enhanced security, the provider supports OAuth2 authentication using two methods:

##### JWT-Based Authentication

This method uses a private key to sign JWTs for authentication:
1. Create an OAuth2 application in your Okta admin console
2. Configure the application with the required scopes:
   - `okta.users.read`: Read user information
   - `okta.devices.read`: Read device information (if device collection is enabled)
3. Generate a private key (RSA) for the application
4. Register the corresponding public key with Okta
5. Configure the provider with the private key


##### Client Secret Authentication

This method uses a client secret for authentication:
1. Create an OAuth2 application in your Okta admin console
2. Configure the application with the required scopes:
   - `okta.users.read`: Read user information
   - `okta.devices.read`: Read device information (if device collection is enabled)
3. Note the client secret from the application configuration
4. Configure the provider with the client secret

This authentication method can also be used for OIN (Okta Integration Network) applications, where the client secret is provided as part of the OIN integration setup.

The necessary API permissions need to be granted in Okta in order for the provider to function properly. Devices API access needs to be activated by Okta support.

### How It Works


#### Overview

The Okta provider periodically:
- Contacts the Okta API, retrieving updates for users and devices.
- Updates its internal cache of user metadata.
- Ships updated user/device metadata to Elasticsearch.

Fetching and shipping updates occurs in one of two processes: **full synchronizations** and **incremental updates**. Full synchronizations will send the entire list of users and devices in state, along with write markers to indicate the start and end of the synchronization event. Incremental updates will only send data for changed users and devices during that event. Changes to a user or device can take many forms, such as a change to the user’s metadata, or the addition or deletion of a user.

#### API Interactions

The provider periodically retrieves changes to user/device metadata from the Okta User and Device APIs. This is done through calls to:
- [/api/v1/users](https://developer.okta.com/docs/reference/api/users/#list-users)
- [/api/v1/devices](https://developer.okta.com/docs/api/openapi/okta-management/management/tag/Device/#tag/Device/operation/listDevices)

Updates are tracked by the provider by retaining a record of the time of the last noted update in the returned user list. During provider updates the Okta provider makes use of the Okta API’s query filtering to only request records updated at or since the provider’s recorded last update.

#### Sending User Metadata to Elasticsearch

During a full synchronization, all users/devices stored in state will be sent to the output, while incremental updates will only send users and devices that have been updated. Full synchronizations will be bounded on either side by write marker documents, which will look something like this:
```json
{
    "@timestamp": "2022-11-04T09:57:19.786056-05:00",
    "event": {
        "action": "started",
        "start": "2022-11-04T09:57:19.786056-05:00"
    },
    "labels": {
        "identity_source": "okta-1"
    }
}
```

User documents will show the current state of the user.
Example user document:
```json
{
    "@timestamp": "2023-07-04T09:57:19.786056-05:00",
    "event": {
        "action": "user-discovered"
    },
    "okta": {
        "id": "userid",
        "status": "RECOVERY",
        "created": "2023-06-02T09:33:00.189752+09:30",
        "activated": "0001-01-01T00:00:00Z",
        "statusChanged": "2023-06-02T09:33:00.189752+09:30",
        "lastLogin": "2023-06-02T09:33:00.189752+09:30",
        "lastUpdated": "2023-06-02T09:33:00.189753+09:30",
        "passwordChanged": "2023-06-02T09:33:00.189753+09:30",
        "type": {
            "id": "typeid"
        },
        "profile": {
            "login": "name.surname@example.com",
            "email": "name.surname@example.com",
            "firstName": "name",
            "lastName": "surname"
        },
        "credentials": {
            "password": {},
            "provider": {
                "type": "OKTA",
                "name": "OKTA"
            }
        },
        "_links": {
            "self": {
                "href": "https://localhost/api/v1/users/userid"
            }
        }
    },
    "user": {
        "id": "userid"
    },
    "labels": {
        "identity_source": "okta-1"
    }
}
```

Device documents will show the current state of the device, including any associated users.
Example device document:
```json
{
    "@timestamp": "2023-07-04T09:57:19.786056-05:00",
    "event": {
        "action": "device-discovered"
    },
    "okta": {
        "created": "2019-10-02T18:03:07Z",
        "id": "deviceid",
        "lastUpdated": "2019-10-02T18:03:07Z",
        "profile": {
            "diskEncryptionType": "ALL_INTERNAL_VOLUMES",
            "displayName": "Example Device name 1",
            "platform": "WINDOWS",
            "registered": true,
            "secureHardwarePresent": false,
            "serialNumber": "XXDDRFCFRGF3M8MD6D",
            "sid": "S-1-11-111"
        },
        "resourceAlternateID": "",
        "resourceDisplayName": {
            "sensitive": false,
            "value": "Example Device name 1"
        },
        "resourceID": "deviceid",
        "resourceType": "UDDevice",
        "status": "ACTIVE",
        "_links": {
            "activate": {
                "hints": {
                    "allow": [
                        "POST"
                    ]
                },
                "href": "https://localhost/api/v1/devices/deviceid/lifecycle/activate"
            },
            "self": {
                "hints": {
                    "allow": [
                        "GET",
                        "PATCH",
                        "PUT"
                    ]
                },
                "href": "https://localhost/api/v1/devices/deviceid"
            },
            "users": {
                "hints": {
                    "allow": [
                        "GET"
                    ]
                },
                "href": "https://localhost/api/v1/devices/deviceid/users"
            }
        },
        "users": [
            {
                "id": "userid",
                "status": "RECOVERY",
                "created": "2023-05-14T13:37:20Z",
                "activated": "0001-01-01T00:00:00Z",
                "statusChanged": "2023-05-15T01:50:30Z",
                "lastLogin": "2023-05-15T01:59:20Z",
                "lastUpdated": "2023-05-15T01:50:32Z",
                "passwordChanged": "2023-05-15T01:50:32Z",
                "type": {
                    "id": "typeid"
                },
                "profile": {
                    "login": "name.surname@example.com",
                    "email": "name.surname@example.com",
                    "firstName": "name",
                    "lastName": "surname"
                },
                "credentials": {
                    "password": {},
                    "provider": {
                        "type": "OKTA",
                        "name": "OKTA"
                    }
                },
                "_links": {
                    "self": {
                        "href": "https://localhost/api/v1/users/userid"
                    }
                }
            }
        ]
    },
    "device": {
        "id": "deviceid"
    },
    "labels": {
        "identity_source": "okta-1"
    }
}
```


### Configuration

Example configuration with API token authentication:
```yaml
filebeat.inputs:
- type: entity-analytics
  enabled: true
  id: okta-1
  provider: okta
  dataset: "all"
  enrich_with: ["groups", "roles"]
  sync_interval: "12h"
  update_interval: "30m"
  okta_domain: "your-domain.okta.com"
  okta_token: "your-okta-token"
```

Example configuration with OAuth2 JWT-based authentication:
```yaml
filebeat.inputs:
- type: entity-analytics
  enabled: true
  id: okta-1
  provider: okta
  dataset: "all"
  enrich_with: ["groups", "roles"]
  sync_interval: "12h"
  update_interval: "30m"
  okta_domain: "your-domain.okta.com"
  oauth2:
    enabled: true
    client.id: "your-client-id"
    scopes: ["okta.users.read", "okta.devices.read"]
    token_url: "https://your-domain.okta.com/oauth2/v1/token"
    jwk_file: "/path/to/private-key.jwk"
```

Example configuration with OAuth2 client secret authentication:
```yaml
filebeat.inputs:
- type: entity-analytics
  enabled: true
  id: okta-1
  provider: okta
  dataset: "all"
  enrich_with: ["groups", "roles"]
  sync_interval: "12h"
  update_interval: "30m"
  okta_domain: "your-domain.okta.com"
  oauth2:
    enabled: true
    client.id: "your-client-id"
    client.secret: "your-client-secret"
    scopes: ["okta.users.read", "okta.devices.read"]
    token_url: "https://your-domain.okta.com/oauth2/v1/token"
```

The `okta` provider supports the following configuration:

#### `okta_domain`

The Okta domain. Field is required.

#### `okta_token`

The Okta secret token, used for authentication. Field is required when using API token authentication.

#### `oauth2`

<applies-to>
  - Elastic Stack: Generally available since 9.2
</applies-to>

OAuth2 configuration for enhanced security authentication. When configured, OAuth2 authentication takes precedence over API token authentication.

##### `oauth2.enabled`

<applies-to>
  - Elastic Stack: Generally available since 9.2
</applies-to>

Enable OAuth2 authentication. Defaults to true if the oauth2 block is present.

##### `oauth2.client.id`

<applies-to>
  - Elastic Stack: Generally available since 9.2
</applies-to>

The OAuth2 client ID from your Okta application.

##### `oauth2.client.secret`

<applies-to>
  - Elastic Stack: Generally available since 9.2
</applies-to>

The OAuth2 client secret from your Okta application.

##### `oauth2.scopes`

<applies-to>
  - Elastic Stack: Generally available since 9.2
</applies-to>

List of OAuth2 scopes required for the application. Common scopes include:
- `okta.users.read`: Read user information
- `okta.devices.read`: Read device information (if device collection is enabled through the `dataset` option)


##### `oauth2.token_url`

<applies-to>
  - Elastic Stack: Generally available since 9.2
</applies-to>

The OAuth2 token endpoint URL. Typically `https://your-domain.okta.com/oauth2/v1/token`.

##### `oauth2.jwk_file`

<applies-to>
  - Elastic Stack: Generally available since 9.2
</applies-to>

Path to the JWK file containing the private key.

##### `oauth2.jwk_json`

<applies-to>
  - Elastic Stack: Generally available since 9.2
</applies-to>

JWK JSON content containing the private key.

##### `oauth2.jwk_pem`

<applies-to>
  - Elastic Stack: Generally available since 9.2
</applies-to>

PEM-formatted private key content.
<note>
  For JWT authentication, exactly one of `oauth2.jwk_file`, `oauth2.jwk_json`, or `oauth2.jwk_pem` must be provided. For client secret authentication, `oauth2.client.secret` must be provided. The authentication method is automatically determined based on which credentials are provided.
</note>
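
As a hedged illustration of the JWT-based variant, the sketch below supplies the private key inline through `oauth2.jwk_pem` rather than through `oauth2.jwk_file`. The client ID, domain, and key body are placeholders, and carrying the PEM text in a YAML block scalar is an assumption about quoting, not something this reference prescribes.

```yaml
filebeat.inputs:
- type: entity-analytics
  enabled: true
  id: okta-1
  provider: okta
  okta_domain: "your-domain.okta.com"
  oauth2:
    enabled: true
    client.id: "your-client-id"
    scopes: ["okta.users.read", "okta.devices.read"]
    token_url: "https://your-domain.okta.com/oauth2/v1/token"
    # Placeholder key material. Set only one of jwk_file, jwk_json, or jwk_pem.
    jwk_pem: |
      -----BEGIN PRIVATE KEY-----
      ...
      -----END PRIVATE KEY-----
```

Because no `oauth2.client.secret` is set here, the provider would be expected to select JWT authentication based on the credentials present.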


#### `collect_device_details`

Whether the input should collect device and device-associated user details from the Okta API. Device details must be activated on the Okta account for this option.

#### `dataset`

The datasets to collect from the API. This can be one of "all", "users" or "devices", or may be left empty for the default behavior which is to collect all entities. When the `dataset` is set to "devices", some user entity data is collected in order to populate the registered users and registered owner fields for each device.
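
A minimal sketch of a devices-only configuration, assuming API token authentication. The `id`, domain, and token values are placeholders.

```yaml
filebeat.inputs:
- type: entity-analytics
  enabled: true
  id: okta-devices          # hypothetical instance id
  provider: okta
  okta_domain: "your-domain.okta.com"
  okta_token: "your-okta-token"
  # Collect device entities only. Some user data is still fetched to
  # populate the registered users and registered owner fields.
  dataset: "devices"
```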

#### `enrich_with`

The metadata to enrich users with. This is an array of values that may contain "groups", "roles" and "factors", or "none". If the array only contains "none", no metadata is collected for users. The default behavior is to collect "groups".
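
For illustration, the following sketch disables user enrichment entirely by setting the array to contain only "none". The `id`, domain, and token values are placeholders.

```yaml
filebeat.inputs:
- type: entity-analytics
  enabled: true
  id: okta-1
  provider: okta
  okta_domain: "your-domain.okta.com"
  okta_token: "your-okta-token"
  dataset: "users"
  # Only "none" is listed, so no additional metadata is collected for users.
  # To collect everything described above, use ["groups", "roles", "factors"].
  enrich_with: ["none"]
```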

#### `sync_interval`

The interval in which full synchronizations should occur. The interval must be longer than the update interval (`update_interval`). Expressed as a duration string (e.g., 1m, 3h, 24h). Defaults to `24h` (24 hours).

#### `update_interval`

The interval in which incremental updates should occur. The interval must be shorter than the full synchronization interval (`sync_interval`). Expressed as a duration string (e.g., 1m, 3h, 24h). Defaults to `15m` (15 minutes).

#### `limit_window`

The time between Okta API rate limit resets. Expressed as a duration string (e.g., 1m, 3h, 24h). Defaults to `1m` (1 minute).

#### `batch_size`

<applies-to>
  - Elastic Stack: Preview since 9.0
</applies-to>

The pagination batch size for requests. If it is zero or negative, the API default is used. The default is 200.

#### `limit_fixed`

The number of requests to allow in each limit window, if set. This parameter should only be set in exceptional cases. When it is set, rate limit information in API responses will be ignored in favor of the fixed limit. The limit is applied separately to each endpoint. Defaults to unset.
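
A hedged sketch of how the rate limit options might be combined in the exceptional case where a fixed limit is wanted. The value `50` is purely illustrative and is not a recommended setting. The `id`, domain, and token values are placeholders.

```yaml
filebeat.inputs:
- type: entity-analytics
  enabled: true
  id: okta-1
  provider: okta
  okta_domain: "your-domain.okta.com"
  okta_token: "your-okta-token"
  # Time between Okta API rate limit resets (the documented default).
  limit_window: "1m"
  # Illustrative value only: allow at most 50 requests per window, per endpoint.
  # Rate limit information in API responses is then ignored.
  limit_fixed: 50
```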

#### `tracer.enabled`

It is possible to log HTTP requests and responses to the Okta API to a local file system for debugging configurations. This option is enabled by setting `tracer.enabled` to true and setting the `tracer.filename` value. Additional options are available to tune log rotation behavior. To delete existing logs, set `tracer.enabled` to false without unsetting the filename option.
Enabling this option compromises security and should only be used for debugging.

#### `tracer.filename`

To differentiate the trace files generated from different input instances, a placeholder `*` can be added to the filename and will be replaced with the input instance id. For example, `http-request-trace-*.ndjson`. The path must point to a target in the okta directory in the [Filebeat logs directory](https://www.elastic.co/docs/reference/beats/filebeat/directory-layout).

#### `tracer.maxsize`

This value sets the maximum size, in megabytes, the log file will reach before it is rotated. By default logs are allowed to reach 1MB before rotation. Individual request/response bodies will be truncated to 10% of this size.
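
The tracer options can be sketched together as follows. The path assumes the Filebeat logs directory is `/var/log/filebeat`, which depends on the installation layout, so adjust it to your own logs directory. The `5` MB size is an illustrative value.

```yaml
filebeat.inputs:
- type: entity-analytics
  enabled: true
  id: okta-1
  provider: okta
  okta_domain: "your-domain.okta.com"
  okta_token: "your-okta-token"
  # Debugging only: trace files can contain sensitive request and response data.
  tracer.enabled: true
  # Assumed logs directory; the target must be inside its okta subdirectory.
  # The * is replaced with the input instance id.
  tracer.filename: "/var/log/filebeat/okta/http-request-trace-*.ndjson"
  # Rotate the trace file once it reaches 5 MB (illustrative value).
  tracer.maxsize: 5
```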

### Metrics

This input exposes metrics under the [HTTP monitoring endpoint](https://www.elastic.co/docs/reference/beats/filebeat/http-endpoint). These metrics are exposed under the `/inputs` path. They can be used to observe the activity of the input.

| Metric                   | Description                                                                                                        |
|--------------------------|--------------------------------------------------------------------------------------------------------------------|
| `sync_total`             | The total number of full synchronizations.                                                                         |
| `sync_error`             | The number of full synchronizations that failed due to an error.                                                   |
| `sync_processing_time`   | Histogram of the elapsed full synchronization times in nanoseconds (time of API contact to items sent to output).  |
| `update_total`           | The total number of incremental updates.                                                                           |
| `update_error`           | The number of incremental updates that failed due to an error.                                                     |
| `update_processing_time` | Histogram of the elapsed incremental update times in nanoseconds (time of API contact to items sent to output).    |
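
To read these metrics, the HTTP monitoring endpoint has to be enabled in the Filebeat configuration. The following is a minimal sketch, assuming the endpoint is bound to the local host on port 5066. Adjust the host and port to your environment.

```yaml
# Top-level Filebeat settings, not part of the input definition.
http.enabled: true
http.host: localhost
http.port: 5066
```

With these settings, the input metrics would be available under the `/inputs` path of the endpoint, for example at `http://localhost:5066/inputs/`.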

<note>
  This input is experimental and is under active development. Configuration options and behaviors may change without warning. Use with caution and do not use in production environments.
</note>