Automatic Import
Automatic Import parses, ingests, and maps data to ECS for sources that don’t yet have prebuilt Elastic integrations. It works with Elastic Security, Observability, and other solutions that rely on Elastic Agent and integrations. This lets you onboard custom or niche data sources without building a full integration manually.
Automatic Import uses a large language model (LLM) with specialized instructions to analyze source data and generate a custom integration.
Elastic integrations, including those created by Automatic Import, normalize data to the Elastic Common Schema (ECS). This standardization provides consistent use across dashboards, search, alerts, and machine learning features.
Refer to prebuilt data integrations for a full list of Elastic’s 400+ integrations.
Try an interactive demo of Automatic Import to preview the feature before you set it up in your environment.
Requirements
- A working LLM connector.
- Elastic Stack users: An Enterprise subscription.
- Elastic Security Serverless projects: the Security Analytics Complete feature tier.
- Elastic Observability Serverless projects: the Observability Complete feature tier.
- A sample of the data you want to import.
To use Automatic Import, you must provide a sample of the data you want to import. An LLM processes the sample and creates an integration suited to that data. Automatic Import supports the following sample formats: JSON, NDJSON, CSV, and syslog (structured and unstructured).
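If your source exports events as a single JSON array, it can help to convert them to NDJSON (one JSON object per line) before upload. The sketch below is a minimal, hypothetical preprocessing step, not part of Automatic Import itself:

```python
import json

def to_ndjson(json_array_text: str) -> str:
    """Convert a JSON array of events into NDJSON: one object per line."""
    events = json.loads(json_array_text)
    return "\n".join(json.dumps(event) for event in events)

# Two events exported as a JSON array become two NDJSON lines.
sample = '[{"user": "alice", "action": "login"}, {"user": "bob", "action": "logout"}]'
print(to_ndjson(sample))
```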
For API-based collection, Automatic Import can generate a program in Common Expression Language (CEL). For background, refer to the CEL specification and the CEL input in Filebeat.
- You can upload a sample of any size. The LLM detects its format and selects up to 100 documents for detailed analysis.
- The more variety in your sample, the more accurate the pipeline is. For best results, include a wide range of unique log entries in your sample instead of repeating similar logs.
- When you upload a CSV, a header with column names is automatically recognized. If the header is not present, the LLM attempts to create descriptive field names based on field formats and values.
- For JSON and NDJSON samples, each object in your sample should represent an event. Avoid deeply nested object structures.
- When you select API (CEL input) as one of the sources, you’re prompted to provide the associated OpenAPI specification (OAS) file to generate a CEL program that consumes this API.
CEL generation in Automatic Import is in beta and is subject to change. The design and code are less mature than official GA features and are provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.
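The guidance above to avoid deeply nested objects can be handled with a small preprocessing step before you upload JSON or NDJSON samples. The `flatten_event` helper below is a hypothetical sketch, not part of Automatic Import:

```python
import json

def flatten_event(event: dict, parent_key: str = "", sep: str = ".") -> dict:
    """Flatten nested objects into dotted keys,
    e.g. {"source": {"ip": ...}} -> {"source.ip": ...}."""
    flat = {}
    for key, value in event.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten_event(value, new_key, sep=sep))
        else:
            flat[new_key] = value
    return flat

nested = {"source": {"ip": "10.0.0.1", "port": 514}, "message": "accepted"}
print(json.dumps(flatten_event(nested)))
```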
You can use Automatic Import with any LLM, but model performance varies. Performance for Automatic Import closely tracks performance for Attack Discovery: models that do well on one do well on the other. Refer to the large language model performance matrix for Elastic Security. For Observability workloads, refer to the LLM performance matrix for Observability.
Using Automatic Import allows users to create new third-party data integrations through the use of third-party generative AI models (“GAI models”). Any third-party GAI models that you choose to use are owned and operated by their respective providers. Elastic does not own or control these third-party GAI models, nor does it influence their design, training, or data-handling practices. Using third-party GAI models with Elastic solutions, and using your data with third-party GAI models is at your discretion. Elastic bears no responsibility or liability for the content, operation, or use of these third-party GAI models, nor for any potential loss or damage arising from their use. Users are advised to exercise caution when using GAI models with personal, sensitive, or confidential information, as data submitted can be used to train the models or for other purposes. Elastic recommends familiarizing yourself with the development practices and terms of use of any third-party GAI models before use. You are responsible for ensuring that your use of Automatic Import complies with the terms and conditions of any third-party platform you connect with.
In Kibana, open Integrations. You can use the main menu, the global search field, or your solution’s entry point (for example, Add integrations in Elastic Security, or Add data in Observability).
Under Can’t find an integration? click Create new integration.
Click Create integration.
Select an LLM connector.
Define how your new integration will appear on the Integrations page by providing a Title, Description, and Logo. Click Next.
Define your integration’s package name, which will prefix the imported event fields.
Define your Data stream title, Data stream description, and Data stream name. These fields appear on the integration’s configuration page to help identify the data stream it writes to.
Select your Data collection method. This determines how your new integration ingests the data (for example, from an S3 bucket, an HTTP endpoint, or a file stream).
Note: If you select API (CEL input) (Common Expression Language via the CEL input in Filebeat), you have the additional option to upload the API’s OAS file here. After you do, the LLM uses it to determine which API endpoints (GET only), query parameters, and data structures to use in the new custom integration. You then select which API endpoints to consume and your authentication method before uploading your sample data.
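For orientation, a generated CEL program for a simple GET endpoint typically follows a shape like the sketch below. This is a rough illustration based on the Filebeat CEL input’s conventions, not actual Automatic Import output; helper names such as `decode_json` come from Filebeat’s CEL extensions, and real generated programs also handle pagination, authentication, and state:

```
bytes(get(state.url).Body).as(body, {
  "events": [body.decode_json()]
})
```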
Upload a sample of your data. Make sure to include all the types of events that you want the new integration to handle.
Click Analyze logs, then wait for processing to complete. This may take several minutes.
After processing is complete, the pipeline’s field mappings appear, including ECS and custom fields.
(Optional) After reviewing the proposed pipeline, you can fine-tune it by clicking Edit pipeline. Refer to the Elastic Security ECS reference to learn more about formatting field mappings. When you’re satisfied with your changes, click Save.
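When fine-tuning field mappings, it can help to preview what a pipeline does to a document using Elasticsearch’s `_ingest/pipeline/_simulate` API. The sketch below builds such a request body in Python; the rename processor and field names are illustrative assumptions, not output of Automatic Import:

```python
import json

# A minimal _simulate request body: a pipeline with one rename processor
# mapping a hypothetical raw field onto its ECS equivalent, plus one test doc.
simulate_body = {
    "pipeline": {
        "processors": [
            {"rename": {"field": "src_ip", "target_field": "source.ip", "ignore_missing": True}}
        ]
    },
    "docs": [
        {"_source": {"src_ip": "10.0.0.1", "message": "accepted"}}
    ],
}

# POST this JSON to <cluster>/_ingest/pipeline/_simulate to preview the result.
print(json.dumps(simulate_body, indent=2))
```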
Note: If your new integration collects data from an API, you can update the CEL input configuration (program and API authentication information) from the new integration’s integration policy.
Click Add to Elastic. After the Success message appears, your new integration will be available on the Integrations page.
Click Add to an agent to deploy your new integration and start collecting data, or click View integration to view detailed information about your new integration.
(Optional) Once you’ve added an integration, you can edit the ingest pipeline by going to the Ingest Pipelines page using the navigation menu or the global search field.
If you use Elastic Security, you can use the Data Quality dashboard to check the health of your data ingest pipelines and field mappings.