8.4.0 release notes

Upgrading to Enterprise Search 8.4.0? See Upgrading and migrating.

For our Docker image, the 8.4.0 release includes a JDK update that pertains to a potential security vulnerability. Please see our security statement for more details.

Breaking changes

  • The App Search web crawler now uses an ingest pipeline named app_search_crawler:

    • This replaces the pipeline named ent_search_crawler in 8.3.
    • As of 8.4.0 the ent_search_crawler pipeline is associated with the new Elastic web crawler.
    • If you modified the ent_search_crawler pipeline before 8.4.0, you can create a new Elastic web crawler and keep those changes in the ent_search_crawler pipeline. To continue using the App Search crawler, add the same changes to the app_search_crawler pipeline. See the configuration documentation.
  • The App Search Elasticsearch search API (beta) is now more consistent with the Elasticsearch search API. The API now:

    • Supports sending analytics in request headers.
    • Supports passing query parameters through the URL instead of the request body.
    • Moves the request body to the query JSON attribute rather than wrapping it in request.
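
For the crawler pipeline change above, one way to carry earlier customizations into app_search_crawler is to append them to that pipeline's processor list and send the result back with the Elasticsearch ingest pipeline API. The following is a minimal sketch, not the documented migration procedure: the helper name with_custom_processors, the example set processor, and the placeholder pipeline body are assumptions for illustration, and the real app_search_crawler definition should be fetched from your own deployment.

```python
import copy
import json


def with_custom_processors(pipeline_body, custom_processors):
    """Return a copy of pipeline_body with custom_processors appended.

    Processors already present in the pipeline are skipped, so running
    the function twice does not duplicate them.
    """
    merged = copy.deepcopy(pipeline_body)
    processors = merged.setdefault("processors", [])
    for processor in custom_processors:
        if processor not in processors:
            processors.append(copy.deepcopy(processor))
    return merged


# Hypothetical customization that was added to ent_search_crawler before 8.4.0.
custom = [{"set": {"field": "team", "value": "docs"}}]

# Placeholder standing in for the body returned by
# GET _ingest/pipeline/app_search_crawler (real processors not reproduced here).
app_search_crawler = {"description": "placeholder", "processors": []}

updated = with_custom_processors(app_search_crawler, custom)

# The printed JSON is the body you would send with
# PUT _ingest/pipeline/app_search_crawler.
print(json.dumps(updated, indent=2))
```

The helper only appends, so any processor that must run before the existing ones needs to be placed by hand.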

New features

  • Use the new Elastic web crawler (general availability) to transform webpage content, including binary files like PDFs, into searchable content:

    • The web crawler is Elasticsearch native. It writes directly to Enterprise Search managed search-* prefixed Elasticsearch indices, which can be used to create App Search engines.
    • The web crawler sends both HTML and binary content through the ingest pipeline named ent_search_crawler.
  • The Elasticsearch _search endpoint for App Search is now a beta feature, enabling additional Elasticsearch functions with App Search:

    • Vector querying functionality, specifically kNN vector similarity, is now available from the elasticsearch/_search endpoint.
    • A new hybrid retrieval method combines vector similarity with query scoring, so you can integrate vector search with your existing Elasticsearch scoring functions.
    • This endpoint supports using Elasticsearch credentials: Basic access authentication, JSON Web Tokens (JWT), and Elasticsearch bearer tokens.
  • Build a connector for your data source of choice using the connector framework (tech preview):

    • Connectors index content into search-optimized Elasticsearch indices, which can be used to create App Search engines.
    • Use the framework in this GitHub repo to customize example Ruby connectors, including MongoDB and GitLab. Users can also build their own custom connectors using their language of choice.
  • Use the improved content management overview to manage integrations and search-optimized indices. Elasticsearch indices created from the Enterprise Search content management area can be used to create App Search engines.
  • App Search support for Elasticsearch index-backed engines is now a beta feature, with the addition of several key features:

    • Object fields are supported in Elasticsearch indices.
    • Nested fields are partially supported in Elasticsearch indices. Support includes filtering and usage in result fields.
    • Meta engines can now combine multiple engines based on Elasticsearch indices.
    • Elasticsearch index-based engines that are compatible with App Search text subfield conventions now support precision tuning.
  • App Search web crawler support for binary content extraction is now generally available. Crawl binary content such as PDF and Office documents. Note that only binary content (not HTML) is sent into the app_search_crawler pipeline, whereas the Elastic web crawler sends all indexed documents into the ent_search_crawler pipeline. See the breaking changes section if you have modified the App Search web crawler ingest pipeline settings.
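
To make the kNN and hybrid retrieval items above more concrete, here is a rough sketch of an Elasticsearch search body that pairs a match query with a knn clause. The knn keys (field, query_vector, k, num_candidates) follow the Elasticsearch kNN search option. The helper name hybrid_search_body and the field names title and title_vector are hypothetical. How this body is sent to the App Search elasticsearch/_search endpoint is not shown, so check the API reference for the exact request format.

```python
import json


def hybrid_search_body(text_field, text, vector_field, query_vector,
                       k=10, num_candidates=100):
    """Build a search body that combines a match query with kNN search."""
    # Elasticsearch rejects requests where num_candidates is smaller than k.
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    return {
        # Lexical part: scored by the usual query scoring.
        "query": {"match": {text_field: text}},
        # Vector part: approximate nearest neighbours on a dense_vector field.
        "knn": {
            "field": vector_field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        },
        "size": k,
    }


# Hypothetical field names and a toy three-dimensional vector.
body = hybrid_search_body("title", "mountain hikes", "title_vector",
                          [0.1, 0.2, 0.3], k=5, num_candidates=50)
print(json.dumps(body, indent=2))
```

The query vector must have the same number of dimensions as the indexed vector field, and in practice it comes from the same embedding model used at index time.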

Bug fixes

  • Made Workplace Search’s Zendesk connector handle x-rate-limit headers in a case-insensitive manner, to address inconsistencies in the Zendesk API’s response headers.

Known issues

  • Both the Elastic web crawler and the App Search web crawler can incorrectly prepend the page title to the crawled body_content field, even when the title is specified only in the page header and not in the page body. This issue, caused by an outstanding bug, can happen when a <noscript/> element is present in the <head> section of the page.

    Consider post-processing, for example by customizing the crawler’s ingest pipeline, to remove the superfluous title from the body_content field before presenting the crawled results to end users.

  • The Workplace Search GitHub connector misclassifies rate limiting as fatal unknown exceptions, causing syncs to fail instead of backing off. This is caused by a backwards-incompatible change in the GitHub API following the introduction of "secondary rate limits".
  • By default the enterprise_search user does not have permissions to manage ingestion indices. Add read and manage index-level privileges to the enterprise_search user account or create a new account with the correct privileges.
  • The Elastic web crawler uses an incorrect Elasticsearch data stream for logging: logs-crawler-default. Deployments using both the Elastic web crawler and the App Search web crawler will use the same data stream for logging.
  • The Elastic web crawler does not currently respect a configured crawl schedule. Crawls must be manually triggered in the UI.
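
For the body_content known issue above, the suggested post-processing can also be done in application code before results are shown to users. The following is a minimal sketch: the function name strip_leading_title is hypothetical, and it assumes that the document's title is available alongside body_content and that the prepended title is followed by whitespace or the end of the text.

```python
def strip_leading_title(title, body_content):
    """Remove a page title that was prepended to body_content.

    The text is returned unchanged when it does not start with the title,
    or when the title is only the prefix of a longer word.
    """
    title = (title or "").strip()
    if not title:
        return body_content
    body = body_content.lstrip()
    if not body.startswith(title):
        return body_content
    rest = body[len(title):]
    # Strip only when the title ends on a word boundary.
    if rest == "" or rest[0].isspace():
        return rest.lstrip()
    return body_content


print(strip_leading_title("Pricing", "Pricing Plans start with a free tier."))
```

A page whose body genuinely begins with its own title loses that first occurrence too, so this suits display cleanup better than rewriting indexed content.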