ES|QL in JavaScript: Leveraging Apache Arrow helpers

Learn how to use ES|QL with JavaScript Apache Arrow client helpers to analyze large data sets efficiently.


Elasticsearch Query Language (ES|QL) is a piped query language that lets users chain operations together step by step. It is optimized for data analysis and runs on a new architecture designed to analyze large data volumes with high efficiency.

You can learn more about ES|QL in this article and the documentation.

ES|QL queries can return their response in different formats, such as JSON, CSV, TSV, YAML, Arrow, and binary. Starting in Elasticsearch 8.16, the Node.js client includes helpers to handle some of these formats.

This article covers the newest helpers, toArrowReader and toArrowTable, which add Apache Arrow support to the Elasticsearch Node.js client. For more on helpers, check out this article.

What is Apache Arrow?

Apache Arrow is a columnar, in-memory data format and toolkit for data analysis. The format is language-agnostic, so the same data can be shared across programming languages and environments.

One of the primary benefits of the Arrow format is that its binary, columnar format is optimized for very fast reads, enabling high-performance analytics calculations.
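To make the columnar idea concrete, here is a small plain-JavaScript sketch. It does not use the Arrow library, and the field names and values are made up. It only illustrates the layout difference: a row-oriented structure keeps one object per record, while a column-oriented structure keeps one contiguous array per field, so an aggregation over a single field only reads that field's values.

```javascript
// Illustration only: this is not Arrow's implementation, just the layout idea.
// The field names and values below are made up for the example.

// Row-oriented: one object per record.
const rows = [
  { extension: 'css', bytes: 1200 },
  { extension: 'zip', bytes: 5400 },
  { extension: 'css', bytes: 800 }
]

// Column-oriented: one array per field, with numbers in a contiguous typed array.
const columns = {
  extension: ['css', 'zip', 'css'],
  bytes: Float64Array.from([1200, 5400, 800])
}

// Summing a field over rows has to visit every record object.
function sumBytesFromRows (records) {
  let total = 0
  for (const record of records) total += record.bytes
  return total
}

// Summing a column reads one contiguous buffer and nothing else.
function sumBytesFromColumns (cols) {
  let total = 0
  for (const value of cols.bytes) total += value
  return total
}

console.log(sumBytesFromRows(rows), sumBytesFromColumns(columns))
```

Both functions return the same total; the difference is in how much unrelated data each one has to walk through to get there.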

Read more about how to leverage Arrow with ES|QL in this article.

ES|QL Apache Arrow helpers

For the examples, we are going to use Elastic’s Web logs sample dataset. You can ingest it by following this documentation.

Elasticsearch client

Set up the Elasticsearch client by specifying your Elasticsearch endpoint URL and API Key.
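A minimal sketch of that setup could look like the following. The environment variable names ELASTICSEARCH_URL and ELASTICSEARCH_API_KEY and the helper functions buildClientOptions and createClient are assumptions made for this example. Client, node, and auth.apiKey come from the @elastic/elasticsearch package.

```javascript
// Hypothetical helper: reads connection settings from environment variables.
// The variable names are assumptions for this sketch, not required by the client.
function buildClientOptions (env = process.env) {
  const node = env.ELASTICSEARCH_URL
  const apiKey = env.ELASTICSEARCH_API_KEY
  if (!node || !apiKey) {
    throw new Error('Set ELASTICSEARCH_URL and ELASTICSEARCH_API_KEY first')
  }
  return { node, auth: { apiKey } }
}

// Hypothetical helper: creates the client.
// Requires `npm install @elastic/elasticsearch`.
function createClient (env = process.env) {
  // Required here rather than at the top so buildClientOptions has no dependencies.
  const { Client } = require('@elastic/elasticsearch')
  return new Client(buildClientOptions(env))
}
```

With both variables exported in your shell, `const client = createClient()` gives you the client instance used in the rest of the examples.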

What is toArrowReader?

The toArrowReader helper saves memory by streaming the result set in batches instead of loading it all into memory at once. This makes it possible to perform calculations on very large data sets without exhausting your system's memory.

This helper lets you process the results one record batch at a time, and each row within a batch.
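As a sketch of what that can look like, the function below streams the query results and keeps a running total. It assumes the Web logs sample data lives in the kibana_sample_data_logs index with extension and bytes fields, and that client is an Elasticsearch client instance. The function name sumBytesStreaming and the query itself are made up for this example.

```javascript
// Assumptions for this sketch: the index name and the `bytes` field come from
// the Web logs sample dataset; `client` is an Elasticsearch client instance.
const READER_QUERY = 'FROM kibana_sample_data_logs | KEEP extension, bytes'

// Hypothetical helper: totals the `bytes` field while streaming the results.
async function sumBytesStreaming (client) {
  const reader = await client.helpers
    .esql({ query: READER_QUERY })
    .toArrowReader()

  let totalBytes = 0
  let rowCount = 0

  // Each iteration yields one record batch, so the full result set
  // is never materialized by this loop.
  for await (const recordBatch of reader) {
    for (const record of recordBatch) {
      const { bytes } = record.toJSON()
      // 64-bit integer columns can surface as BigInt, so normalize to Number.
      if (bytes != null) totalBytes += Number(bytes)
      rowCount++
    }
  }

  return { totalBytes, rowCount }
}
```

Because only running totals are kept between iterations, memory use stays roughly tied to the batch size rather than to the number of rows returned.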

How do you use toArrowTable?

We can use toArrowTable if we want to load all the results into an Arrow table object once the request is completed, instead of returning each row as a stream.

This helper is useful if your dataset will easily fit in memory and you still want to leverage Arrow’s zero-copy reads and compact transfer size while keeping the code simple.

toArrowTable is also a good option if the application is already working with Arrow data, since you don’t need to serialize the data. In addition, given that Arrow is language-agnostic, you can use it regardless of the platform and language.
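The following sketch loads a small result set into an Arrow table and reads a single column from it. As before, the kibana_sample_data_logs index and its extension and bytes fields are assumed from the Web logs sample dataset, client is an Elasticsearch client instance, and the averageBytes function and the query are hypothetical.

```javascript
// Assumptions for this sketch: the index name and the `bytes` field come from
// the Web logs sample dataset; `client` is an Elasticsearch client instance.
const TABLE_QUERY = 'FROM kibana_sample_data_logs | KEEP extension, bytes | LIMIT 100'

// Hypothetical helper: averages the `bytes` column of a fully loaded table.
async function averageBytes (client) {
  const table = await client.helpers
    .esql({ query: TABLE_QUERY })
    .toArrowTable()

  // Columnar access: read the `bytes` column directly
  // instead of building one object per row.
  const bytesColumn = table.getChild('bytes')
  if (bytesColumn === null || table.numRows === 0) return null

  let total = 0
  let count = 0
  for (const value of bytesColumn) {
    // Skip nulls; 64-bit integers can surface as BigInt, so normalize to Number.
    if (value != null) {
      total += Number(value)
      count++
    }
  }

  return count === 0 ? null : total / count
}
```

If you prefer row objects for a small result, the table's toArray() method returns the rows, at the cost of giving up the column-at-a-time access shown here.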

Conclusion

The Apache Arrow helpers provided by the Elasticsearch Node.js client simplify day-to-day tasks like analyzing large data sets efficiently and receiving Elasticsearch responses in a compact, language-agnostic format.

In this article, we learned how to use the ES|QL client helpers to parse the Elasticsearch response as an Arrow Reader or an Arrow Table.
