Presenting Play: A Preview of an Elasticsearch Playground

UPDATE: This article refers to our hosted Elasticsearch offering by an older name, Found. Please note that Found is now known as Elastic Cloud.

Introduction

Play is a tool that makes it easy to, well, play with Elasticsearch’s vast feature set. It’s inspired by tools like JSFiddle, which lets you easily throw together HTML+CSS+JavaScript demos. Play lets you do that for Elasticsearch: a combination of sample documents, text analyzers, mappings and searches constitutes a runnable demo.

To develop the perfect search, you need to structure the documents correctly, configure the right text processing, wire up mappings and do some searches to see how well it works.

This is typically an iterative process. “Oh, it would be great if I could search like this. But that requires a change to that analyzer, and tweaking the mappings a bit, and [etc., etc.…]”

While changing mappings and analyzers for an existing index is a lot of work with plain Elasticsearch, Play makes it easy by doing those things for you. Change an analyzer configuration, run, and a split second later you can see how it affects your searches. It also assists you with configuring mappings and searches, with context-aware autocompletion and documentation.

When you run a play, Play creates the indexes with the right mappings and analyzers for you, indexes all your sample documents and then runs all the sample searches. It can do this very quickly because we impose one limitation: you cannot have a huge set of sample documents.

Note: Play is neither feature complete nor bug free. This is a preview to show where we’re headed, and to get some early feedback. Expect bugs!

Overview

Figure: Overview

Play’s user interface is structured into five editor tabs.

In addition to an Overview tab, there are four supplementary tabs designed to optimize the screen real estate for the task at hand. They are different views on the same data, and are kept in sync.

  • The Documents tab devotes most of the screen real estate to the editor for sample documents, with the rest spent on showing the results.
  • Analysis shows an editor for configuring analyzers, tokenizers and so on, with a view that shows how text is processed step by step.
  • Mappings uses most of the space for the mappings editor, with some space left for documentation for the type of mapping you’re doing, a small view that shows how text is processed for the currently selected field, and lastly, the resulting mapping.
  • Searches uses a lot of space for editing searches and for showing results. There’s also a view for showing documentation for the search/filter/facet/parameter your cursor is at.

That’s Not JSON!

Play prefers YAML over JSON. YAML is easier for humans to read and edit, and it supports comments.

While JSON is valid YAML, to get the most out of Play (i.e. context-aware autocompletion and documentation), you should use the supported subset of YAML. This is documented in Play’s help section.

In “Can I Run it on My Cluster?” we describe how you can export things as JSON, though!
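To make the YAML and JSON relationship concrete, here is a minimal sketch that turns `---`-separated YAML documents into JSON objects using only the Python standard library. The function name `parse_flat_yaml_documents` is hypothetical, and the parser is an assumption-laden toy: it only understands full-line comments and flat `key: value` pairs, not the subset Play actually supports. For anything real, a proper YAML parser is the better choice.

```python
import json


def parse_flat_yaml_documents(text):
    """Parse '---'-separated documents made of flat `key: value` lines.

    Only handles a tiny subset of YAML: full-line comments, blank lines
    and one-level string values. Nested structures are not supported.
    """
    documents, current = [], {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if stripped == "---":
            if current:
                documents.append(current)  # the separator closes a document
            current = {}
            continue
        key, _, value = stripped.partition(":")
        current[key.strip()] = value.strip()
    if current:
        documents.append(current)
    return documents


sample = """\
# A comment, which JSON cannot express
quote: Winter is coming.
author: George R. R. Martin
---
quote: The ships hung in the sky in much the same way that bricks don't.
author: Douglas Adams
"""

for document in parse_flat_yaml_documents(sample):
    print(json.dumps(document))
```

Each printed line is a JSON object and, because JSON is valid YAML, it could be pasted back into a YAML editor unchanged.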

Your first Play

Before we look further into the various features of Play, let’s start out with a simple Hello World.

In this exercise we want to index three documents and run two searches.

  1. First, open Play in a separate tab or window. Click the “Clear” button in the top right corner to clear out the introductory text.
  2. Paste this text in the “Documents” editor, i.e. the top left editor:
      # This is the first document
      quote: Man had always assumed that he was more intelligent than dolphins because he had achieved so much - the wheel, New York, wars and so on - whilst all the dolphins had ever done was muck about in the water having a good time. But conversely, the dolphins had always believed that they were far more intelligent than man - for precisely the same reasons.
      author: Douglas Adams
      ---
      # This is the second document. The three dashes (---) separate them.
      quote: The ships hung in the sky in much the same way that bricks don't.
      author: Douglas Adams
      ---
      # And the third...
      quote: Winter is coming.
      author: George R. R. Martin
  3. Paste this into the “Searches” editor, i.e. the top right editor:
      # This is the first search
      query:
          match:
              quote: dolphins
      ---
      # Second search.
      facets:
          words:
              terms:
                  field: author
  4. Click on “Run” in the top right menu, or press Ctrl+Enter to run the play.
  5. Results appear in the bottom right window. There’s one tab for the resulting mappings, and one for each search.
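To build some intuition for what the two searches ask for, here is a toy, pure-Python approximation. It is not how Elasticsearch works internally. It assumes both fields are analyzed into lowercase word tokens, which is roughly why a terms facet on `author` would count individual words rather than full names. The helpers `analyze`, `match` and `terms_facet` are hypothetical, and the first quote is shortened to an excerpt.

```python
import re
from collections import Counter

# The three sample documents (first quote shortened to an excerpt).
documents = [
    {"quote": "all the dolphins had ever done was muck about in the water",
     "author": "Douglas Adams"},
    {"quote": "The ships hung in the sky in much the same way that bricks don't.",
     "author": "Douglas Adams"},
    {"quote": "Winter is coming.", "author": "George R. R. Martin"},
]


def analyze(text):
    """Toy analyzer: lowercase, then keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match(field, query_text):
    """Return positions of documents sharing at least one term with the query."""
    wanted = set(analyze(query_text))
    return [i for i, doc in enumerate(documents)
            if wanted & set(analyze(doc[field]))]


def terms_facet(field):
    """Count, for every term, how many documents contain it."""
    counts = Counter()
    for doc in documents:
        counts.update(set(analyze(doc[field])))
    return counts


print(match("quote", "dolphins"))
print(terms_facet("author").most_common())
```

Under these assumptions only the first document matches `dolphins`, and the facet counts `douglas` and `adams` twice each.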

Autocompletion, Documentation and Linting

Figure: Context-aware autocompletion and documentation

While Play is neither all-encompassing nor feature complete, we have spent a lot of time making sure we can add context-aware autocompletion, documentation and linting.

The various editors know where your cursor is and what surrounds it, for instance that the cursor is inside a filter, within a query, within a facet. They already suggest many things:

  • Most of the search structures, like query, filter and facet DSLs.
  • Fields and types, whether you are mapping, modifying sample documents, or specifying them as parameters in queries, facets or filters.
  • Available analyzers, when configuring mappings and searches.

We will eventually teach Play to autocomplete everything, including suggesters, aggregations, etc.

The location and context of the cursor are also used, for example, to highlight the results for the search you are currently working on, or to show how existing text in your sample documents is tokenized while you edit mappings.
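One way to picture "knowing the context of the cursor" is to compute the chain of keys that enclose the cursor line. The sketch below is only an illustration under assumptions: the function `context_path` is hypothetical, it relies purely on indentation in block-style YAML, and it says nothing about how Play's editors are actually implemented.

```python
def context_path(lines, cursor_line):
    """Return the keys enclosing the cursor line, outermost first.

    Assumes block-style YAML where nesting is expressed by indentation
    and every relevant line looks like `key:` or `key: value`.
    """
    path = []
    indent_limit = None
    for line in reversed(lines[: cursor_line + 1]):
        stripped = line.strip()
        if stripped == "---":
            break  # do not walk into the previous document
        if not stripped or stripped.startswith("#") or ":" not in stripped:
            continue  # blank lines and comments carry no keys
        indent = len(line) - len(line.lstrip())
        if indent_limit is None or indent < indent_limit:
            # A less indented key above the cursor is an enclosing key.
            path.append(stripped.split(":", 1)[0].strip())
            indent_limit = indent
    return path[::-1]


searches = [
    "query:",
    "    match:",
    "        quote: dolphins",
    "---",
    "facets:",
    "    words:",
    "        terms:",
    "            field: author",
]

print(context_path(searches, 2))
print(context_path(searches, 7))
```

An editor could then key its suggestions and documentation on the resulting path, for example offering field names when the path ends in `match`.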

Figure: Example sanity checking – before it becomes a performance problem

Furthermore, we want to be able to highlight warnings and errors as they happen. The knowledge base is not comprehensive for the time being, though. You will get a little warning if you try to use inefficient filters, such as a top-level filter without any facets, or an and filter where a bool filter would be the better choice, as explained in Zachary Tong’s article on filter bitsets.
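The two warnings mentioned above can be sketched as a small check over a search body. This is a hypothetical illustration, not Play's actual rule set: the function `lint_search` and its messages are made up, and the search is assumed to be already parsed into nested dicts and lists.

```python
def lint_search(search):
    """Return warning strings for a search body given as nested dicts/lists."""
    warnings = []

    # Rule 1: a top-level filter with no facets alongside it.
    if "filter" in search and "facets" not in search:
        warnings.append("top-level filter used without any facets")

    # Rule 2: an 'and' filter anywhere inside a filter.
    def walk(node, inside_filter):
        if isinstance(node, dict):
            for key, value in node.items():
                if inside_filter and key == "and":
                    warnings.append("'and' filter found, a 'bool' filter may fit better")
                walk(value, inside_filter or key == "filter")
        elif isinstance(node, list):
            for item in node:
                walk(item, inside_filter)

    walk(search, False)
    return warnings


suspicious = {
    "filter": {
        "and": [
            {"term": {"author": "adams"}},
            {"term": {"quote": "dolphins"}},
        ]
    }
}
clean = {"query": {"match": {"quote": "dolphins"}}}

print(lint_search(suspicious))
print(lint_search(clean))
```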

Working with Text

As explained in our article on Elasticsearch from the bottom up, getting the text processing right is a very important part of working with search.

To make it easy to work with analyzers, tokenizers, token filters and so on, we’ve made an analysis editor that shows step by step how text is processed.

The image below shows how an input such as “John Smith” is first tokenized into [John, Smith] by the standard tokenizer, before each term is filtered by the double_metaphone filter. Overlapping tokens are displayed as such, and you can hover over a token to see the other tokens that overlap it.

Figure: Sample text processing output
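The step-by-step idea can be sketched in a few lines: run a tokenizer, then each token filter in turn, and record the tokens after every stage. Note the assumptions: the tokenizer below is a simple regular-expression stand-in for the standard tokenizer, and a lowercase filter is used in place of double_metaphone, which is not implemented here. All function names are hypothetical.

```python
import re


def standard_like_tokenizer(text):
    """Stand-in tokenizer: split the text into runs of word characters."""
    return re.findall(r"\w+", text)


def lowercase_filter(tokens):
    """Simple token filter used here instead of double_metaphone."""
    return [token.lower() for token in tokens]


def run_analysis(text, tokenizer, token_filters):
    """Return a list of (stage name, tokens) pairs, one per processing step."""
    tokens = tokenizer(text)
    stages = [(tokenizer.__name__, tokens)]
    for token_filter in token_filters:
        tokens = token_filter(tokens)
        stages.append((token_filter.__name__, tokens))
    return stages


for name, tokens in run_analysis("John Smith", standard_like_tokenizer, [lowercase_filter]):
    print(f"{name}: {tokens}")
```

Keeping the intermediate token lists around, rather than only the final result, is what makes it possible to show each stage side by side.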

For a dive into analyzers, here’s a great read: All About Analyzers, Part One

How it Works

Here’s exactly what happens when you run a play: an API request consisting of all sample documents, searches, analyzers and mappings is sent to a backend running on Found’s servers. This backend uses a pool of in-memory Elasticsearch servers, and for every request it:

  1. Creates index templates with the correct analysis and mappings configurations.
  2. Indexes all the documents. If a document causes an index to be created, the index is created from the template made in the previous step.
  3. Gets the resulting mapping, which is a combination of what Elasticsearch guesses and what’s specified by the user.
  4. Runs all the searches and gets the results.
  5. Deletes all the indexes.

This all happens in memory and usually takes just a few tenths of a second.
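The five steps above can be sketched as an ordered list of REST calls. This is an assumption-heavy illustration, not the backend's code: `plan_requests`, the index name and the type name are hypothetical, and the endpoint shapes follow the Elasticsearch REST API as it looked around the time of this article (index templates with a `template` pattern, document types in the path). Nothing is sent over the network.

```python
import json


def plan_requests(index, doc_type, settings, mappings, documents, searches):
    """Return the (method, path, body) calls for one run, in order."""
    template = {"template": index, "settings": settings, "mappings": mappings}
    calls = [("PUT", f"/_template/{index}", template)]               # step 1
    calls += [("POST", f"/{index}/{doc_type}", doc) for doc in documents]  # step 2
    calls.append(("GET", f"/{index}/_mapping", None))                # step 3
    calls += [("POST", f"/{index}/_search", s) for s in searches]    # step 4
    calls.append(("DELETE", f"/{index}", None))                      # step 5
    return calls


calls = plan_requests(
    index="play",
    doc_type="doc",
    settings={},
    mappings={},
    documents=[{"quote": "Winter is coming.", "author": "George R. R. Martin"}],
    searches=[{"query": {"match": {"quote": "winter"}}}],
)
for method, path, body in calls:
    print(method, path, json.dumps(body) if body is not None else "")
```

A real implementation would presumably also need to make the freshly indexed documents visible to search (for example with a refresh) before step 4; that detail is left out because the article does not describe it.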

Can I Run it on My Cluster?

Not yet.

We are committed to open-sourcing Play under the same license as Elasticsearch, i.e. the Apache License, Version 2.0.

That said, the Play environment is not something you’d want to have running on a live cluster. The “create lots of indexes and then delete them” process causes lots of changes to the cluster state, and it will not be very fast when things need to hit disk.

However, the search view with mapping-aware autocompletion and documentation is very useful on a live cluster. We intend to provide that functionality as a separate plugin that you can run on your cluster.

What you can do is export the resulting state of Play to a live cluster, using the save/export pane. That’ll provide you with a shell script you can run.

Figure: Exporting the state of Play to another cluster
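To give a feel for what an exported shell script could contain, here is a rough sketch that renders `curl` commands for creating an index and indexing the sample documents. The real export format is not shown in this article, so everything here is an assumption: the function `export_script`, the index and type names, and the choice of commands, which follow the REST conventions of Elasticsearch versions from around the time of writing.

```python
import json
import shlex


def export_script(base_url, index, doc_type, settings, mappings, documents):
    """Render a shell script of curl commands that recreates the state."""
    index_url = f"{base_url}/{index}"
    create_body = json.dumps({"settings": settings, "mappings": mappings})
    lines = [
        "#!/bin/sh",
        "set -e",
        # Create the index with its analysis settings and mappings.
        f"curl -XPUT {shlex.quote(index_url)} -d {shlex.quote(create_body)}",
    ]
    for document in documents:
        body = json.dumps(document)
        # shlex.quote keeps quotes and apostrophes in the JSON shell-safe.
        lines.append(
            f"curl -XPOST {shlex.quote(index_url + '/' + doc_type)} -d {shlex.quote(body)}"
        )
    return "\n".join(lines) + "\n"


script = export_script(
    "http://localhost:9200",
    "quotes",
    "doc",
    settings={},
    mappings={},
    documents=[{"quote": "Winter is coming.", "author": "George R. R. Martin"}],
)
print(script)
```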

Limitations

There are some things you cannot do with Play, at least not at present:

  • Dynamic scripts are disabled.
  • The sum of all documents, searches, mappings and analyzers cannot exceed 2 MiB.
  • Shards and custom routing cannot be specified at this time.
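The size limit is simple arithmetic: 2 MiB is 2 × 1024 × 1024 = 2,097,152 bytes. A minimal sketch of a pre-flight check follows, assuming the limit is measured on the UTF-8 encoded text of the four parts. How Play actually measures it is not stated here, and the names `MAX_PLAY_BYTES`, `payload_size` and `fits_in_play` are hypothetical.

```python
MAX_PLAY_BYTES = 2 * 1024 * 1024  # 2 MiB


def payload_size(documents, searches, mappings, analyzers):
    """Total size in bytes of the four text parts, encoded as UTF-8."""
    parts = (documents, searches, mappings, analyzers)
    return sum(len(part.encode("utf-8")) for part in parts)


def fits_in_play(documents, searches, mappings, analyzers):
    """True when the combined payload stays within the assumed limit."""
    return payload_size(documents, searches, mappings, analyzers) <= MAX_PLAY_BYTES


print(MAX_PLAY_BYTES)
print(fits_in_play("quote: Winter is coming.", "query:\n    match_all: {}", "", ""))
```

Counting encoded bytes rather than characters matters for non-ASCII text, where one character can take several bytes.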

Sharing

If you want to save your work, it will be saved as a Gist. You can choose whether it should be private (which is the default) or public. Public gists can be found by other users through the “Explore” tab.

Play lets you authenticate with GitHub. Once signed in you can easily find your own private gists, as well as save any changes to your own gists. If you edit an existing gist you do not own, a fork will be made first.

Roadmap

We have many plans for Play. This is merely an initial preview release. There are known bugs, but with the amount of “When can I get to use it?!” feedback we have received, we wanted to make a preview available sooner rather than later.

As mentioned, Play will be available as an open source project. There’s some work ahead to decouple it from internal tools, and some general cleanups that need to happen before we can publish it.

Having said that, here are a few things we plan to do:

  • Integrate the official documentation into the search and mapping views.
  • Add more autocompletion, such as suggesters, decay scoring, and the upcoming aggregation feature.
  • Make things commentable. We’re considering using gist’s commenting feature for this.
  • Enable viewing of a gist’s history and loading an old version right from Play.
  • Refactor the search view and turn it into an Elasticsearch plugin you can use on your own cluster.
  • Add collaboration via TogetherJS.

If you have any comments, feedback, thoughts, ideas or constructive criticism, don’t hesitate to reach out: tweet (https://twitter.com/elastic) or mail play@found.no!