Retrieving a Document

To get the document out of Elasticsearch, we use the same _index, _type, and _id, but the HTTP verb changes to GET:

GET /website/blog/123?pretty

The response includes the by-now-familiar metadata elements, plus the _source field, which contains the original JSON document that we sent to Elasticsearch when we indexed it:

{
  "_index" :   "website",
  "_type" :    "blog",
  "_id" :      "123",
  "_version" : 1,
  "found" :    true,
  "_source" :  {
      "title": "My first blog entry",
      "text":  "Just trying this out...",
      "date":  "2014/01/01"
  }
}

Adding pretty to the query string of any request, as in the preceding example, causes Elasticsearch to pretty-print the JSON response to make it more readable. The _source field, however, isn’t pretty-printed. Instead, we get back exactly the same JSON string that we passed in.

The response to the GET request includes {"found": true}. This confirms that the document was found. If we were to request a document that doesn’t exist, we would still get a JSON response, but found would be set to false.

Also, the HTTP response code would be 404 Not Found instead of 200 OK. We can see this by passing the -i argument to curl, which causes it to display the response headers:

curl -i -XGET 'http://localhost:9200/website/blog/124?pretty'

The response now looks like this:

HTTP/1.1 404 Not Found
Content-Type: application/json; charset=UTF-8
Content-Length: 83

{
  "_index" : "website",
  "_type" :  "blog",
  "_id" :    "124",
  "found" :  false
}
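A client usually has to handle both outcomes described above. The following is a minimal Python sketch, not an official client: the function names interpret_get and fetch_document are hypothetical, and it assumes only the standard library and the response shapes shown in this section.

```python
import json
from urllib.error import HTTPError
from urllib.request import urlopen


def interpret_get(status, body):
    """Return the stored document, or None if it was not found."""
    if status == 404:
        return None
    payload = json.loads(body)
    # Defensive check: treat a missing or false "found" flag as not found.
    if not payload.get("found", False):
        return None
    return payload["_source"]


def fetch_document(url):
    """Issue the GET request and interpret the status code and body.

    urlopen raises HTTPError for a 404 response, so that case is
    handled in the except branch rather than on the normal path.
    """
    try:
        with urlopen(url) as response:
            return interpret_get(response.status, response.read())
    except HTTPError as err:
        if err.code != 404:
            raise  # any other HTTP error is not a "missing document"
        return interpret_get(err.code, err.read())
```

Assuming a node is listening on localhost:9200, fetch_document("http://localhost:9200/website/blog/124") would return None for the missing document, while the same call for document 123 would return the original JSON document as a dictionary.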

Retrieving Part of a Document

By default, a GET request will return the whole document, as stored in the _source field. But perhaps all you are interested in are the title and text fields. Individual fields can be requested by using the _source parameter. Multiple fields can be specified in a comma-separated list:

GET /website/blog/123?_source=title,text

The _source field now contains just the fields that we requested and has filtered out the date field:

{
  "_index" :   "website",
  "_type" :    "blog",
  "_id" :      "123",
  "_version" : 1,
  "found" :    true,
  "_source" :  {
      "title": "My first blog entry",
      "text":  "Just trying this out..."
  }
}
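When the field list is only known at runtime, the request URL has to be assembled in code. The sketch below shows one way to do that with Python's standard library; the helper name get_url and its parameters are hypothetical and chosen only for illustration.

```python
from urllib.parse import quote, urlencode


def get_url(base_url, index, doc_type, doc_id, source_fields=None):
    """Build a GET URL, optionally limiting _source to the named fields."""
    # Percent-encode each path segment so unusual IDs cannot break the path.
    path = "/".join(quote(part, safe="") for part in (index, doc_type, doc_id))
    url = f"{base_url.rstrip('/')}/{path}"
    if source_fields:
        # safe="," keeps the comma-separated list as-is instead of %2C.
        url += "?" + urlencode({"_source": ",".join(source_fields)}, safe=",")
    return url
```

Calling get_url("http://localhost:9200", "website", "blog", "123", ["title", "text"]) produces a URL whose path and query string match the request shown above.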

Or if you want just the _source field without any metadata, you can use the _source endpoint:

GET /website/blog/123/_source

which returns just the following:

{
   "title": "My first blog entry",
   "text":  "Just trying this out...",
   "date":  "2014/01/01"
}
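Because the two kinds of response have different shapes, client code needs to know which one it is reading. As a small illustration (document_from is a hypothetical helper, and the bodies are assumed to look like the examples in this section), the document can be extracted like this:

```python
import json


def document_from(body, source_only=False):
    """Extract the stored document from a GET response body.

    With source_only=True the body is assumed to come from the _source
    endpoint, so it already is the document. Otherwise the document is
    nested under the "_source" key alongside the metadata.
    """
    payload = json.loads(body)
    return payload if source_only else payload["_source"]
```

Both forms yield the same dictionary for the same document; the _source endpoint simply saves the client from unwrapping the metadata envelope.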