
Introducing: The cat API

Background

Perhaps the biggest success of Elasticsearch is its APIs. Interacting with the system is so simple it still catches me off guard, four years after I first tried it out. The engine powering this simplicity is JSON, a straightforward form of structured text birthed out of the rise of JavaScript. JSON is easy to understand and parse, and because of that, supported by almost every programming language in existence.

Humans, however, are not programming languages. JSON’s strength is that it’s plaintext, which makes it possible for our eyes to parse, but merely looking at human-readable characters isn’t the same as actually understanding the information. With any more than the most trivial structure in a JSON doc, we typically reach for the nearest pretty-printer. Unfortunately, pretty-printing often still does not translate into actionable knowledge. In fact, the addition of whitespace often makes life even more difficult as it eats up precious space in a terminal window.
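
That reflex usually looks something like this, with python -m json.tool standing in for whichever pretty-printer is nearest (Elasticsearch’s own ?pretty parameter, which you’ll see shortly, does the same job server-side):

% curl -s es1:9200/_cluster/health | python -m json.tool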

“Not a problem,” you say. “JSON is so simple all I need is $LANG and five minutes.” Unfortunately, JSON and $LANG, along with speaking, walking, producing coherent sentences, and almost any other task in life, is a bit harder when you’re woken up from deep sleep by your phone alerting you to a system outage.

The 3 AM Page

Imagine this has just happened to Teresa. The monitoring system noticed that her cluster is red. A common first step in this moment of an Elasticsearch cluster administrator’s life is to take a look at the logs on the master node. Which node is master? Armed with the comprehensive output of the cluster state API, she’s off to the races.

% curl 'es1:9200/_cluster/state?pretty'
{
  ...
  "master_node" : "Wjf_YVvySoK8TE41yORt3A",
  ...

OK, not quite. That’s just the node ID. Which node is that?

...
"nodes" : {
  "56RhV2ecT3OIZFzUYVYwNQ" : {
    "name" : "Midnight Sun",
    "transport_address" : "inet[/192.168.56.20:9300]",
    "attributes" : { }
  },
  "pyqzjh_nRx6rapL-CBvsyA" : {
    "name" : "Urthona",
    "transport_address" : "inet[/192.168.56.40:9300]",
    "attributes" : { }
  },
  "Wjf_YVvySoK8TE41yORt3A" : {
    "name" : "Lasher",
    "transport_address" : "inet[/192.168.56.10:9300]",
    "attributes" : { }
  },
...

Her tired eyes dart back and forth a few times before she figures out that to get to Wjf_YVvySoK8TE41yORt3A she must connect to 192.168.56.10.
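
To be fair, she could have scripted the lookup. But even with a JSON swiss-army knife like jq on hand (an assumption; it isn’t installed everywhere), it takes a small incantation:

% curl -s es1:9200/_cluster/state | jq -r '.master_node as $id | .nodes[$id].name + " " + .nodes[$id].transport_address'
Lasher inet[/192.168.56.10:9300]

Not the sort of thing most of us compose correctly at 3 AM.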

She grumbles and pores over the logs. She notices that a node has failed some pings. She’s no fool. She’s seen this before and it wasn’t the network. The JVM is likely in trouble on that node.

First up, she checks on the cluster’s health.

% curl es1:9200/_cluster/health?pretty
{
  "cluster_name" : "foo",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 4,
  "number_of_data_nodes" : 4,
  "active_primary_shards" : 283,
  "active_shards" : 566,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 1
}

Uh oh! Red! Teresa is a bit paranoid, so she checks health across the whole cluster.

% for i in 1 2 3 4 5; do curl es${i}:9200/_cluster/health; echo; done
{"cluster_name":"foo","status":"red","timed_out":false,"number_of_nodes":4,"number_of_data_nodes":4,"active_primary_shards":283,"active_shards":566,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":1}
{"cluster_name":"foo","status":"red","timed_out":false,"number_of_nodes":4,"number_of_data_nodes":4,"active_primary_shards":283,"active_shards":566,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":1}

{"cluster_name":"foo","status":"red","timed_out":false,"number_of_nodes":4,"number_of_data_nodes":4,"active_primary_shards":283,"active_shards":566,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":1}
{"cluster_name":"foo","status":"red","timed_out":false,"number_of_nodes":4,"number_of_data_nodes":4,"active_primary_shards":283,"active_shards":566,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":1}

A bit verbose, but it got the job done. All the nodes agree; at least, the ones that are responding. A node is definitely missing, which confirms the ping failures, but that doesn’t explain the red cluster. She sits back and nods off for ten minutes. When she wakes up, by chance she notices in the JSON soup splattered all over her screen that there’s an unassigned shard.

“Hm, which one is that?” she asks herself. Experienced with the APIs, she cleverly attaches the level parameter to /_cluster/health to dig deeper.

% curl 'es1:9200/_cluster/health?level=shards&pretty'
    ...
    "foo-20140116" : {
      "status" : "red",
      "number_of_shards" : 2,
      "number_of_replicas" : 0,
      "active_primary_shards" : 1,
      "active_shards" : 1,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 1,
      "shards" : {
        "0" : {
          "status" : "red",
          "primary_active" : false,
          "active_shards" : 0,
          "relocating_shards" : 0,
          "initializing_shards" : 0,
          "unassigned_shards" : 1
        },
    ...

Now she’s getting somewhere. The foo-20140116 index was created today by Logstash. For some reason shard 0 doesn’t have an active primary, which must have been on the 192.168.56.30 node that isn’t up at the moment. “What happened to the replicas?” she thinks.

Teresa starts the missing node, flips on a replica, and heads back to bed. She can figure that out in the morning.
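
Flipping on the replica, at least, is a one-liner against the index update settings API, along these lines:

% curl -XPUT es1:9200/foo-20140116/_settings -d '{"number_of_replicas": 1}'
{"acknowledged":true}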

A new kind of API

If it was as difficult to read that short tale as it was to write it, I apologize. Fortunately there is light at the end of the curly brace.

Let’s see what Teresa’s night would have looked like if she had been able to work with some slightly different APIs.

The first thing she needed to do was find the master. It would have been nice to get the node and host information in one single glorious line.

% curl es1:9200/_cat/master
Wjf_YVvySoK8TE41yORt3A es1 192.168.56.10 Lasher 

Purrfect! Instead of messing with the logs, however, she really just needs to get a bird’s-eye view of the current node situation.

% curl es1:9200/_cat/nodes
es1 192.168.56.10 35 79 0.00 d * Lasher       
es2 192.168.56.20 40 88 0.00 d m Midnight Sun 
es4 192.168.56.40 49 89 0.00 d m Urthona      
es5 192.168.56.50 40 75 0.00 d m Chimera      

She can immediately tell that she’s missing a node, and which one! Next up is figuring out why the cluster is red. Is it thirty indices or only one?

% curl es1:9200/_cat/indices | grep ^red
red   foo-20140116 2 0 30620 1 78.6mb 78.6mb 

Looks like it’s only one. How many shards are missing?

% curl es1:9200/_cat/shards/foo-20140116
foo-20140116 0 p UNASSIGNED                                    
foo-20140116 1 p STARTED    30620 78.6mb 192.168.56.50 Chimera 

It’s easy to see now that half of the primaries are gone and there are no replicas configured for foo-20140116. After starting up es3, a cluster-wide health check:

% for i in 1 2 3 4 5; do ssh es${i} curl -s localhost:9200/_cat/health; done
1389940476 18:05:40 foo green 5 5 10 10 0 0 0 
1389940477 18:05:40 foo green 5 5 10 10 0 0 0 
1389940479 18:05:40 foo green 5 5 10 10 0 0 0 
1389940480 18:05:40 foo green 5 5 10 10 0 0 0 
1389940480 18:05:40 foo green 5 5 10 10 0 0 0 

Nice and succinct: anomalies can be caught easily, before precious minutes are wasted on data that doesn’t lead to informed decisions.

Numbers Everywhere

This may not seem like an improvement to you. We’ve gone from the explicit, labeled JSON to columns of random numbers. To alleviate the transition headache, every cat endpoint takes a v parameter to turn on verbose mode. It will output a header row labeling each column.

% curl 'es1:9200/_cat/health?v'
epoch      timestamp cluster status nodeTotal nodeData shards pri relo init unassign 
1389963537 18:06:03  foo     green          5        5     10  10    0    0        0 

Headers

Now we can see that the numbers correspond directly to the key/value pairs that appear in the cluster health API. We can also use these headers to selectively output the columns relevant to our context. Suppose we’re tracking a long cluster recovery and want to watch our unassigned shards number drop precipitously. We could just output all the numbers, but a cleaner approach is to filter out everything except the number we care about.

% while true; do curl 'es1:9200/_cat/health?h=epoch,timestamp,cluster,status,unassign'; sleep 30; done
1389969492 06:38:12 foo yellow 262
1389969495 06:38:15 foo green 250
1389969498 06:38:18 foo green 237
...

Column management

One of the major motivations behind cat is to speak Unix fluently. In this case a simple … | awk '{print $1, $2, $3, $4, $11}' would have sufficed (spelled out below), but some APIs have many more non-default columns that you can only get at with h.
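
Here’s what the awk version of the recovery-watching loop looks like, given the default column order we saw in the verbose header:

% curl -s es1:9200/_cat/health | awk '{print $1, $2, $3, $4, $11}'
1389969492 06:38:12 foo yellow 262

It works, but h saves you from counting columns.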

Let’s say Teresa experienced some high heap usage on her nodes, and she does a lot of sorting and faceting, common consumers of the fielddata cache. She would like to compare fielddata cache usage and heap across nodes, a task that’s technically possible with the node stats API but becomes impractical beyond a couple of nodes. With the cat nodes API, it’s simply a matter of knowing a few column names.

% curl 'es1:9200/_cat/nodes?h=host,heapPercent,heapMax,fielddataMemory' | sort -rnk2
es4 61 29.9gb 14.4gb
es3 58 29.9gb 16.5gb
es5 40 29.9gb    5gb
es2 33 29.9gb  8.2gb
es1 20 29.9gb  3.4gb

Sorting by percentage of heap used makes it very clear there is some rough correlation between heap and fielddata use.

Byte and time resolution

What if she wanted to sort by fielddataMemory instead? ES converts byte values into human-readable units by default, but that actually makes them harder for sort to handle. She can supply the bytes parameter to specify the unit of precision.

% curl 'es1:9200/_cat/nodes?h=host,heapPercent,heapMax,fielddataMemory&bytes=b' | sort -rnk4
es3 58 29.9gb 17805705171
es4 61 29.9gb 15550755044
es2 33 29.9gb  8880273008
es5 40 29.9gb  5449302354
es1 20 29.9gb  3687354160

The same kind of resolution control works for time as well. If she has a column like merges.total_time that she wants in seconds, she can supply a time parameter of s.

% curl 'es1:9200/_cat/nodes?h=host,mtt&time=s' | sort -rnk2
es4 910
es3 902
es2 278
es1 190
es5  99

Help!

How did she know that mtt would give her merges.total_time? Each API supports a help flag that lists all the possible column headers.

% curl 'es1:9200/_cat/nodes?help' | fgrep merge
merges.current           | mc,mergesCurrent          | number of current merges                
merges.current_docs      | mcd,mergesCurrentDocs     | number of current merging docs          
merges.current_size      | mcs,mergesCurrentSize     | size of current merges                  
merges.total             | mt,mergesTotal            | number of completed merge ops           
merges.total_docs        | mtd,mergesTotalDocs       | docs merged                             
merges.total_size        | mts,mergesTotalSize       | size merged                             
merges.total_time        | mtt,mergesTotalTime       | time spent in merges                    

Any of merges.total_time, mtt, or mergesTotalTime would have worked. She picked mtt since it’s short and esoteric, like a good Unix admin prefers.

Much more!

health, nodes, master, and shards are just a few. There are APIs for indices, recovery progress, and more!
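
The full menu is only one request away: hitting the bare /_cat endpoint lists every endpoint your node supports (output abridged here):

% curl es1:9200/_cat
=^.^=
/_cat/master
/_cat/nodes
/_cat/health
/_cat/indices
/_cat/shards
/_cat/recovery
...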

Conclusion

cat is an evolution of ad hoc tools produced in the field while running large clusters on early versions of Elasticsearch. It was clear that Elasticsearch’s JSON APIs were a generation ahead: excellent for machines, but merely usable for humans. cat aims to fit in those places where JSON doesn’t: the Unix pipe, the chat window, the tweet. In an era of unprecedented communication bandwidth, it’s the low-bandwidth, lightweight tools we reach for most often. It’s fitting that those also happen to be the places cats seem to appear most.

We would love to hear your feedback on the cat API. Let us know what you think!