29 octobre 2015

F5 High Speed Logging with Elastic Stack

A load balancer often serves as a gateway into your production network. In this position, it has a unique perspective of seeing all traffic, while also being aware of the mapping between web services and servers. A great deal of high level information can be learned from the load balancer with minimal effort, and it is independent of web server, application, and platform. This flow of information is highly consistent, and serves well for baselines. Naturally, we can log from this point (as an alternative to processing web server logs) to gain visibility in web traffic patterns and application behavior.

In this blog, we will take a look at how to pair the Elastic Stack with an F5 Local Traffic Manager, which is great at quickly balancing load across massive amounts of traffic. This is not unlike Elasticsearch & Kibana that gives us ‘wire speed’ access to our data for analysis, allowing us to answer important questions in near real time.

Vertical Bar Chart: “Am I seeing an increase in HTTP 4xx errors?”

Bar Chart.png

Tile Map: “Where is my Traffic Coming from?”

Line Chart: “What are the response times of my Applications?”

Response Times.png

From there you can drill down on these answers by time period, or other HTTP metric , and even compare them side-by-side with other services.

This information is of course useful - and the F5 and Elastic architectures mix particularly well making it quick and easy to get to these answers. The F5 distributes logging traffic across a pool of Logstash Servers, conveniently including information about the Virtual Service.

HSL Pools for Logstash

F5’s High Speed Logging (HSL) mechanism is designed to pump out as much data as can be readily consumed, with the least amount of overhead, to a pool of syslog listeners. As it happens, Elastic Stack is designed to consume data in high volume.

The HSL template packs the information into a parsable string, perfect for Logstash to interpret.

Consult F5’s documentation on Request Logging for the official (and more detailed) procedure, but in brief, one must follow these steps:

Create a Server Pool, pointing to the syslog port on each Logstash instance
Create a Request Logging Profile, specifying the Logstash Server Pool
Assign the Request Logging Profile to Virtual Servers as desired

GROK & The HSL Template

Out of the box, we have accomplished a difficult task easily - we have our logs flowing into Logstash. Next, we need to tell logstash how to parse the messages before indexing them into Elasticsearch. Just like the Logstash cluster is analogous to the HSL Pool, the GROK pattern is analogous to the HSL Request Logging Template. The two can be thought of as a CODEC, converting HTTP request details to a parsable string, that can be subsequently parsed into JSON for output into Elasticsearch.

Here is an example of a request logging template which will give us information about the load balancer’s Virtual Server, Pool Name, Server IP, Response Time, in addition to other common attributes.

$CLIENT_IP $DATE_NCSA $VIRTUAL_IP $VIRTUAL_NAME $VIRTUAL_POOL_NAME $SERVER_IP $SERVER_PORT "$HTTP_PATH" "$HTTP_REQUEST" $HTTP_STATCODE $RESPONSE_SIZE $RESPONSE_MSECS "$Referer" "${User-agent}"

NOTE: The F5 Logging Profile requires configuration of Request and/or Response Logging. The HTTP Response Time variable is naturally only available in the response event.

Now on the other side, in our Logstash cluster, we decode and process the log entry generated by the F5. Logstash has 3 primary phases: Input, Filter, and Output, known as the Logstash Processing Pipeline

Logstash has a built-in syslog input. But because you are decoding the data manually, you should just use raw TCP and/or UDP input plugins.

The core of the filter phase is GROK - an unbiased parsing language - perfect for decomposing the log entry exactly as it was put together. Here is the GROK pattern used to decode the above template.

"%{IP:clientip} \[%{HTTPDATE:timestamp}\] %{IP:virtual_ip} %{DATA:virtual_name} %{DATA:virtual_pool_name} %{DATA:server} %{NUMBER:server_port} \"%{DATA:path}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response:int} %{NUMBER:bytes:int} %{NUMBER:response_ms:int} %{QS:referrer} %{QS:agent}"

After parsing, the filter phase can continue in any way you would like to enrich your data, such as IP address GeoLocation. DNS resolution, or even drop to Ruby for custom manipulation. Finally, the newly created event object is indexed into Elasticsearch via the Logstash Elasticsearch output plugin.

Setting up Health Checks

You will want the F5 to periodically check the health of your Logstash instances, to insure that they are alive and ready to accept logs.

When setting up a health check, use caution with UDP. F5’s UDP health check sends an arbitrary string (of your choice), and fails if it receives back an ICMP unreachable response. The caveat is that Logstash will try to process the arbitrary string, so be sure to filter it out!

Logstash also ships with a heartbeat plugin that periodically says ‘I’m OK’. This information can be output to Elasticsearch and checked by the load balancer. To set this up, you can use the following configuration example:

First create your heartbeat index in Elasticsearch, and put this mapping:

PUT /heartbeats/
PUT /heartbeats/_mapping/heartbeat
{
  "properties": {
      "host": {
          "type":"string", "index":"not_analyzed"
      }
  }
}

Then configure logstash to send a heartbeat every 5 seconds:

input {
        heartbeat {
                interval => 5
                type => "heartbeat"
          }
}
output {
        if [type] == "heartbeat" {
          elasticsearch {
                protocol => "transport"
                index => "heartbeats"
                document_id => "%{host}"
          }
        }
}

Finally, configure the load balancer to check Elasticsearch for a heartbeat within the last 15 seconds:

GET http://es1:9200/heartbeats/heartbeat/_search?q=host:logstash1 AND @timestamp:>now-15s

You can test with curl:

curl -s http://es1:9200/heartbeats/heartbeat/_search?q=host:'logstash1'%20AND%20@timestamp:%3Enow-15s

UDP vs TCP for HSL

UDP or TCP can be used with HSL, and there are pros and cons to each. One important thing to note is that existing TCP connections are reused for long periods of time, in order to reduce what would be unrealistic overhead. While helpful, it does cause imperfect load balancing between logstash instances. Consider using a Least Connection Algorithm over Round Robin with TCP Based HSL.

Alternatives for measuring HTTP Response Time

If you don’t have an F5, or if you are more comfortable working within the bounds of your automation infrastructure, glean the information from web servers via Logstash or Beats.

Apache’s Default Format, with corresponding GROK:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""
"%{COMBINEDAPACHELOG}"

Modified for Response Time:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %D"
"%{COMBINEDAPACHELOG} %{NUMBER:response_microseconds}"