02 3월 2016 엔지니어링

The evolving story about the Logstash File Input

By Guy Boertje

In this post, we explore the popular file input, and its many use cases. We offer a peek into the evolution of this input and contrast it with alternatives like Filebeat.

Some History (before version 2.2.0)

The file input is a popular way in Logstash to ingest contents of a file. From the onset, it was designed to solve the tailing file use case where the file is being constantly updated. Many users have also tried to use it for uploading static file content into Logstash, and we'll explore this dichotomy in this post.

File input saves information of how far along in the file Logstash has processed so Logstash can resume when it is restarted. For this, it uses sincedb, a file-based persistent store that records the number of bytes read against the file inode (*nix) or identifier (windows).

Most of the real work behind the file input is done by the filewatch library, which is responsible for watching path globs and tailing files.

Filewatch has two phases of operation, discovery and processing. During the first discovery, the watch pattern glob detects files, and each file is put into an internal store. In later discovery phases, only newer files will be put in the store. In the first processing phase, reading: for each stored file, we open, seek to the last byte read position, and read it in 32K chunks. We extract each line from the chunk, send it to the file input, and record the line byte count to the sincedb. Note: The file stays open, and we compare its inode with what we have from the sincedb to decide whether we have seen the file before. In the subsequent processing phase, tailing: if the file has grown, it is read in chunks as above. Some problems with this are:

  1. The files are processed in the order returned by the operating system.
  2. On Logstash restart, from the little data stored in the sincedb, it is hard to know whether a sincedb entry maps accurately to a discovered file. Therefore sometimes we think that some portion of the file has been read when it is really new content or that none of the file was read when it has.
  3. All files remain open — this uses up file handles and makes it harder to work with the files outside of Logstash.

The concept of identity mapping was introduced to the file and other inputs, to ensure that data from the same origin went to the same codec instance, so each time a new identity is seen (path or socket connection), we add a mapping to a clone of the codec specified. To avoid boundless growth of the map, we set the limit to 20,000 identities, because we did not imagine that there would be setups that have more than 20,000 active connections or files being actively tailed. Internally there is a mechanism to evict identities that have not been used for a while (1 hour). Eviction is non-destructive. However in the true tail setup, all the files may be "hot" and their identity (path) may be "hot" too — meaning that the identity may never be evicted. However, if the number of discovered files is greater than 20,000 — they are all opened — then the hard limit will be reached before the eviction mechanism can do anything.

If the number of open file handles hits the limit, then Logstash starts failing to read/write other files e.g., sincedb, templates, and sockets.

In version 2.1.0 we implemented:

  • new config options: ignore_older and close_older.
  • communication of file open and close activity from filewatch to the input to enable identity management in the identity mapper.

In this version the ignore_older behaviour bypasses reading and tailing if a file was last modified more than the setting seconds ago. The close_older behaviour is to close a file if it was last modified more than the setting seconds ago. We found two problems with this release: for ignore_older, files were opened then closed (because the open function did the sincedb update) but this consumed file handles unnecessarily and, for close_older, we should have used a last_read time to decide if a file can be closed.

File Input version 2.2.0

We introduced a tracking object to hold state on each file we discover and process. In discovery for each file path, a tracking object is stored. This object holds the current state and the previous state history as well as many other attributes e.g., path, inode, bytes_read, stat info, and some timestamps. Files satisfying the ignore_older setting are put into the 'ignored' state during discovery and all others are put in the 'watched' state. Files in the ignored state are not opened but are monitored for changes.

We changed the close_older behaviour to use last_read time to decide if a file can be closed. Files in the closed state are monitored for changes.

We have added a max_open_files config option (default 4095). This is implemented as a 'sliding window' of files in the active state across the list of files in the watched state.

CCCCC|AAAAA|WWWWWWWWWW
CCCCCCCCC|AAAAA|WWWWWW
C = Closed, A = Active, W = watched (waiting)

During the processing phase we now have the following steps:

  • check if any 'closed' files have new content and change them to the 'watched' state.
  • check if any 'ignored' files have new content and change them to the 'watched' state.
  • take some 'watched' files and change them to the 'active' state. The amount taken is enough to ensure that the number of 'active' files is never more than the max_open_files setting. A warning is issued from time to time if the max_open_files limit is reached and there are files waiting. The file is opened and read in chunks as before.
  • check each 'active' file for:
    still readable (can stat) if not delete it
    is it closable?, change it to the 'closed' state and evict its identity
    has it changed inode?, close and reopen, state stays 'active'
    has it shrunk?, close and reopen, state stays 'active'
    has it grown?, read more, state stays 'active'

Multiple File Inputs

While it is true that multiple inputs will read files in parallel, inputs fight each other for access to the shared queue which is only drained as fast as the filters + outputs will go. However multiple file inputs are a good idea, if the files can be separated into different directories to overcome the 20K identity limit and to have data from the same time period processed together.

Option Tuning

Tailing Use Case

  • max_open_files: default 4095. Increase this if necessary, but be aware of the file handle limit and that each file input specified in your LS config will open this amount in parallel. If there are more files available to open than this setting, then some will have to wait until others become closable. However, if all the files are constantly getting new content and therefore may never become closable, then some files may never get opened.
  • close_older: default 1 hour. The appropriate setting of this value for tailing varies a bit. If there are some files that have long (5 seconds) update intervals, then a 2 second value will ensure that resources are freed. Be aware of the reliance the max_files_open setting has on the close_older setting when the number of files being tailed is greater than what is set for max_files_open.
  • stat_interval: default, 1 second. This is how long the discovery and processing loop waits before looping round. When the volume of files and data being tailed means that the processing stage takes a significant time, then it is not really necessary to wait before looping round — therefore the stat_interval value can then be reduced to almost zero.
  • discover_interval: default: 15. This is, in reality, the number of times the processing part is done before a discovery is done. If it is known that the files first discovered will never change, then one can set the value quite large, e.g., 1000, meaning less time is spent on discovering nothing.

Read File Use Case (files are content complete)

Sometimes people want to process many, many thousands of files with sizes that vary from hundreds of bytes to gigabytes in size. The file input as it stands now can do this with a combination of max_open_files and close_older settings.

  • max_open_files: default 4095. It is important to know how many file inputs will be opening files on the server across all Logstash processes. In some ways, as it is known that the file will not be appended to, one can set this value and that of close_older quite low, say 64 and 0.1 seconds. So if one has 10 file inputs each on two logstash processes, using the default of 4095, theoretically there can be 81900 files open. Please note that max_open_files really means the number of files open at any time and not the maximum number of files that will ever be read.
  • close_older: default 1 hour. The value that you should set this to is very dependent on the size of the files you are processing and the value of max_open_files. For example if one has 100 1GB files to process, the closable check will only be done at the end of 100 GB of processing. However if one sets the max_open_files to 2, then the closable check will be done after 2 GB of data — in this example a close_older of 1 second will work.  In most read file cases you will set this very low, 1 second or less because the files are not being added to.

Filebeat

Filebeat is a lightweight, open source shipper for log file data. As the next-generation Logstash Forwarder, Filebeat tails logs and quickly sends this information to Logstash for further parsing and enrichment or to Elasticsearch for centralized storage and analysis.

Filebeat is really the way forward for the tailing use case.  It is constantly evolving to support filtering, multiline, and more. You can find out more about Filebeat here.

In many cases it is preferable to install the lightweight Beats components on the machines that generate the logs.

Future

What needs fixing in the file input?

The sincedb needs to store more information so better decisions can be made about whether a discovered file has really been seen before.

At the moment, filewatch communicates to the file input via method calls that certain actions, e.g., open, line(data), and close have taken place. The filewatch communicates the path (identity) only once when the file is opened — meaning the file input must hold state, the path, while the file is read. It would be better if the filewatch passed a context object when making calls to the file input. The context object can hold the path, file size, read position, and any other info that establishes the provenance of the line of data.

Filewatch needs to have separate functionality optimised to the read use case. Work in this area has begun for a new input plugin code named 'read_file'.

Features that filewatch will have (features common to the existing file input and the new read_file input):

  • Support for prioritising the order of the files that are activated. Choose between oldest, newest, biggest, smallest, most read, etc.
  • Enhanced sincedb that is actually the watched file store persisted.
  • Breadth first processing of files — striped reading of chunks from each file so that data from different files, but belonging to roughly the same time period, are more likely to be processed together.

Features that the Logstash read_file input plugin will have:

  • close the file immediately without a close_older setting.
  • better support the signaling of when the input is done with a file, maybe by updating the sincedb record, appending the path to a well known file, or writing to a socket.
  • support for gzip decompression. The tricky bit with gzip decompression is that if Logstash restarts before the file is completely read, the behaviour will be different from a regular file — it should restart at the beginning. It will also be a challenge to use breadth first processing of files if one of them is a gzip file. We should try to use streaming decompression because we need to prevent the whole file from being decompressed into memory before being read.

While we develop these features, we would love to hear about the read file use case to make sure that we are building the right thing. Please open an issue in GitHub with a short description of your setup and expectations. You could also find us on our forum or tweet us.