Introduction
File rotation, or log file rotation may seem simple, but for most high throughput applications such as web servers the problems of multiple threads or processes logging to the same file pose problems.
In this example we have a multi process multi thread service which logs incoming requests to a file, the process is written for speed, hence the logging overhead has to be very light, the code is running on a linux base OS.
The file rotation can be rotated by either,
- the client program, in which case all processes/threads must cooperate together
- a separate rotate program, which signals to the clients to close the file and open a new file
Evaluation
Initially looking at current applications such as apache and mysql,
APACHE: The apache webserver on startup opens a file or pipe to a program, this file descriptor is passed to all child processes and threads which log to this file. This model does not rotate files, but if the pipe is opened to a seperate process (rotatelogs) then this process can handle writing to a rotated file. This model seems tidy, but relies on a context switch to ensure the data is logged, if many processes are logging large amounts of data it doubles the number of required context switches to log the data.
MYSQL: Mysql opens the file for append, writes the log line then closes the file, hence is not optimised for speed, when logging lots of data many fopen and fclose system calls are handled.
The optimal solution will:-
- allow the file to remain open
- require minimal overhead for checking for rotation
- allow a separate program to rotate removing the rotation overhead from each process/thread
Solution
The solution required a means for a rotation process to signal to the logging server(s) that the file has been rotated. Renaming of a file does not effect the file handle properties, but if the file permissions or owner are changed the logging client can fstat the handle and check for a permission/ower change before a write. The fstat system call is faster than a normal stat since it uses the inode stored in file handle.
Since the fstat call is very lightweight, and will be kernel optimised by the use of the buffer cache the client log process need only open the file for append, then before each write fstat the handle and check to see if the owner or permissions have changed, and if so close the file and reopen a new file. Using this schema processes which have a number of workers can still all write to a log file concurrently as the write (with append) is atomic.
The rotation process simply renames the file to a file with a suitable date and change the owner or permissions.
There is a tiny race condition between the client fstat and write which would cause the client to write a log line to the end of a rotated file in rare conditions, but since this is a tiny time period, as long as processing of the rotated log file is deferred for a suitable period following rotation all will b OK.
An example Perl implementation of a client logger is provided, it should be noted that this uses a stat on the filehandle which Perl actually calls fstat internally (running strace on the process determined this) – download this file.
Limitations
Unfortunately NFS has limitations (unsure of v4), a linux manual page states:-
O_APPEND The file is opened in append mode. Before each write(), the file offset is positioned at the end of the file, as if with lseek(). O_APPEND may lead to corrupted files on NFS file systems if more than one process appends data to a file at once. This is because NFS does not support appending to a file, so the client kernel has to simulate it, which can't be done without a race condition.
So writing to an NFS file using append may not work.