=pod

=head1 NAME

B<flog> - real time log splitter and viewer

=head1 SYNOPSIS

B<flog> [ B<-c=>F ] [ B<-d> [=F] ] [ B<-g=>I ] [ B<-o> ] [ B<-t> [=I<-1>] ] [ B<-u=>I ] [ B<-V> ]

=head1 DESCRIPTION

B<flog> grew out of the desire to have Apache direct log output to multiple files depending on various parameters. While S> is pretty nifty, it is basically an if-then-else construct, and I wanted something more like a switch-case. Hence B<flog> was born.

There are probably other means of achieving similar results. For one method that comes to mind see L and L. But this happened to suit my needs.

=head1 OPTIONS

=over 8

=item c

Use the configuration F specified by the switch.

=item d

Open F in clobber mode for debug output. If a file is not specified, F is used instead.

=item g

B<flog> will run as the group specified.

=item o

B<flog> will I<not> pre-open all of C<%FILES> for you. You'll need to use printFlog instead of print. See L.

=item t

Test the configuration F specified by B<-c>. To see only errors without warnings, set B<-t> to -1.

=item u

B<flog> will run as the user specified.

=item V

Report miscellaneous information about B<flog>.

=back

=head1 CONFIGURATION

Configuration of B<flog> is meant to be fairly straightforward. The configuration file specified by B<-c> should contain at least a hash named FILES (C<%FILES>), a call to C, and a subroutine named flog, in that order. See the F directory that came with your distribution for examples.

=head2 Required

=over 8

=item C<%FILES> = (FILEHANDLE=>[">>path/to/file", buffered]);

The keys of C<%FILES> are filehandle names, which B<flog> will automatically open for you. The values of C<%FILES> are anonymous arrays. The first element is the file name to be associated with the key/filehandle. If the second element is present and true, the filehandle will be unbuffered. See L.

=item C

A call to this function must come after the declaration of C<%FILES>. Hopefully self-explanatory: B<flog> prepares itself by reading in C<%FILES>.


=item C<flog>

The subroutine called for each log entry. It contains the logic that determines where the entry should be written. It takes no arguments.

=back

=head2 Optional

Your subroutine flog may contain calls to the following functions.

=over 8

=item C<printFlog>(B, [ B<keepOpen> ])

C<printFlog> must be used in lieu of C<print> if flog is run with B<-o>, since B<flog> will not pre-open all of C<%FILES>. C<printFlog>'s first argument is the filehandle to print to (the same as for C<print>); it will be opened if it is not already open. Its second argument determines what happens to the filehandle after the log entry is printed. If the second argument is

=over 4

=item false or undefined

The filehandle is kept open indefinitely.

=item 1

The filehandle is closed after writing.

=item 2

A small pool of the filehandles that you use C<printFlog> with will be kept open. The size of this pool defaults to 16 and may be controlled by setting C<$MAXOPEN> in your config file. See L.

=back

NOTE: You may intermix the use of C, C and C within the same configuration.

=item C([B])

Typically called at the beginning of C<flog>, this will modify the log entry to make the IP address field fixed width, e.g.

S<209.54.25.212 - - [01/Aug/2000:04:06:26 +0000]> ...

S<127.0.0.1 - - [01/Aug/2000:04:18:09 +0000]> ...

S<209.54.25.212 - - [01/Aug/2000:04:06:26 +0000]> ...

S<127.0.0.1 - - [01/Aug/2000:04:18:09 +0000]> ...

If you are using CLF (Common Log Format), providing a true value as an option will make this more efficient.

=back

=head1 SIGNALS

B<flog> will respond to various signals. Listed below are those that have special meaning.

=over 8

=item HUP

Close all files (implicitly flushing buffers) and reread the configuration file. This will also reopen all of C<%FILES> unless B<-o> was used when B<flog> was originally started.

=item INT, PIPE, TERM

Close all files (implicitly flushing buffers) and exit.

=item USR1

Explicitly flush I buffers.

=item USR2, CONT, TSTP

Toggle the pausing of output (queueing input); useful for log rotation etc. It would be best to send a USR1 before doing this.


I

=back

=head1 SEE ALSO

B, B.

Documentation on Apache log file-pipe configuration is available from http://www.apache.org/docs/mod/mod_log_config.html#transferlog

=head1 CAVEATS

=head2 Miscellaneous

The B<-u> and B<-g> options only work if the current user is allowed to setuid and setgid, respectively. Typically only root is allowed this privilege.

Some log analysis programs may not be robust enough to handle output created through the use of C.

=head2 What is your quest?

Given all of the output options B<flog> provides, it can be somewhat difficult to decide what to use. Below is a table to help with this.

    Function    keepOpen  Buffered  Filehandles  Load
    print       -         N         --           High
    printFlog   0         N         -            High  #
    printFlog   1         N         +            Low   *
    printFlog   2         N         ++           Med.
    print       -         Y         --           High
    printFlog   0         Y         -            High  #
    printFlog   1         Y         +            Low   *
    printFlog   2         Y         ++           Med.

    * While these are effectively the same, the latter is preferred.
    # Emulate print when running with -o.

    keepOpen     the second argument you should call printFlog with
    Buffered     the second element in the array for %FILES is false
    Filehandles  the quantity of other logs you will be printing to
    Load         the frequency this log will be printed to

A more thorough explanation

In most stdio implementations, the type of output buffering and the size of the buffer varies according to the type of device. Disk files are block buffered, often with a buffer size of more than 2k. --- from L

4k seems pretty typical. To get an idea of what 4k of log looks like, examine the 4k-block file in the root directory of the flog package.

Pros

Buffering helps your system reduce I/O. Big deal, you say? Well, more often than not I/O is the main bottleneck in modern systems, especially on web servers, so you typically want to do as little writing to the disk as possible. (Disk caches take care of limiting the reading. [Yes, they help with writing too...]) Closing and reopening a file as C does is moderately expensive, since B forces a buffer flush.


Cons

B<flog>'s buffered logs are slightly more prone to data loss or corruption than an unbuffered log. If the system undergoes a disorderly shutdown, i.e. flog does not receive a recognized signal, it won't have a chance to flush its buffers. Since the buffer is block/byte-wise and not line-wise, the last write may not have ended with a newline. If this were to occur, your log may become corrupt. However, this could easily be fixed with a text editor. Each open buffered log will consume 0-4k (YMMV) for the actual data buffer, plus a filehandle.

NOTE: You may decrease your chances of data loss by periodically sending a SIGUSR1 to force a flush. This may be especially wise if some of the logs you gather are rather low traffic.

=head2 Common Sense

Having multiple handles to the same file is hazardous to your health. For example

    %FILES = ( A=>[">>a"], B=>[">>a"]);

OR

    %FILES = ( A=>[">>a"], B=>[">>a", 1]);

OR

    %FILES = ( A=>[">>a"]);
    printFlog(A, 1) ... ;
    printFlog(A, 2) ... ;

The last one is not so obvious, but because of the way FileCache does its thing, B<flog> is really using a filehandle with a name other than C when you use C.

Another silly thing to do is:

    %FILES = ( A=>[">a"]);
    printFlog(A, 1) ... ;

This will clobber your file I<every> time you try to print to it. But then again, maybe that's what you want.

=head1 TODO

Build in log rotation, or support external rotation by flushing then pausing output during rotation.

=head1 SEE ALSO

B, B

=head1 LICENSE

GPL, LGPL or Perl artistic license.

=head1 AUTHOR

Bug reports to: webmaster@pthbb.org

=cut