=pod

=head1 NAME

B - real time log splitter and viewer

=head1 SYNOPSIS

B [ B<-c=>F ] [ B<-d> [=F] ] [ B<-g=>I ] [ B<-o> ] [ B<-t> [=I<-1>] ] [ B<-u=>I ] [ B<-V> ]

=head1 DESCRIPTION

B grew out of the desire to have Apache direct log output to multiple
files dependent upon various parameters. While S> is pretty nifty
it's basically an if-then-else construct, and I desired more of a
switch-case one. Hence B was born.

There are probably other means of achieving similar results.
For one method that comes to mind see L and L.
But this happened to suit my needs.

=head1 OPTIONS

=over 8

=item c

Use the configuration F specified by the switch.

=item d

Open F in clobber mode for debug output.
If a file is not specified F is used instead.

=item g

B will run as the group specified.

=item o

B will I pre-open all of C<%FILES> for you.
You'll need to use printFlog instead of print. See L.

=item t

Test the configuration F specified by B.
To see only errors without warnings, set B to -1.

=item u

B will run as the user specified.

=item V

Report miscellaneous information about B.

=back

=head1 CONFIGURATION

Configuration of B is meant to be fairly straight forward.
The configuration file specified by B should contain at least
a hash named FILES (C<%FILES>), a call to C,
and a subroutine named flog, in that order.
See the F directory that came with your distribution for examples.

=head2 Required

=over 8

=item C<%FILES> = (FILEHANDLE=>[">>path/to/file", buffered]);

The keys for C<%FILES> are file descriptor names which B will
automatically open for you. The values for C<%FILES> are anonymous arrays.
The first index is the file name to be associated with the key/file handle.
If the second index is present and true, the file handle will be unbuffered.
See L.

=item C

A call to this function after declaring C<%FILES>.
Hopefully self-explanatory. B prepares itself by reading in C<%FILES>.

=item C_{The subroutine called for each log entry which contains the logic to
determine where the entry should be output to. It takes no arguments.

=back
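
The layout of C<%FILES> described above can be sketched as follows. This is only an illustration: the handle names and paths are made up, and C<describe_files> is a hypothetical helper, not something the distribution provides. A real configuration file would also need the initialisation call and the flog subroutine.

```perl
use strict;
use warnings;

# Hypothetical handle names and paths, for illustration only.
our %FILES = (
    ACCESS => [ ">>logs/access.log" ],     # no second element: buffered
    ERRORS => [ ">>logs/error.log", 1 ],   # true second element: unbuffered
);

# Hypothetical helper: summarise how each entry would be interpreted.
sub describe_files {
    my (%files) = @_;
    my %summary;
    for my $name (sort keys %files) {
        my ($target, $unbuffered) = @{ $files{$name} };
        $summary{$name} = sprintf "%s -> %s (%s)",
            $name, $target, $unbuffered ? "unbuffered" : "buffered";
    }
    return %summary;
}

my %summary = describe_files(%FILES);
print "$summary{$_}\n" for sort keys %summary;
```

The mode prefix (C<E<gt>E<gt>> here) travels with the file name in the first element, so append versus clobber is chosen per file.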

=head2 Optional

Your subroutine flog may contain calls to the following functions.

=over 8

=item C(B, [ B ])

C must be used in lieu of C if flog is run with B,
since B will not pre-open all of C<%FILES>.
C's first argument is the filehandle to print to
(the same as C) and will be opened if it is not already.
Its second argument determines what happens to the filehandle
after the log entry is printed.

If the second argument is

=over 4

=item false or undefined

The filehandle is kept open indefinitely.

=item 1

The filehandle is closed after writing.

=item 2

A small pool of the filehandles that you use C
with will be kept open. The size of this pool defaults to 16 and
may be controlled by setting C<$MAXOPEN> in your config file. See L.

=back

NOTE: You may intermix the use of C,
C and C within the same
configuration.

=item C([B])

Typically called at the beginning of C_{this will modify the log
entry to make the IP address field fixed width. e.g.

S<209.54.25.212 - - [01/Aug/2000:04:06:26 +0000]> ...
S<127.0.0.1 - - [01/Aug/2000:04:18:09 +0000]> ...

S<209.54.25.212 - - [01/Aug/2000:04:06:26 +0000]> ...
S<127.0.0.1 - - [01/Aug/2000:04:18:09 +0000]> ...

If you are using CLF (Common Log Format), providing a true value
as an option will make this more efficient.

=back
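
To give a feel for what a fixed width IP address field means, here is a minimal sketch. It assumes left-justified space padding to 15 characters, which may not be what the function described above actually does, and C<pad_ip_field> is a hypothetical name used only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: pad the leading address field to 15 characters
# (the width of the longest dotted-quad IPv4 address) so that the
# remaining fields line up in every entry.
sub pad_ip_field {
    my ($entry) = @_;
    # Split off the first space-delimited field (the client address).
    my ($ip, $rest) = split / /, $entry, 2;
    return sprintf "%-15s %s", $ip, defined $rest ? $rest : '';
}

print pad_ip_field('209.54.25.212 - - [01/Aug/2000:04:06:26 +0000]'), "\n";
print pad_ip_field('127.0.0.1 - - [01/Aug/2000:04:18:09 +0000]'), "\n";
```

With the padding in place, the second field starts at the same column for both entries.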

=head1 SIGNALS

B will respond to various signals.
Listed below are those that have special meaning.

=over 8

=item HUP

Close all files (implicitly flush buffers) and reread the configuration file.
Will also reopen all of C<%FILES> unless B was used when B
was originally started.

=item INT, PIPE, TERM

Close all files (implicitly flush buffers) and exit.

=item USR1

Explicitly flush I buffers.

=item USR2, CONT, TSTP

Toggle the pausing of output; input is queued while output is paused.
This is useful for log rotation, etc.
It would be best to send a USR1 before doing this.
I

=back
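
The flush-then-pause sequence suggested for USR1 and USR2 could be driven from a small script like the one below. It is a sketch under assumptions: C<rotate_while_paused> is a hypothetical helper, the caller is expected to know the process ID, and whether the files must also be reopened afterwards (for example with HUP) depends on how they are being kept open.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the distribution.
#   $pid    - process ID of the running log splitter
#   $rotate - code reference that moves or compresses the log files
sub rotate_while_paused {
    my ($pid, $rotate) = @_;

    kill 'USR1', $pid or die "cannot send USR1 to $pid: $!\n";  # flush buffers
    kill 'USR2', $pid or die "cannot send USR2 to $pid: $!\n";  # pause output

    # Run the rotation, remembering any failure so output is resumed anyway.
    my $ok  = eval { $rotate->(); 1 };
    my $err = $@;

    kill 'USR2', $pid or die "cannot send USR2 to $pid: $!\n";  # resume output
    die $err unless $ok;
    return 1;
}

# Example use, assuming the PID is given on the command line and the
# log file name is access.log (both are assumptions).
if (@ARGV) {
    rotate_while_paused($ARGV[0], sub {
        rename 'access.log', 'access.log.0' or die "rename failed: $!\n";
    });
}
```

Signals are delivered asynchronously, so a careful script may want a short delay after the first two signals before touching the files.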

=head1 SEE ALSO

B, B.

Documentation on Apache log file-pipe configuration is available from
http://www.apache.org/docs/mod/mod_log_config.html#transferlog

=head1 CAVEATS

=head2 Miscellaneous

The B and B options only work if the current user is allowed to setuid
and setgid, respectively. Typically only root is allowed this privilege.

Some log analysis programs may not be robust enough to
handle output created through the use of C.

=head2 What is your quest?

Given all of the output options B provides, it can be somewhat
difficult to decide what to use. Below is a table to help with this.

    Function   keepOpen  Buffered  Filehandles  Load

    print      -         N         --           High
    printFlog  0         N         -            High  #
    printFlog  1         N         +            Low   *
    printFlog  2         N         ++           Med.

    print      -         Y         --           High
    printFlog  0         Y         -            High  #
    printFlog  1         Y         +            Low   *
    printFlog  2         Y         ++           Med.

    *  While these are effectively the same, the latter is preferred.
    #  Emulate print when running with B.

    keepOpen     the second argument you should call printFlog with
    Buffered     the second element in the array for %FILES is false
    Filehandles  the quantity of other logs you will be printing to
    Load         the frequency this log will be printed to

A more thorough explanation

In most stdio implementations, the type of output buffering and the
size of the buffer varies according to the type of device. Disk files
are block buffered, often with a buffer size of more than 2k.

--- from L

4k seems pretty typical. To get an idea of what 4k of log looks like,
examine the 4k-block file in the root directory of the
flog package.

Pros

Buffering helps your system reduce I/O. Big deal you say? Well,
more often than not I/O is the main bottleneck in modern systems,
especially on web servers. So you typically want to do as little
writing to the disk as possible. (Disk caches take care of limiting
the reading [Yes, they help with writing too...])

Closing and reopening a file as C
does is moderately expensive, since B forces a buffer flush.

Cons

B's buffered logs are slightly more prone to
data loss or corruption than an unbuffered log.
If the system undergoes a disorderly shutdown,
i.e. flog does not receive a recognized signal,
it won't have a chance to flush its buffers.
Since the buffer is block/byte-wise and not line-wise,
the last write may not have ended with a newline.
If this were to occur, your log may become corrupt.
However, this could easily be fixed with a text editor.

Each open buffered log will consume 0-4k (YMMV) for the actual data buffer,
and a filehandle.

NOTE

You may decrease your chances of data loss by periodically sending a SIGUSR1
to force a flush. This may be especially wise if some of the logs you gather
are rather low traffic.
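
The trade-off described above can be observed directly. The following sketch, which is independent of flog itself, writes one short entry to a temporary file and reports how much has reached the disk before any flush. C<size_after_write> is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Handle;

# Hypothetical helper: return the on-disk size of a log right after
# one short write, with or without autoflush on the handle.
sub size_after_write {
    my ($autoflush) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    $fh->autoflush(1) if $autoflush;
    print {$fh} "127.0.0.1 - - [01/Aug/2000:04:18:09 +0000] ...\n";
    my $size = -s $path;    # what has actually been written to the file
    close $fh;
    return $size || 0;
}

printf "buffered:   %d bytes on disk before flush\n", size_after_write(0);
printf "unbuffered: %d bytes on disk before flush\n", size_after_write(1);
```

In the buffered case the entry is still sitting in the process's buffer, which is exactly the data that a disorderly shutdown would lose.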

=head2 Common Sense

Having multiple handles to the same file is hazardous to your health.
For example

    %FILES = (
        A=>[">>a"],
        B=>[">>a"]);

OR

    %FILES = (
        A=>[">>a"],
        B=>[">>a", 1]);

OR

    %FILES = (
        A=>[">>a"]);

    printFlog(A, 1) ... ;
    printFlog(A, 2) ... ;

The last one is not so obvious, but because of the way FileCache does
its thing, B is really using a filehandle with a name other than C
when you use C.

Another silly thing to do is:

    %FILES = (
        A=>[">a"]);

    printFlog(A, 1) ... ;

This will clobber your file I<every> time you try to print to it.
But then again, maybe that's what you want.
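
The effect of combining clobber mode with close-after-write can be reproduced with plain Perl file handling. This sketch does not use flog; C<write_and_close> and C<lines_after_two_writes> are hypothetical helpers that mimic opening a file, writing one entry, and closing it again.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: open, write one entry, and close again, as
# happens when the filehandle is closed after every write.
sub write_and_close {
    my ($mode, $path, $entry) = @_;
    open my $fh, $mode, $path or die "cannot open $path: $!\n";
    print {$fh} "$entry\n";
    close $fh or die "cannot close $path: $!\n";
    return;
}

# Hypothetical helper: write two entries and count what survives.
sub lines_after_two_writes {
    my ($mode) = @_;
    my $dir  = tempdir(CLEANUP => 1);
    my $path = "$dir/a";
    write_and_close($mode, $path, "first entry");
    write_and_close($mode, $path, "second entry");
    open my $in, '<', $path or die "cannot read $path: $!\n";
    my @lines = <$in>;
    close $in;
    return scalar @lines;
}

printf "'>'  leaves %d line(s)\n", lines_after_two_writes('>');
printf "'>>' leaves %d line(s)\n", lines_after_two_writes('>>');
```

With C<E<gt>> each reopen truncates the file, so only the most recent entry remains; with C<E<gt>E<gt>> both entries are kept.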

=head1 TODO

Build in log rotation, or support external rotation by
flushing then pausing output during rotation.

=head1 LICENSE

GPL, LGPL or Perl artistic license.

=head1 AUTHOR

Bug reports to: webmaster@pthbb.org

=cut}}