Apache log secrets

24 02 2008

Previously I have shown my preferred methods of rotating log files here and here.  But Apache has a couple hidden gotchas and enhancements that can be made relating to logs.

Firstly the logs are not in true chronological order, the time stamps are at the start of the request, whilst the log data is actually recorded when the request finishes.  This means that long requests could be logged after short requests and the time stamps will be out of sequence.  This can mess with some log processors such as webalizer, which will drop out of sequence requests by default.  I have created a small utility called logsort which sorts out of order log files using minimal resources.

Another hiccup with Apache logging is that since 2.0 it logs how much the data it would send regardless of whether or not the transaction completes or not.  This can be problematic if you trying to figure out how much bandwidth a client is using based on the log files.  Luckily there is a work around in the name of mod_logio.  Simply enable mod_logio (in most Linux distros this means un-commenting it in httpd.conf) then look for your log line which will be something like:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

and change it to:

LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined

This will also include headers transferred.  You could also do this to record the amount of data recieved too (for example if there is a lot of binary post data):

LogFormat "%h %l %u %t \"%r\" %>s %I/%O \"%{Referer}i\" \"%{User-Agent}i\"" combined

This will produce:

[24/Feb/2008:14:19:07 +0000] "GET / HTTP/1.1" 200 104/7918 "" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.11) Gecko/20061201 Firefox/2.0.0.11 (Ubuntu-feisty)"

As a bonus you can also log the amount of time a request takes from start to completion (so processing time + transfer time) by doing:

LogFormat "%h %l %u %t \"%r\" %>s %I/%O \"%{Referer}i\" \"%{User-Agent}i\" %T/%D" combined

Which will give you:

[24/Feb/2008:14:19:07 +0000] "GET / HTTP/1.1" 200 104/7918 "" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.11) Gecko/20061201 Firefox/2.0.0.11 (Ubuntu-feisty)" 0/95491


Bookmark Apache log secrets  at del.icio.us Digg Apache log secrets

Apache log rotation - part 2

10 02 2008

In part 1 of my article on Apache log rotation I talked about the advantages of using cronolog compared to logrotate.  The disadvantage to this is the cronolog has no archive compression or expiry built in, so I created a bash script to be run on as a daily cron job to do this for you:

#!/bin/bash
# Number of days before gzipping
ZTIME=1
# Number of days before deletion
DTIME=28
gzip `find /var/log/httpd/ -not -name "*.gz" -mtime +$ZTIME -mindepth 1 -type f -print`
find /var/log/httpd/ -mtime +$DTIME -mindepth 1 -type f -print | xargs rm -f {}

This script simply compresses any log files over 1 day old and deletes anything over 28 days old. Simply change the ZTIME and DTIME parameters to modify the compression and deletion time scales. This of course could be modified to move the older log files to another partition instead of delete files or something similar.


Bookmark Apache log rotation - part 2  at del.icio.us Digg Apache log rotation - part 2

Apache log rotation

10 02 2008

If you are using a bog standard Linux setup you are probably using Logrotate to do a daily rotate of your logs.  This works really well although I have come across a couple of gotchas before such as Apache writing to the .1 file instead of the new log file (I'm guessing the reload didn't work).

I now personally use a piped log via. cronolog which fixes the above problem and has the added advantage of being able to rotate logs whenever you like without having to do anything to Apache.

One setup I have worked on generated over 2GB of access log files per day per server using Apache 2.0.x, this meant that with a daily log rotation Apache would die soon after it hit 2GB.  I suggested the use of cronolog, and with this we did:

CustomLog "|/usr/sbin/cronolog -e /var/log/httpd/access_log_%Y-%m-%d-%H.log" combined
ErrorLog "|/usr/sbin/cronolog -e /var/log/httpd/error_log_%Y-%m-%d-%H.log"

So every hour this will automagically create a log file in the form access_log_2008_02_10_08.log for all accesses today 8AM - 9AM. This also makes it easier to debug using the error logs as it makes them a lot smaller so you can find an access at a certain time stamp quickly

You can even have cronolog split into separate dated folders:

ErrorLog "|/usr/sbin/cronolog -e /var/log/httpd/%Y/%m/%d/error_log_%H.log"

Will produce logs in folders such as /var/log/httpd/2008/02/08/error_log_08.log for an 8AM - 9AM log.

I shall follow this up later with a simple way of expiring old logs.


Bookmark Apache log rotation  at del.icio.us Digg Apache log rotation

Broken multi dimension PHP server variables

29 11 2007

At The First Post we have several development environments all with different configurations. Now with this setup it is really easy to overwrite one configuration file by accident (it has never happened but you get the idea). Now with Apache you can use the SetEnv command to set variables (more like constants) which can be picked up by the $_SERVER superglobal.

I thought it would be an interesting experiment to find out what would happen if you tried to create an array using SetEnv, for example:

SetEnv MY_ARRAY[] 'first element'
SetEnv MY_ARRAY[] 'second element'

I tested this on a VM using Apache 2.0.52 and PHP 5.1.6, the results were really surprising, at first it worked! Then it didn't… WTF happened there then???

It turns out that the first time an Apache processes was used it worked, the second time the same process was used the first element was showing only as a variable rather than an array. I haven't figured out if this is a bug in Apache or if the fact you can do it at all is a bug in Apache. Either way I just stuck to standard SetEnv for the conclusion for the tests.


Bookmark Broken multi dimension PHP server variables  at del.icio.us Digg Broken multi dimension PHP server variables