I poked around with Apache log files today, looking at the log formats and thinking of ways to examine the data. A log-parsing tool like AWStats, or maybe a slick one like Mint, would be handy, but I just wanted to look at the data without too much fuss. Also, it’s not my server. So I wrote something to do what I wanted with grep, sort, and uniq. A bit later, I realized that awk would give me the fields I wanted from the file, and would be easier and faster than using grep or egrep.
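For the record, here’s a minimal sketch of the kind of grep/sort/uniq pipeline I mean: counting requests per client IP. The filename and log lines are made up for illustration; in the Apache common log format the line starts with the client IP.

```shell
# Make a tiny sample log in Apache common log format (invented data).
cat > /tmp/grep_sample.log <<'EOF'
10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET /index.html HTTP/1.1" 200 512
10.0.0.2 - - [01/Jan/2024:00:00:02 +0000] "GET /about.html HTTP/1.1" 404 128
10.0.0.1 - - [01/Jan/2024:00:00:03 +0000] "GET /index.html HTTP/1.1" 200 512
EOF

# The grep|sort|uniq approach: pull the leading IP from each line,
# then count duplicates and sort by count, busiest IP first.
grep -o '^[0-9.]*' /tmp/grep_sample.log | sort | uniq -c | sort -rn
```

It works, but you’re matching the line with a regex just to get at one column, which is exactly where awk starts to look more attractive.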
So here are some useful links:
* Managing and parsing your Apache logs — good and simple, with some useful example scripts.
* Checking your system logs with awk — explains why awk is handier than grep for handling log files.
* I’m growing more and more fond of awk! Why didn’t I know this before? I did everything with Perl instead! *mad crush on awk*
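To show why awk is handier here, a small sketch (again with an invented filename and invented log lines): in the common log format, awk’s whitespace splitting makes `$1` the client IP, `$7` the request path, and `$9` the status code, so you can filter and print fields without writing a regex at all.

```shell
# Same tiny sample log in Apache common log format (invented data).
cat > /tmp/awk_sample.log <<'EOF'
10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET /index.html HTTP/1.1" 200 512
10.0.0.2 - - [01/Jan/2024:00:00:02 +0000] "GET /about.html HTTP/1.1" 404 128
10.0.0.1 - - [01/Jan/2024:00:00:03 +0000] "GET /index.html HTTP/1.1" 200 512
EOF

# awk addresses whole fields directly: show the IP and path of every 404.
awk '$9 == 404 {print $1, $7}' /tmp/awk_sample.log
```

One expression, no piping through grep first — the condition and the field selection live in the same place.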
It was fun doing this and made me want to install Apache and get it running so that I would know its ins and outs a bit better. But that will just have to wait until I finish digging into these log files… It is amazing what you can tell from them and what you can deduce, given a huge web site to sink your teeth into. Right now I’m thinking about bots and spiders, remembering the fun things I used to do with spam hunting, filtering, and watching the habits of users going through a big proxy server.