Understanding Apache and Squid Logs

By Kush Wadhwa, NII Consulting

Welcome to the world of log analysis. Log analysis plays a crucial role in intrusion detection. If the compromised system runs Linux, one of the first steps the investigator will perform is the analysis of its log files.

Linux can store the logs of many different services, such as Apache logs, Squid logs, syslogs, and cron logs. These logs help users resolve problems and audit user actions. In this article we will talk about Apache and Squid logs, since they are closely related to each other, and we will see how to read both kinds of log file.

In Apache, there are two important log files from the incident response perspective. They are:

(1) error log

(2) access log

The error log contains the messages Apache writes when it encounters errors during operation, which makes it very useful for troubleshooting problems on the server side. The location of the log files is chosen by the administrator and is set in the Apache configuration file, for example /etc/httpd/conf/httpd.conf. Let's look at the error log directive from httpd.conf.

ErrorLog /Path/to/error/log/file

The directives used in httpd.conf are ErrorLog (for the error log) and CustomLog (for the access log).
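To find out where a particular server writes its logs, we can scan the configuration file for these two directives. The following is only a minimal sketch: the configuration fragment, its paths, and the "combined" format nickname are assumptions made for illustration, and the helper name find_log_directives is hypothetical.

```python
import re

# Hypothetical httpd.conf fragment; the paths and the "combined"
# nickname are assumptions for illustration only.
SAMPLE_CONF = """\
# Logging section (hypothetical)
ErrorLog /etc/httpd/logs/error.log
CustomLog /etc/httpd/logs/access.log combined
"""

# Directive name, whitespace, then the rest of the line as its arguments.
DIRECTIVE_RE = re.compile(r"^\s*(ErrorLog|CustomLog)\s+(.+?)\s*$", re.IGNORECASE)


def find_log_directives(conf_text):
    """Return (directive, arguments) pairs for the logging directives found."""
    found = []
    for line in conf_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        match = DIRECTIVE_RE.match(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


for name, args in find_log_directives(SAMPLE_CONF):
    print(name, "->", args)
```

On a real system the text would come from the actual configuration file rather than from SAMPLE_CONF.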

Now let's assume the error log file is /etc/httpd/logs/error.log and the access log file is /etc/httpd/logs/access.log. The contents of error.log will look like this:

[Sun Feb 19 13:32:33 2006] [notice] suEXEC mechanism enabled (wrapper:/usr/sbin/suexec)

[Sun Feb 19 13:32:34 2006] [notice] Digest: generating secret for digest authentication ...

[Sun Feb 19 13:32:34 2006] [notice] Digest: done

[Sun Feb 19 13:32:34 2006] [notice] LDAP: Built with OpenLDAP LDAP SDK

[Sun Feb 19 13:32:34 2006] [notice] LDAP: SSL support unavailable

[Sun Feb 19 13:32:34 2006] [notice] mod_python: Creating 4 session mutexes based on 256 max processes and 0 max threads.

[Sun Feb 19 13:32:34 2006] [notice] Apache/2.0.52 (Red Hat) configured -- resuming normal operations

The first field gives the date and time at which the message was logged. The second field gives the severity of the message being reported. The LogLevel directive controls which messages are sent to the error log by setting the minimum severity that is recorded. The following levels are available, in order of decreasing significance:

emerg – Emergencies, system is unusable. “Child cannot open lock file. Exiting”

alert – Action must be taken immediately. “getpwuid: couldn’t determine user name from uid”

crit – Critical conditions. “socket: Failed to get a socket, exiting child”

error – Error conditions. “Premature end of script headers”

warn – Warning conditions. “child process 1234 did not exit, sending another SIGHUP”

notice – Normal but significant condition. “httpd: caught SIGBUS, attempting to dump core in …”

info – Informational. “Server seems busy, (you may need to increase StartServers, or Min/MaxSpareServers)…”

debug – Debug-level messages. “Opening config file …”
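The ordering of these levels can be expressed in a few lines of code. This is a simplified model of the threshold idea only, not Apache's own implementation, and it ignores any special cases Apache applies; the function name is_logged is hypothetical.

```python
# Severity names in decreasing order of significance, as listed above.
LEVELS = ["emerg", "alert", "crit", "error", "warn", "notice", "info", "debug"]


def is_logged(message_level, configured_level):
    """Return True if a message is at least as severe as the configured level."""
    return LEVELS.index(message_level) <= LEVELS.index(configured_level)


print(is_logged("crit", "warn"))   # crit is more severe than warn
print(is_logged("debug", "warn"))  # debug is less severe than warn
```

An investigator can use the same ordering to filter an existing error log down to the most severe entries.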

The third field contains the message itself: either the error that Apache encountered or, as in the sample above, the steps taken while the Apache service was starting.
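The three fields can be separated with a short regular expression. This is a sketch that assumes every line has exactly the bracketed timestamp, the bracketed level, and then the message, as in the sample above; any additional bracketed field in a real log would simply end up in the message part. The name parse_error_line is hypothetical.

```python
import re
from datetime import datetime

# [timestamp] [level] message
ERROR_LINE_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] \[(?P<level>[^\]]+)\] (?P<message>.*)$"
)


def parse_error_line(line):
    """Split an error log line into (datetime, level, message), or return None."""
    match = ERROR_LINE_RE.match(line.strip())
    if match is None:
        return None
    # Day and month names are parsed with the default (English) locale.
    when = datetime.strptime(match.group("timestamp"), "%a %b %d %H:%M:%S %Y")
    return when, match.group("level"), match.group("message")


sample = "[Sun Feb 19 13:32:34 2006] [notice] Digest: done"
print(parse_error_line(sample))
```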

Now we will talk about the second important file, the access log. The access log records the requests made to the web server. Its contents look like this:

192.168.0.231 - - [19/Feb/2006:13:53:58 +0530] "GET / HTTP/1.1" 304 - "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv: 1.7.5) Gecko/20041215 Firefox/1.0 Red Hat/1.0-12.EL4"

192.168.0.231 - - [19/Feb/2006:13:53:58 +0530] "GET /favicon.ico HTTP/1.1" 404 289 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv: 1.7.5) Gecko/20041215 Firefox/1.0 Red Hat/1.0-12.EL4"

192.168.0.231 - - [19/Feb/2006:13:54:35 +0530] "GET /favicon.ico HTTP/1.1" 404 290

192.168.0.231 - - [19/Feb/2006:13:54:35 +0530] "GET / HTTP/1.1" 200 12

To understand access logs better, let's look at this example.

192.168.0.231 - - [19/Feb/2006:13:53:58 +0530] "GET /favicon.ico HTTP/1.1" 404 289 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv: 1.7.5) Gecko/20041215 Firefox/1.0 Red Hat/1.0-12.EL4"

  1. 192.168.0.231 – Hostname or IP address of the client.
  2. "-" – RFC 931 (ident) information. A hyphen means it is not available.
  3. "-" – Username entered by the client for authentication. Since the page does not require authentication, the field is empty (shown as a hyphen).
  4. [19/Feb/2006:13:53:58 +0530] – Date and time when the page was requested, with the time zone offset.
  5. GET /favicon.ico – Request method and the resource requested. Most requests use GET.
  6. HTTP/1.1 – Protocol.
  7. 404 – Status code. 404 means the requested object was not found.
  8. 289 – Bytes transferred.
  9. "-" – Referrer. A hyphen means the client did not send one.
  10. "Mozilla/5.0 (X11; U; Linux i686; ..." – The browser and operating system used to access the web page.
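The field breakdown above can be turned into a regular expression. The sketch below assumes the log uses plain ASCII quotes and hyphens, as Apache writes them, and treats the referrer and browser fields as optional because some of the sample entries do not have them. The names ACCESS_RE and parse_access_line are hypothetical.

```python
import re

ACCESS_RE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)


def parse_access_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = ACCESS_RE.match(line.strip())
    return match.groupdict() if match else None


sample = (
    '192.168.0.231 - - [19/Feb/2006:13:53:58 +0530] '
    '"GET /favicon.ico HTTP/1.1" 404 289 "-" '
    '"Mozilla/5.0 (X11; U; Linux i686; en-US; rv: 1.7.5) '
    'Gecko/20041215 Firefox/1.0 Red Hat/1.0-12.EL4"'
)
entry = parse_access_line(sample)
print(entry["host"], entry["method"], entry["path"], entry["status"], entry["bytes"])

# The shorter form, without referrer and browser, is accepted as well.
short = parse_access_line(
    '192.168.0.231 - - [19/Feb/2006:13:54:35 +0530] "GET / HTTP/1.1" 200 12'
)
print(short["status"], short["agent"])
```

All values come back as strings; fields missing from the shorter form come back as None.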

Squid Logs

Squid is a proxy server which can be configured to allow or restrict access to sites for a particular domain. The Squid logs give valuable information about workloads and performance. Besides access information, Squid also logs system configuration errors and resource consumption (e.g., memory, disk space). The Squid log files are normally found in /var/log/squid. The access logs available are:

access.log (current)

access.log.0 (last week – uncompressed)

access.log.1.gz (the week before – compressed)
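Because older logs may be compressed, a script that reads the whole history has to handle both plain and gzip files. The following is a minimal sketch; the helper names open_log and read_all_lines are hypothetical, and temporary files with made-up contents stand in for the real files in /var/log/squid so that the example can run anywhere.

```python
import gzip
import os
import tempfile


def open_log(path):
    """Open a log file as text, transparently handling .gz rotations."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def read_all_lines(paths):
    """Read the lines of every file in the given order into one list."""
    lines = []
    for path in paths:
        with open_log(path) as handle:
            lines.extend(line.rstrip("\n") for line in handle)
    return lines


# Demonstration with temporary stand-in files (contents are made up).
with tempfile.TemporaryDirectory() as log_dir:
    plain = os.path.join(log_dir, "access.log")
    packed = os.path.join(log_dir, "access.log.1.gz")
    with open(plain, "w", encoding="utf-8") as handle:
        handle.write("current entry\n")
    with gzip.open(packed, "wt", encoding="utf-8") as handle:
        handle.write("older entry\n")
    # Oldest file first, so the lines come out in chronological order.
    demo_lines = read_all_lines([packed, plain])

print(demo_lines)
```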

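Squid's native access.log layout differs from Apache's, but Squid can also be configured to write its access log in an httpd-style format. That httpd-style format, which is the one discussed here, is

remotehost rfc931 authuser [date] "request" status bytes "referrer" "user_agent"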

remotehost – Remote hostname or IP address.

rfc931 – The username associated with the client connection, determined from an Ident (RFC 931) server running on the client host. By default Ident lookups are not made, but they may be enabled with the ident_lookup option.

authuser – The username with which the user has authenticated. This is only available when access authorization is in use (password-protected WWW pages).

[date] – Date and time of the request.

"request" – The request line exactly as it came from the client, i.e. the method used to retrieve the file (GET in most cases), the file name, and the protocol.

status – The HTTP status code returned to the client. It shows whether or not the file was successfully retrieved and, if not, what error was returned.

bytes – The content length of the document transferred.

"referrer" – The previous URL visited by the client, i.e. the page that led to this request.

"user_agent" – Information about the browser used to make the request.
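When log entries are processed by a script, the date and bytes fields are usually converted from text into proper values. The sketch below assumes the date layout seen in the Apache example earlier (day/month/year:time followed by a time zone offset) and assumes that a hyphen in the bytes field should count as zero; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_log_time(text):
    """Convert the date field (without its brackets) to an aware datetime."""
    return datetime.strptime(text, "%d/%b/%Y:%H:%M:%S %z")


def parse_bytes(text):
    """Convert the bytes field to an int, treating '-' as zero bytes."""
    return 0 if text == "-" else int(text)


when = parse_log_time("19/Feb/2006:13:53:58 +0530")
print(when.isoformat())
# Converting to UTC makes entries from different time zones comparable.
print(when.astimezone(timezone.utc).isoformat())
print(parse_bytes("289"), parse_bytes("-"))
```

Normalising timestamps to UTC is helpful when logs from several machines have to be lined up during an investigation.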

An example of such a log file is shown below.

  1. pglab39.pg.cc.md.us - - [13/Mar/1995:09:09:03 -0800] "GET /webcom/gstbk.html HTTP/1.0" 200 35727 "http://www.webcom.com/cgi-bin/cust_sites" "Mozilla/3.0 (Win95; I)"
  2. pglab39.pg.cc.md.us - - [13/Mar/1995:09:17:18 -0800] "GET /webcom/gstbk.html HTTP/1.0" 304 0 "/gstbk.html" "Mozilla/3.0 (Win95; I)"
  3. xroads.wr.usgs.gov - - [13/Mar/1995:09:17:53 -0800] "GET /webcom/order.html HTTP/1.0" 200 2344 "" "Mozilla/3.0 (Macintosh; I; PPC)"
  4. 148.241.22.29 - - [13/Mar/1995:09:21:09 -0800] "GET /webcom/order.html HTTP/1.0" 200 2344 "http://www.lycos.com/" "Mozilla/3.0Gold (Win95; U)"
  5. nameless.house.gov - - [13/Mar/1995:10:00:30 -0800] "GET /webcom/graphics/hp.gif HTTP/1.0" 200 2690 "/order.html" "Mozilla/3.0 (Macintosh; I; PPC)"
  6. splitter.amnh.org - - [13/Mar/1995:10:01:10 -0800] "GET /webcom/order.html HTTP/1.0" 200 2344 "http://www.webcom.com/" "Mozilla/3.0 (Win95; I)"

Now let's look at the first entry in the log file:
pglab39.pg.cc.md.us - - [13/Mar/1995:09:09:03 -0800] "GET /webcom/gstbk.html HTTP/1.0" 200 35727 "http://www.webcom.com/cgi-bin/cust_sites" "Mozilla/3.0 (Win95; I)"

In this entry:

  1. pglab39.pg.cc.md.us – The remote hostname.
  2. "-" – RFC 931 information (not available here).
  3. "-" – The authenticated username. Since the page is not password protected, this field is empty (shown as a hyphen).
  4. [13/Mar/1995:09:09:03 -0800] – Date and time when the page was accessed.
  5. "GET /webcom/gstbk.html HTTP/1.0" – The request line: the method used to retrieve the page (GET), the page requested, and the protocol.
  6. 200 – The HTTP status code returned to the client; 200 means the request succeeded. Many other status codes can be returned.
  7. 35727 – Number of bytes transferred.
  8. "http://www.webcom.com/cgi-bin/cust_sites" – The referrer, i.e. the page the client had visited before making this request.
  9. "Mozilla/3.0 (Win95; I)" – Browser used to access the web page.
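Once the fields are understood, simple summaries are easy to produce. As a rough sketch, the code below counts status codes and adds up the bytes transferred per client for the six sample entries above, assuming ASCII quotes and hyphens in the log. Only the leading fields are parsed; the variable names are hypothetical.

```python
import re
from collections import Counter

# The six sample entries shown above.
LOG_LINES = [
    'pglab39.pg.cc.md.us - - [13/Mar/1995:09:09:03 -0800] "GET /webcom/gstbk.html HTTP/1.0" 200 35727 "http://www.webcom.com/cgi-bin/cust_sites" "Mozilla/3.0 (Win95; I)"',
    'pglab39.pg.cc.md.us - - [13/Mar/1995:09:17:18 -0800] "GET /webcom/gstbk.html HTTP/1.0" 304 0 "/gstbk.html" "Mozilla/3.0 (Win95; I)"',
    'xroads.wr.usgs.gov - - [13/Mar/1995:09:17:53 -0800] "GET /webcom/order.html HTTP/1.0" 200 2344 "" "Mozilla/3.0 (Macintosh; I; PPC)"',
    '148.241.22.29 - - [13/Mar/1995:09:21:09 -0800] "GET /webcom/order.html HTTP/1.0" 200 2344 "http://www.lycos.com/" "Mozilla/3.0Gold (Win95; U)"',
    'nameless.house.gov - - [13/Mar/1995:10:00:30 -0800] "GET /webcom/graphics/hp.gif HTTP/1.0" 200 2690 "/order.html" "Mozilla/3.0 (Macintosh; I; PPC)"',
    'splitter.amnh.org - - [13/Mar/1995:10:01:10 -0800] "GET /webcom/order.html HTTP/1.0" 200 2344 "http://www.webcom.com/" "Mozilla/3.0 (Win95; I)"',
]

# host, two ignored fields, [date], "request", status, bytes
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) (\d+|-)')

status_counts = Counter()
bytes_by_host = Counter()
for line in LOG_LINES:
    match = LINE_RE.match(line)
    if match is None:
        continue  # skip lines that do not follow the format
    host, status, size = match.groups()
    status_counts[status] += 1
    bytes_by_host[host] += 0 if size == "-" else int(size)

print(status_counts)
print(bytes_by_host.most_common(1))
```

The same counters can point an investigator at unusual clients, for example a host that transfers far more data than the others or produces many error status codes.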

This was a very simple and brief discussion on how to interpret the Apache and Squid logs. This is just the first step in the journey towards mastering the art of log analysis.
Happy learning!

7 comments

A nice read! There are a few typographical errors. The first snippet of logs introduced is from error.log, not access.log.

Hi Ayaz

Thanks Ayaz for your feedback. I have noticed the errors and made the appropriate changes. Looking forward to continued interaction with you.

What version of squid does this apply to?

Hi Eddie

Log formats have been common to all versions of Squid so far. I have given the example of Squid on RHEL4. Looking forward to continued interaction with you.

I really benefited from this document
thanks.

I am using the Ubuntu Linux O/S. If there is a way you can help me to interpret my log files, I will be very grateful.
thanks.

Hello Sulaiman,

As far as I know, the log format of Apache and Squid is the same for every flavour of Linux. But if you still think that the Apache and Squid log formats are different in Ubuntu, then please e-mail me those logs and I will explain them to you.

Regards
