httpdstats - generate a statistical summary of Apache log files
httpdstats [-c configfile] [-g grouplist] [-i ignorelist] [-m address] [-s subject] [-x ignorelist] [-bytes] [-count] [-html] [-agent] [-host] [-hostuser] [-protocol] [-referrer] [-request] [-status] [-url] file ...
httpdstats generates a statistical summary of the contents of an Apache access log and either prints the summary or mails it to a given address. The contents of the access log are summarized by various criteria, in terms of both the number of requests and the number of bytes transferred.
At least one log file must be specified. If several files are given, their contents are aggregated.
To avoid clutter, URLs with certain suffixes, such as .gif or .png, can be grouped together into a single aggregate entry, or ignored altogether.
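The grouping and ignoring of suffixes described above can be illustrated with a short Python sketch. This is not the httpdstats implementation: the function names are hypothetical, the "all suffix" label format is inferred from the option descriptions, and the choices to ignore the query string, to compare case-insensitively and to test the ignore list before the group list are assumptions.

```python
from collections import Counter
from urllib.parse import urlsplit

# Default group list as documented on this page; the ignore list is empty by default.
GROUP_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")
IGNORE_SUFFIXES = ()


def classify_url(url, group=GROUP_SUFFIXES, ignore=IGNORE_SUFFIXES):
    """Return the summary label for url, or None if the request is ignored."""
    # Assumption: only the path is inspected, case-insensitively.
    path = urlsplit(url).path.lower()
    for suffix in ignore:
        if path.endswith(suffix):
            return None
    for suffix in group:
        if path.endswith(suffix):
            return "all " + suffix
    return url


def count_urls(urls, group=GROUP_SUFFIXES, ignore=IGNORE_SUFFIXES):
    """Count requests per label, skipping ignored requests."""
    counts = Counter()
    for url in urls:
        label = classify_url(url, group, ignore)
        if label is not None:
            counts[label] += 1
    return counts
```

With the default lists, two requests for different .gif files end up in one "all .gif" entry, while /index.html keeps its own entry.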
-c configfile |
Read configuration options from configfile. |
-g list |
Group requests having a URL with a suffix in list together into an "all suffix" entry. The suffixes should be separated by spaces. |
-i list |
Ignore requests having a URL with a suffix in list. The suffixes should be separated by spaces. |
-m user |
Mail the results to user. |
-n |
Convert the requester's IP address into a hostname, if possible. |
-s subject |
Use subject as the subject line when mailing results. |
-x list |
Ignore requests from systems having a domain or network in list. The entries in list should be separated by spaces.
-html |
Format output in HTML. |
-bytes |
Print a total count of bytes sent. |
-count |
Print a total count of requests received. |
-agent |
Produce a summary partitioned by the requesting agent (Mozilla, Googlebot, etc.) |
-host |
Produce a summary partitioned by host name (or IP number). |
-hostuser |
Produce a summary partitioned by host name, user and ident name. |
-hour |
Produce a summary partitioned by the hour of the request. |
-protocol |
Produce a summary partitioned by the request protocol. |
-referrer |
Produce a summary partitioned by referring URL.
-request |
Produce a summary partitioned by request type. |
-status |
Produce a summary partitioned by return status. |
-url |
Produce a summary partitioned by the requested URL. |
httpdstats can read configuration options from a configuration file. This file can either be supplied as a command line argument or set by the HTTPDSTATSCONF environment variable (see below). Options read from a configuration file override the defaults and are, in turn, overridden by command line options. |
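The precedence order described above (defaults, then configuration file, then command line) can be sketched in Python with a ChainMap. This is an illustration only: the function name is hypothetical, the default values are the ones quoted in the option descriptions on this page, and the mapping of -s onto mail-subject is an assumption.

```python
from collections import ChainMap

# Defaults quoted from the option descriptions on this page.
DEFAULTS = {
    "mail-subject": "httpd log statistics",
    "mail-to": "",
    "name-lookup": "no",
    "use-html": "no",
    "host-cutoff": "0",
}


def effective_options(config_file_options, command_line_options):
    """Layer the option sources; maps listed first take precedence."""
    return ChainMap(command_line_options, config_file_options, DEFAULTS)


options = effective_options(
    # Hypothetical values read from a configuration file.
    {"mail-subject": "Weekly report", "mail-to": "webmaster@example.org"},
    # Hypothetical command-line setting, as if -s "Ad hoc run" were given.
    {"mail-subject": "Ad hoc run"},
)
```

Here options["mail-subject"] comes from the command line, options["mail-to"] from the configuration file, and options["name-lookup"] from the defaults.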
The configuration file consists of lines of the form option = value. Empty lines and lines beginning with a # character are treated as comments.
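As a rough illustration of this file format, the following Python sketch parses such lines. It is not the parser httpdstats uses: the function name and the sample file are hypothetical, and stripping surrounding whitespace, collecting repeated options (such as the report option described below) into a list, and rejecting lines without an = sign are all assumptions.

```python
def parse_config(text):
    """Parse 'option = value' lines, skipping blank lines and # comments."""
    options = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"line {number}: expected 'option = value'")
        # Assumption: an option may be repeated, so values are kept in a list.
        options.setdefault(name.strip(), []).append(value.strip())
    return options


# Hypothetical example file.
SAMPLE = """\
# statistics for the administrator
report = host:user
report = url:status

mail-to = webmaster@example.org
ignore-domains = .charvolant.org 10.32.22.
"""

config = parse_config(SAMPLE)
```

After parsing, config["report"] holds both report entries in file order, and every other option holds a one-element list.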
The main configuration option that can be placed in a configuration file is the report list. An entry in the report list is of the form report = key1:key2:... where each keyn represents a key that can be used when partitioning the statistics. The following keys can be used: |
agent |
The requesting agent (Mozilla, Googlebot, etc.)
date |
The date that the request was received on. |
host |
The host name or IP address of the requesting system. Host names are sorted in domain order, so all the .com domains are grouped together, and so on. IP addresses are sorted in address order. |
hour |
The hour (and date) that the request was received on. |
ident |
The identity of the requester. The identity is supplied by the requesting host through the auth protocol, if enabled. |
minute |
The minute (and hour and date) that the request was received on. |
protocol |
The HTTP (or other) protocol that was used to make the request. |
referrer |
The referring URL; the URL that links to the requested URL. |
request |
The HTTP request type (GET, POST, etc.) |
status |
The HTTP return status of the request. |
url |
The requested URL or, more precisely, the path on the server.
user |
The user name supplied to the server as part of its authentication, if enabled. |
These keys can be combined to form more detailed reports. For example, host:user gives the requesting host and requesting user. As another example, url:status gives the requested URL and the returned status. |
Two special report keys are bytes and count. These keys give the total number of bytes sent and the total number of requests received, respectively. These keys cannot be combined with other keys. |
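To make the idea of combined keys concrete, here is a minimal Python sketch of partitioning by a report entry such as url:status. It is an assumption-laden illustration, not the program's own code: the function name and the record layout (dictionaries with one field per key plus a bytes field) are hypothetical, the sample data is made up, and the special bytes and count keys are not handled.

```python
from collections import defaultdict


def summarize(records, report):
    """Tally requests and bytes for each combination of the keys in report."""
    keys = report.split(":")
    totals = defaultdict(lambda: [0, 0])  # key tuple -> [requests, bytes]
    for record in records:
        bucket = totals[tuple(record[key] for key in keys)]
        bucket[0] += 1
        bucket[1] += record["bytes"]
    return dict(totals)


# Hypothetical, already-parsed log entries.
RECORDS = [
    {"url": "/index.html", "status": "200",
     "host": "itchy.charvolant.org", "bytes": 5120},
    {"url": "/index.html", "status": "200",
     "host": "scratchy.charvolant.org", "bytes": 5120},
    {"url": "/missing", "status": "404",
     "host": "itchy.charvolant.org", "bytes": 312},
]

by_url_status = summarize(RECORDS, "url:status")
```

Each distinct combination of key values becomes one line of the summary, carrying both a request count and a byte total.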
Other options that can be set in the configuration file are: |
agent-style |
How to report the user agent. This option can be set to either full (the default) for all agent information or name for just the initial agent name and version that is usually supplied. |
group-suffixes |
This option contains a list of suffixes where the requested URLs should be collapsed into a single aggregate entry. This option is intended to stop graphical buttons and icons from cluttering up the statistics. The entries are listed as "all suffix". The default entry is .gif .jpg .jpeg .png |
host-cutoff |
If this option is greater than 0, then the lowest part of the fully qualified host name is cut off to that level. For example, with host-cutoff set to 1, itchy.charvolant.org and scratchy.charvolant.org will be collapsed into charvolant.org. With host-cutoff set to 2, the same two host names will be collapsed into a simple .org. This option is intended to allow statistics to be gathered by network; it is not foolproof. See the name-lookup option. The default value is 0 (no cutoff).
ignore-suffixes |
Like group-suffixes, but the requests are ignored altogether. By default, this option is empty.
ignore-domains |
This option contains a list of domains or IP networks, separated by spaces. Requests coming from these domains (or any sub-domain) are ignored. This option can be set to the local domain to avoid including local requests in the statistics. As an example, an entry of .charvolant.org 10.32.22. will ignore requests from itchy.charvolant.org and 10.32.22.45. See the name-lookup option.
mail-subject |
The Subject: line to use for mail. By default, this is set to httpd log statistics |
mail-to |
The mail address to send the summary to. If this option is empty (the default) then the summary will be printed to standard output. |
mail-transport-agent |
The program used to send mail. This defaults to /usr/bin/sendmail -bm -t
name-lookup |
Set to either yes or no (the default). If set to yes, httpdstats attempts to convert IP addresses into hostnames, using reverse DNS lookup.
referrer-style |
How to report the referring URL. The referrer style can be full (the default) for all information, path to give just the path without any CGI arguments, or domain to give just the domain name of the referrer.
referrer-max-width |
The maximum width of a displayed referring URL. If set to 0, any width is acceptable. URLs longer than the specified width will be reduced to fit, with three centred dots (···) indicating the removed piece. |
url-style |
How to report the requested URL. The URL style can be either full (the default) for all information or path to give just the path, without any CGI arguments. The path style is useful if you don't want huge amounts of argument-specific information cluttering up things, as well as the inevitable script-kiddie attacks on default.ida.
url-max-width |
The maximum width of a displayed URL. If set to 0, any width is acceptable. URLs longer than the specified width will be reduced to fit, with three centred dots (···) indicating the removed piece. |
use-html |
Set to either yes or no (the default). If set to yes, the generated output is an HTML page, using tables to present the summaries. If set to no, then plain text is output. |
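Two of the options above, ignore-domains and the max-width pair, lend themselves to a short Python sketch. Both functions are hypothetical illustrations rather than the program's own logic. The first assumes that entries starting with a dot are domain suffixes and entries ending with a dot are network prefixes, which matches the .charvolant.org 10.32.22. example. The second assumes that the removed piece is taken from the middle of the URL, which this page does not state.

```python
ELLIPSIS = "\u00b7\u00b7\u00b7"  # three centred dots


def is_ignored(host, ignore_domains):
    """Return True if host matches an entry of an ignore-domains value."""
    for entry in ignore_domains.split():
        # Assumption: a leading dot marks a domain suffix.
        if entry.startswith(".") and host.endswith(entry):
            return True
        # Assumption: a trailing dot marks an IP network prefix.
        if entry.endswith(".") and host.startswith(entry):
            return True
    return False


def shorten(url, max_width):
    """Reduce url to max_width characters; 0 means any width is acceptable."""
    if max_width <= 0 or len(url) <= max_width:
        return url
    keep = max_width - len(ELLIPSIS)
    if keep <= 0:
        return ELLIPSIS[:max_width]
    head = (keep + 1) // 2  # characters kept from the start
    tail = keep - head      # characters kept from the end
    return url[:head] + ELLIPSIS + (url[-tail:] if tail else "")
```

Under these assumptions, is_ignored("10.32.22.45", ".charvolant.org 10.32.22.") is true while a host on 10.32.220. is not matched, and shorten never returns more than max_width characters when max_width is positive.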
If the environment variable HTTPDSTATSCONF is set, then httpdstats will use it as a file name to read configuration options from. |
/etc/cron.daily/httpstats.cron |
Daily cron invocation to report statistics via mail to the administrator. |
/etc/httpstats.conf |
The configuration file used by the cron invocation. |
Doug Palmer - doug@charvolant.org |
httpd(8), sendmail(8) |