httpdstats

NAME
SYNOPSIS
DESCRIPTION
OPTIONS
CONFIGURATION
ENVIRONMENT
FILES
AUTHOR
SEE ALSO

NAME

httpdstats - generate a statistical summary of Apache log files

SYNOPSIS

httpdstats [-c configfile] [-g grouplist] [-i ignorelist] [-m address] [-n] [-s subject] [-x ignorelist] [-bytes] [-count] [-html] [-agent] [-host] [-hostuser] [-hour] [-protocol] [-referrer] [-request] [-status] [-url] file ...

DESCRIPTION

httpdstats generates a statistical summary of the contents of an Apache access log and either prints the summary or mails it to a given address. The contents of the access log are summarized by various criteria, in terms of both the number of requests and the number of bytes transferred.
At least one log file must be specified. If several files are given, their contents are aggregated.
To avoid clutter, URLs with certain suffixes, such as .gif or .png, can be grouped together into a single aggregate entry, or ignored altogether.
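The grouping and ignoring of suffixes can be pictured with a minimal Python sketch. It is an illustration of the behaviour described above, not the code httpdstats uses: the function names are hypothetical, the "all .gif" label is one reading of the "all suffix" entry, and case-insensitive matching is an assumption.

```python
from collections import Counter

# Default group list, as given for group-suffixes in the CONFIGURATION section.
GROUP_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")
IGNORE_SUFFIXES = ()  # empty by default


def classify(path, group=GROUP_SUFFIXES, ignore=IGNORE_SUFFIXES):
    """Return the report label for a path, or None if the request is ignored."""
    lowered = path.lower()  # assumption: suffixes match case-insensitively
    for suffix in ignore:
        if lowered.endswith(suffix):
            return None
    for suffix in group:
        if lowered.endswith(suffix):
            return "all " + suffix  # one aggregate entry per suffix
    return path


def count_requests(paths, group=GROUP_SUFFIXES, ignore=IGNORE_SUFFIXES):
    """Count requests per label, skipping ignored paths."""
    counts = Counter()
    for path in paths:
        label = classify(path, group, ignore)
        if label is not None:
            counts[label] += 1
    return counts
```

With the default lists, the paths /index.html, /a.gif, /b.gif and /c.png produce three entries: /index.html once, "all .gif" twice and "all .png" once.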

OPTIONS

-c configfile
Read configuration options from configfile.
-g list
Group requests whose URL ends with a suffix in list into a single "all suffix" entry. The suffixes should be separated by spaces.
-i list
Ignore requests having a URL with a suffix in list. The suffixes should be separated by spaces.
-m user
Mail the results to user.
-n
Convert the requester's IP address into a hostname, if possible.
-s subject
Use subject as the subject line when mailing results.
-x list
Ignore requests from systems having a domain or network in list. The entries in the list should be separated by spaces.
-html
Format output in HTML.
-bytes
Print a total count of bytes sent.
-count
Print a total count of requests received.
-agent
Produce a summary partitioned by the requesting agent (Mozilla, Googlebot, etc.)
-host
Produce a summary partitioned by host name (or IP number).
-hostuser
Produce a summary partitioned by host name, user and ident name.
-hour
Produce a summary partitioned by the hour of the request.
-protocol
Produce a summary partitioned by the request protocol.
-referrer
Produce a summary by referring URL.
-request
Produce a summary partitioned by request type.
-status
Produce a summary partitioned by return status.
-url
Produce a summary partitioned by the requested URL.

CONFIGURATION

httpdstats can read configuration options from a configuration file. This file can either be supplied with the -c command line option or set by the HTTPDSTATSCONF environment variable (see below). Options read from a configuration file override the defaults and are, in turn, overridden by command line options.
The configuration file consists of lines of the form option = value. Empty lines and lines beginning with a # character are ignored.
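The file format and the order of precedence can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the example file are hypothetical, leading whitespace is stripped before the # test, malformed lines are skipped, and a repeated option keeps only its last value (so the sketch is not suitable for a list of report entries).

```python
def parse_config(text):
    """Parse 'option = value' lines, skipping blank lines and # comments."""
    options = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        name, sep, value = stripped.partition("=")
        if not sep:
            continue  # assumption: lines without '=' are silently skipped
        options[name.strip()] = value.strip()
    return options


# A few of the defaults documented in this page.
DEFAULTS = {
    "mail-subject": "httpd log statistics",
    "name-lookup": "no",
    "use-html": "no",
}


def effective_options(config_text, command_line):
    """Layer the settings: defaults, then the file, then the command line."""
    merged = dict(DEFAULTS)
    merged.update(parse_config(config_text))
    merged.update(command_line)
    return merged


# Hypothetical configuration file contents.
EXAMPLE = """\
# mail an HTML report to the webmaster
mail-to = webmaster@example.org
use-html = yes

name-lookup = yes
"""
```

Calling effective_options(EXAMPLE, {"use-html": "no"}) keeps mail-to and name-lookup from the file, takes mail-subject from the defaults, and lets the command line value of use-html win.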
The main configuration option that can be placed in a configuration file is the report list. An entry in the report list is of the form report = key1:key2:... where each keyn represents a key that can be used when partitioning the statistics. The following keys can be used:
agent
The requesting agent (Mozilla, Googlebot, etc.)
date
The date that the request was received on.
host
The host name or IP address of the requesting system. Host names are sorted in domain order, so all the .com domains are grouped together, and so on. IP addresses are sorted in address order.
hour
The hour (and date) that the request was received on.
ident
The identity of the requester. The identity is supplied by the requesting host through the ident (auth) protocol, if enabled.
minute
The minute (and hour and date) that the request was received on.
protocol
The HTTP (or other) protocol that was used to make the request.
referrer
The referring URL; the URL that links to the requested URL.
request
The HTTP request type (GET, POST, etc.)
status
The HTTP return status of the request.
url
The requested URL or, more precisely, the path requested on the server.
user
The user name supplied to the server as part of its authentication, if enabled.
These keys can be combined to form more detailed reports. For example, host:user gives the requesting host and requesting user. As another example, url:status gives the requested URL and the returned status.
Two special report keys are bytes and count. These keys give the total number of bytes sent and the total number of requests received, respectively. These keys cannot be combined with other keys.
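How a report entry such as host:user or url:status partitions the log can be sketched in Python. The record layout, the sample data and the function names below are hypothetical; the sketch only shows the idea of grouping by a combination of keys and totalling requests and bytes for each group.

```python
from collections import defaultdict


def summarize(records, report):
    """Group records by the keys in 'key1:key2:...'.

    Returns a dict mapping each key combination to (requests, bytes).
    """
    keys = report.split(":")
    totals = defaultdict(lambda: [0, 0])
    for record in records:
        group = tuple(record[key] for key in keys)
        totals[group][0] += 1                # number of requests
        totals[group][1] += record["bytes"]  # bytes transferred
    return {group: tuple(pair) for group, pair in totals.items()}


def grand_totals(records):
    """The special count and bytes reports: totals over all requests."""
    return len(records), sum(record["bytes"] for record in records)


# Hypothetical, already-parsed log records.
RECORDS = [
    {"host": "itchy.charvolant.org", "user": "alice",
     "url": "/index.html", "status": "200", "bytes": 1200},
    {"host": "scratchy.charvolant.org", "user": "-",
     "url": "/index.html", "status": "200", "bytes": 1200},
    {"host": "itchy.charvolant.org", "user": "alice",
     "url": "/missing.html", "status": "404", "bytes": 300},
]
```

Here summarize(RECORDS, "url:status") yields one row for /index.html with status 200 (2 requests, 2400 bytes) and one for /missing.html with status 404 (1 request, 300 bytes).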
Other options that can be set in the configuration file are:
agent-style
How to report the user agent. This option can be set to either full (the default) for all agent information or name for just the initial agent name and version that is usually supplied.
group-suffixes
This option contains a list of suffixes. Requested URLs ending in one of these suffixes are collapsed into a single aggregate entry. This option is intended to stop graphical buttons and icons from cluttering up the statistics. The entries are listed as "all suffix". The default entry is .gif .jpg .jpeg .png
host-cutoff
If this option is greater than 0, then the lowest part of the fully qualified host name is cut off to that level. For example, with host-cutoff set to 1, itchy.charvolant.org and scratchy.charvolant.org will be collapsed into charvolant.org. With host-cutoff set to 2, the above two hostnames will be collapsed into a simple .org. This option is intended to allow statistics to be gathered by network; it is far from foolproof. See the name-lookup option. The default value is 0 (no cutoff).
ignore-suffixes
Like group-suffixes, but the requests are ignored altogether. By default, this option is empty.
ignore-domains
This option contains a list of domains or IP networks, separated by spaces. Requests coming from these domains (or any sub-domain) are ignored. This option can be set to the local domain to avoid including local requests in the statistics. As an example, an entry of .charvolant.org 10.32.22. will ignore itchy.charvolant.org and 10.32.22.45. See the name-lookup option.
mail-subject
The Subject: line to use for mail. By default, this is set to httpd log statistics
mail-to
The mail address to send the summary to. If this option is empty (the default) then the summary will be printed to standard output.
mail-transport-agent
The program that the mail message is passed to for delivery. This defaults to /usr/bin/sendmail -bm -t
name-lookup
Whether to attempt to convert IP addresses into hostnames, using reverse DNS lookup. Set to either yes or no (the default).
referrer-style
How to report the referring URL. The referrer style can be either full (the default) for all information, path to give just the path, without any CGI arguments, or domain to give just the domain name of the referrer.
referrer-max-width
The maximum width of a displayed referring URL. If set to 0, any width is acceptable. URLs longer than the specified width will be reduced to fit, with three centred dots (···) indicating the removed piece.
url-style
How to report the requested URL. The URL style can be either full (the default) for all information or path to give just the path, without any CGI arguments. The path option is useful for keeping large amounts of argument-specific information, and the inevitable script-kiddie attacks on default.ida, from cluttering up the report.
url-max-width
The maximum width of a displayed URL. If set to 0, any width is acceptable. URLs longer than the specified width will be reduced to fit, with three centred dots (···) indicating the removed piece.
use-html
Set to either yes or no (the default). If set to yes, the generated output is an HTML page, using tables to present the summaries. If set to no, then plain text is output.
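Several of the options above describe small text transformations. The following Python sketch illustrates host-cutoff, ignore-domains, the path style and the maximum-width options as they are described here; it is not taken from httpdstats itself. All function names are hypothetical. The sketch assumes that IP addresses are left untouched by the cutoff, that a cut-off name is returned without a leading dot (so a cutoff of 2 gives org rather than .org), that a domain entry also matches the bare domain, and that the removed piece of an over-long URL is taken from the middle.

```python
def cut_host(host, cutoff):
    """host-cutoff: drop the 'cutoff' leftmost labels of a host name."""
    if cutoff <= 0:
        return host
    labels = host.split(".")
    if all(label.isdigit() for label in labels):
        return host  # assumption: IP addresses are not cut
    return ".".join(labels[cutoff:] or labels[-1:])


def is_ignored(host, ignore_domains):
    """ignore-domains: match domain suffixes (.example.org) or
    network prefixes (10.32.22.) from a space-separated list."""
    for entry in ignore_domains.split():
        if entry.startswith("."):
            # assumption: the bare domain itself is ignored as well
            if host.endswith(entry) or host == entry[1:]:
                return True
        elif entry.endswith("."):
            if host.startswith(entry):
                return True
        elif host == entry:
            return True
    return False


def path_only(url):
    """The 'path' style: drop any CGI arguments after the '?'."""
    return url.split("?", 1)[0]


def shorten(url, max_width):
    """The max-width options: reduce a URL to fit, marking the gap."""
    marker = "\u00b7\u00b7\u00b7"  # three centred dots
    if max_width <= 0 or len(url) <= max_width:
        return url
    keep = max_width - len(marker)
    if keep <= 0:
        return marker[:max_width]
    head = (keep + 1) // 2
    tail = keep - head
    # assumption: the removed piece comes from the middle of the URL
    return url[:head] + marker + (url[len(url) - tail:] if tail else "")
```

For example, cut_host("itchy.charvolant.org", 1) gives charvolant.org, is_ignored("10.32.22.45", ".charvolant.org 10.32.22.") is true, and shorten("/cgi-bin/search/results.html", 11) gives /cgi···html.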

ENVIRONMENT

If the environment variable HTTPDSTATSCONF is set, then httpdstats will use it as a file name to read configuration options from.

FILES

/etc/cron.daily/httpstats.cron
Daily cron invocation to report statistics via mail to the administrator.
/etc/httpstats.conf
The configuration file used by the cron invocation.

AUTHOR

Doug Palmer - doug@charvolant.org

SEE ALSO

httpd(8), sendmail(8)