suPerlative: Log File Analysers
The ability to extract statistical information from the server's log
files is crucial to shaping your site to better serve users' needs.
Our best example of that is the
Top 100 page, which shows both us
and our visitors which pages are the most popular. This enables us to
focus attention on those pages, for updates and improvements, and
indicates where we might need to add more content.
Seeing which pages are the most popular, we can ensure that the
navigation menu items are chosen to allow rapid access to those
pages.
t100.sh creates the top 100 report from the previous day's access log.
It performs file administration and calls log.pl and top.pl.
log.pl reads the server's access log
and extracts page URLs, counting the accesses to each one.
It outputs a file of records, each giving a URL and
the number of successful page views for that URL.
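
The counting step that log.pl performs can be sketched as follows. This is a minimal illustration in Python, not the log.pl source: it assumes the access log is in Common Log Format, treats only GET requests with status 200 as successful page views, and writes a hypothetical intermediate format of one "count, tab, URL" record per line. The names count_page_views and format_records are invented for this example.

```python
import re
from collections import Counter

# Assumed Common Log Format:
#   host ident user [time] "GET /url PROTOCOL" status bytes
LINE_RE = re.compile(r'^\S+ \S+ \S+ \[[^\]]*\] "GET (\S+)[^"]*" (\d{3})')


def count_page_views(lines):
    """Count successful (status 200) GET requests per URL."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group(2) == "200":
            counts[match.group(1)] += 1
    return counts


def format_records(counts):
    """Return one 'count<TAB>url' record per URL, sorted by URL.

    The record layout is an assumption made for this sketch.
    """
    return [f"{n}\t{url}" for url, n in sorted(counts.items())]
```

In practice the lines would come from the open log file, and the records would be written to the intermediate file for a later ranking step to read.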
top.pl reads this file and creates the
Top 100 page. Sometimes we modify this program to output a larger number
of the most popular files, e.g. the top 200, for private use only.
This could be done with one program and no intermediate file,
but splitting into two improves flexibility, e.g. if there is some fault
in the generated HTML file, we don't have to read the server log again
(a slow process).
top.pl uses ht_subs.pl.
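
The ranking step can be sketched in the same spirit. Again this is an illustrative Python stand-in rather than top.pl itself: it assumes a hypothetical intermediate file of "count, tab, URL" records, and the function names and the HTML layout are invented here. The limit parameter shows how the same program could produce a top 200 as easily as a top 100.

```python
from html import escape


def parse_records(lines):
    """Parse 'count<TAB>url' lines into (url, count) pairs."""
    pairs = []
    for line in lines:
        count, url = line.rstrip("\n").split("\t", 1)
        pairs.append((url, int(count)))
    return pairs


def top_pages(pairs, limit=100):
    """Most-viewed pages first; ties are broken by URL for stable output."""
    return sorted(pairs, key=lambda p: (-p[1], p[0]))[:limit]


def render_html(ranked):
    """Render the ranking as a plain ordered list (layout is an assumption)."""
    items = [
        f'<li><a href="{escape(url, quote=True)}">{escape(url)}</a> ({count})</li>'
        for url, count in ranked
    ]
    return "<ol>\n" + "\n".join(items) + "\n</ol>"
```

Because this step reads only the small intermediate file, the HTML can be regenerated after a layout fix without scanning the server log again.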
Another very useful thing to know about is which pages sent the
most visitors to us. If they are relevant to web development, and
are of clear merit, we might want to link to them. In some cases
backlinks indicate someone linking directly to an image on our server,
which is not a good idea.
ref_cnt.pl and rtop.pl
operate similarly to log.pl and top.pl, scanning the access log to
create an intermediate file of referring pages and printing an HTML
file of the top 100.
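
Counting referring pages might look like the sketch below. It is a Python illustration, not the ref_cnt.pl source, and it assumes the server writes the Combined Log Format, in which the referrer appears in quotes after the byte count. The function name count_referrers and the own_host parameter are hypothetical; referrals from our own host and requests with no referrer are skipped.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed Combined Log Format tail:
#   "METHOD /url PROTOCOL" status bytes "referrer" "user-agent"
LINE_RE = re.compile(r'"[A-Z]+ \S+[^"]*" \d{3} \S+ "([^"]*)"')


def count_referrers(lines, own_host):
    """Count external referring pages, ignoring '-' and our own host."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        referrer = match.group(1)
        host = urlsplit(referrer).hostname  # None for "-" or an empty field
        if host is None or host == own_host:
            continue
        counts[referrer] += 1
    return counts
```

Calling most_common(100) on the returned Counter gives the referring pages in ranked order.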
find_error.pl
reads the access log and writes a file listing the broken links within the
WDVL site, i.e. links from our own pages to WDVL pages that cannot be found.
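
One plausible way to find such links from the log alone is sketched below in Python. This is an assumption about the approach, not the find_error.pl source: it presumes the Combined Log Format and treats a request as an internal broken link when its status is 404 and its referrer is on our own host. The function name and the own_host parameter are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Assumed Combined Log Format tail:
#   "METHOD /url PROTOCOL" status bytes "referrer" "user-agent"
LINE_RE = re.compile(r'"[A-Z]+ (\S+)[^"]*" (\d{3}) \S+ "([^"]*)"')


def find_broken_internal_links(lines, own_host):
    """Return sorted (referring page, missing URL) pairs for 404s
    whose referrer is a page on own_host."""
    broken = set()
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        url, status, referrer = match.groups()
        if status == "404" and urlsplit(referrer).hostname == own_host:
            broken.add((referrer, url))
    return sorted(broken)
```

Each pair names the page that needs fixing and the URL it points to, so repeated hits on the same broken link appear only once.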