A typical line in an access_log looks like the following:
63.203.109.38 - - [02/Sep/2003:09:51:09 -0700] "GET /index.php HTTP/1.1" 301 248 "http://test.com/xxx.php" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"

Table 1 shows the value, by column, for the common log format.
Table 1: Common Log File Layout

Column | Value |
---|---|
1 | IP of host accessing the server |
2-3 | Remote logname (identd) and authenticated user name; each is "-" when unavailable |
4 | Date and time zone offset of the specific request |
5 | Method invoked |
6 | URL requested |
7 | Protocol used |
8 | Result code |
9 | Number of bytes transferred |
10 | Referrer |
11 | Browser identification string |
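
The column numbers in Table 1 are logical columns, not awk field numbers. Because awk splits on whitespace, the bracketed timestamp and the quoted request each span several fields, which is presumably why the script below reads the URL from `$7`, the byte count from `$10`, and the referrer from `$11`. A minimal sketch of that mapping, using the sample line above (the variable names `line` and `fields` are illustrative only):

```shell
#!/bin/sh
# Sample line copied from the text above.
line='63.203.109.38 - - [02/Sep/2003:09:51:09 -0700] "GET /index.php HTTP/1.1" 301 248 "http://test.com/xxx.php" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"'

# awk splits on whitespace: the timestamp occupies $4 and $5,
# and the quoted request occupies $6 (method), $7 (URL), $8 (protocol).
fields="$(printf '%s\n' "$line" | awk '{
    printf "host=%s date=%s url=%s status=%s bytes=%s referrer=%s", $1, $4, $7, $9, $10, $11
}')"
echo "$fields"
```

Note that the date keeps its leading `[` and the referrer keeps its surrounding double quotes, so any later filtering has to account for both.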
Save the following as calculate_log.sh:
#!/bin/sh
# webaccess - analyze an Apache-format access_log file, extracting
# useful and interesting statistics
bytes_in_mb=1048576   # 1024 * 1024 bytes per megabyte; defined but not used below
host="self.com"
if [ $# -eq 0 -o ! -f "$1" ] ; then
echo "Usage: $(basename $0) logfile" >&2
exit 1
fi
firstdate="$(head -1 "$1" | awk '{print $4}' | sed 's/\[//')"
lastdate="$(tail -1 "$1" | awk '{print $4}' | sed 's/\[//')"
echo "Results of analyzing log file $1"
echo ""
echo " Start date: $(echo $firstdate|sed 's/:/ at /')"
echo " End date: $(echo $lastdate|sed 's/:/ at /')"
hits="$(wc -l < "$1" | sed 's/[^[:digit:]]//g')"
echo " Hits: $hits (total accesses)"
# escape the dot so it matches a literal "." before the file extension
pages="$(grep -ivE '\.(txt|gif|jpg|png)' "$1" | wc -l | sed 's/[^[:digit:]]//g')"
echo " Pageviews: $pages (hits minus graphics)"
# sum+0 prints 0 rather than an empty string when the log has no lines
totalbytes="$(awk '{sum+=$10} END {print sum+0}' "$1")"
echo " Transferred: $totalbytes bytes"
# now let's scrape the log file for some useful data:
echo ""
echo "The ten most popular pages were:"
awk '{print $7}' "$1" | grep -ivE '\.(gif|jpg|png)' | \
sed 's/\/$//g' | sort | \
uniq -c | sort -rn | head -10
# Changing the pipeline to the following ranks the most frequent client IPs instead.
# awk '{print $1}' "$1" | \
# sed 's/\/$//g' | sort | \
# uniq -c | sort -rn | head -10
echo ""
echo "The ten most common referrer URLs were:"
awk '{print $11}' "$1" | \
grep -vE "(^\"-\"\$|/www\.$host|/$host)" | \
sort | uniq -c | sort -rn | head -10
echo ""
exit 0
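
The script above reports the transfer total only in raw bytes. As a hedged sketch, a small helper could convert that figure into a more readable unit. The function name `human_size` and the binary thresholds (1024-based KB, MB, GB) are assumptions made for illustration and are not part of the original script:

```shell
#!/bin/sh
# human_size BYTES: print a byte count in a readable unit.
# Hypothetical helper; uses awk for the floating-point division.
human_size() {
    awk -v b="$1" 'BEGIN {
        if (b >= 1073741824)   printf "%.2f GB\n", b / 1073741824
        else if (b >= 1048576) printf "%.2f MB\n", b / 1048576
        else if (b >= 1024)    printf "%.2f KB\n", b / 1024
        else                   printf "%d bytes\n", b
    }'
}

human_size 248
human_size 1610612736
```

Inside the script, something like `echo " Transferred: $(human_size "$totalbytes")"` could then replace the raw byte output.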
Run it:
# ./calculate_log.sh /var/log/apache2/access.log
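
To try the referrer filter without a real server log, a tiny made-up log is enough. The sketch below assumes `mktemp` is available and uses invented sample entries; only the `host` value and the pipeline shape come from the script. It keeps referrers that are neither the empty `"-"` placeholder nor links from the site's own host:

```shell
#!/bin/sh
# Build a three-line sample log (hypothetical data) in a temporary file.
log="$(mktemp)"
cat > "$log" <<'EOF'
10.0.0.1 - - [02/Sep/2003:09:51:09 -0700] "GET /index.php HTTP/1.1" 200 248 "-" "curl"
10.0.0.2 - - [02/Sep/2003:09:52:10 -0700] "GET /index.php HTTP/1.1" 200 248 "http://test.com/xxx.php" "curl"
10.0.0.1 - - [02/Sep/2003:09:53:11 -0700] "GET /logo.png HTTP/1.1" 200 1024 "http://self.com/index.php" "curl"
EOF

host="self.com"
# Field 11 still carries its double quotes, so the empty referrer is "-"
# with the quotes included; the pattern must match them literally.
referrers="$(awk '{print $11}' "$log" | \
    grep -vE "(^\"-\"\$|/www\.$host|/$host)" | \
    sort | uniq -c | sort -rn)"
echo "$referrers"
rm -f "$log"
```

With this sample data only the test.com referrer should survive the filter, since the other two entries are an empty referrer and a self-referral.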