When analyzing data sets and performing repetitive (and tedious) tasks, piping streams together in the terminal can save a ridiculous amount of manual labor and time. The following are a few examples I use when diagnosing and resolving various types of problems with web applications.
Assuming log entries in the following format:
1.1.1.1 - - [24/Aug/2013:12:55:30 +0000] "GET / HTTP/1.0" 200 2227 "-" "USER AGENT STRING" vhost=vhost.example.com host=example.com hosting_site=docroot request_time=5640229
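Several of the pipelines below split each line on the double quote with cut -d\", so it helps to know which field lands where: field 2 is the request line, and field 3 holds the status code and response size. A quick sketch, assuming access.log contains lines in the format above:

$ head -n 1 access.log | cut -d\" -f2   # GET / HTTP/1.0
$ head -n 1 access.log | cut -d\" -f3   # 200 2227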
Number of Requests
$ for i in {0..2} ; do echo "-- 10:${i}x UTC --" ; grep -c "2013:10:$i" access.log ; echo -e "\n" ; done
Results:
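If you would rather not re-scan the log once per bucket, awk can produce the same ten-minute counts in a single pass. This is just a sketch, assuming the bracketed timestamp is the fourth whitespace-separated field, as in the sample line above:

$ awk '{ print substr($4, 14, 4) "x" }' access.log | sort | uniq -c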
HTTP Response Codes
$ for i in {0..2} ; do echo "-- 10:${i}x UTC --" ; grep "2013:10:$i" access.log | cut -d\" -f3 | awk '{print $1}' | sort | uniq -c | sort -nr ; echo -e "\n" ; done
Results:
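To zero in on server errors only, the same quote-split can feed a numeric filter in awk. A sketch, again assuming the status code is the first field of the third quote-delimited section:

$ grep "2013:10:0" access.log | cut -d\" -f3 | awk '$1 >= 500 && $1 < 600 { print $1 }' | sort | uniq -c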
Most Requested URIs
$ for i in {0..2} ; do echo "-- 10:${i}x UTC --" ; grep "2013:10:$i" access.log | cut -d\" -f2 | awk '{print $2}' | cut -d\? -f1 | sort | uniq -c | sort -nr | head -n 5 ; echo -e "\n" ; done
Results:
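This log format also records a request_time value at the end of each line, which makes it easy to hunt for slow endpoints rather than just popular ones. A sketch, assuming request_time is the last field and the request path is the seventh whitespace-separated field:

$ awk '{ t = $NF ; sub(/request_time=/, "", t) ; print t, $7 }' access.log | sort -nr | head -n 5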
Most Active IPs
$ for i in {0..2} ; do echo "-- 10:${i}x UTC --" ; grep "2013:10:$i" access.log | awk '{print $1}' | sort | uniq -c | sort -nr | head -n 5 ; echo -e "\n" ; done
Results:
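Once a suspicious address surfaces, the same tools can pivot to show what it was actually requesting. A sketch using the placeholder IP from the sample line above:

$ grep '^1\.1\.1\.1 ' access.log | cut -d\" -f2 | sort | uniq -c | sort -nr | head -n 5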