I just checked off another item on the TODO list - set up VPN. This was so simple and quick, I wish I’d done it earlier.

I basically just ran through the instructions at the following 3 links:

Mostly the same instructions apply for Ubuntu 12.10 and OpenVPN 1.8.5.

I made some good progress on the GoAccess plugin over the past few days. Many of the kinks have been ironed out and making access.log reports has never been so easy :)

It’s a pretty simple plugin, but it does the job well. My favorite parts of the script are a fun bit of regex that’s just aesthetically pleasing, the awk date filter, and the overall flow of execution.

``` bash Fun Regex
yes='^1$|^(y|Y?)$'
```
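
For context, a pattern like this would typically be tested against the reply to a yes/no prompt. The sketch below assumes that usage; the `is_yes` helper is a hypothetical name, not something from the plugin, and it uses `grep -E` so it runs in any POSIX shell (in bash, `[[ $reply =~ $yes ]]` would do the same job). Note that the `Y?` alternative also matches an empty reply, so just pressing Enter counts as yes.

```bash
yes='^1$|^(y|Y?)$'

# is_yes is a hypothetical helper: succeeds when the reply counts as "yes".
is_yes() {
    printf '%s\n' "$1" | grep -Eq "$yes"
}

# Try a handful of replies, including the empty one.
for reply in 1 y Y "" n yes; do
    if is_yes "$reply"; then
        echo "'$reply' -> yes"
    else
        echo "'$reply' -> no"
    fi
done
```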


``` bash awk Date Filter
cmDATE=`date -u -v-"$goINTVAL"H +\[%d/%b/%Y:%H:%M:%S` # OSX date command format (-v).
ssh -F $HOME/.ssh/config "$goSRV" \
    "sudo awk -v Date=$cmDATE '\$4 > Date { print \$0 }' /var/log/$tech/$file"
```

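The `-v` flag is specific to the BSD `date` that ships with OS X. On a Linux workstation with GNU coreutils, the same cutoff can be computed with `-d`. Here is a minimal sketch that reuses the `goINTVAL` name from the filter; the `cutoff_stamp` helper and its fallback order are assumptions, not part of the plugin. One caveat applies either way: awk compares `$4 > Date` as strings, so the filter is only reliable while the window stays within one calendar month.

```bash
goINTVAL=2   # look-back window in hours (example value)

# cutoff_stamp is a hypothetical helper: prints the UTC time N hours ago
# in the "[dd/Mon/yyyy:HH:MM:SS" form used by field 4 of the access log.
cutoff_stamp() {
    # GNU date understands relative times via -d.
    if date -u -d "$1 hours ago" '+[%d/%b/%Y:%H:%M:%S' 2>/dev/null; then
        return 0
    fi
    # Fall back to the BSD/OS X form used in the post.
    date -u -v-"$1"H '+[%d/%b/%Y:%H:%M:%S'
}

cmDATE=$(cutoff_stamp "$goINTVAL")
echo "$cmDATE"
```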
An excellent log analysis tool that I picked up recently from a blog post by my colleague, Amin Astaneh, is the GoAccess interactive web log analyzer. Out of the box, you can unleash GoAccess on raw or piped log data to reveal an array of interesting traffic patterns that might otherwise take some serious piping skills to crack - I covered some of these in a recent blog post.

I’ve dabbled with, attended presentations for, and read about configuration management systems for quite a while. For at least the past year, every time I start up a new project I can’t help but think about all of the benefits that would ensue from wrapping my work in a managed server configuration. I decided recently to bite the bullet and dive into Puppet headfirst.

In getting started, and to keep things simple, I decided to utilize a Vagrant box running 64-bit Ubuntu 12.04.3 LTS (Precise Pangolin). After navigating a couple of small road bumps, things are looking pretty good!

The first bump in the road was getting the following message when trying to run puppet despite setting the hostname from within the VM:

``` text Warning
Warning: Could not retrieve fact fqdn
Warning: Host is missing hostname and/or domain: bump
```


The easy way to solve this one is to simply declare the hostname in the Vagrantfile for the VM as follows:

``` ruby Vagrantfile
config.vm.hostname = "bump"
```

The next little hurdle was learning about modules and using “--modulepath”. In this case I set an include in the nodes.pp file to pull in a simple module, then ran the following command and got the message below:

``` bash Command
sudo puppet apply ./manifests/site.pp
```


``` text Error
Error: Could not find class hurdle for bump on node bump
```

The fix in this case, as implied by the lead-in, is to pass the path to the ‘modules’ directory in the ‘puppet apply’ command:

``` bash Command
sudo puppet apply ./manifests/site.pp --modulepath=/path/to/puppet/modules/
```
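
For reference, the module path is a directory holding one subdirectory per module, and Puppet looks for class `hurdle` in `hurdle/manifests/init.pp` beneath it. The sketch below builds that layout in a scratch directory so the shape is easy to see. The `puppet-demo` directory and the `notify` body are placeholders rather than the original project, and the node block sits directly in site.pp for brevity instead of in a separate nodes.pp.

```bash
# Build a throwaway Puppet tree; every path here is illustrative.
root=$(mktemp -d)/puppet-demo
mkdir -p "$root/manifests" "$root/modules/hurdle/manifests"

# Node definition that pulls in the module's class.
cat > "$root/manifests/site.pp" <<'EOF'
node 'bump' {
  include hurdle
}
EOF

# The class must live at <modulepath>/hurdle/manifests/init.pp.
cat > "$root/modules/hurdle/manifests/init.pp" <<'EOF'
class hurdle {
  notify { 'hurdle cleared': }
}
EOF

# Show the layout, then the command that would apply it.
find "$root" -type f | sort
echo "sudo puppet apply $root/manifests/site.pp --modulepath=$root/modules/"
```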

That’s all. Pretty pumped to find new and exciting hurdles with Puppet.

When analyzing data sets and performing repetitive (and tedious) tasks, piping streams in the terminal can save a ridiculous amount of manual labor and time. The following are a few examples that I use to diagnose and resolve various types of problems with web applications.

Assuming log entries in the following format:

1.1.1.1 - - [24/Aug/2013:12:55:30 +0000] "GET / HTTP/1.0" 200 2227 "-" "USER AGENT STRING" vhost=vhost.example.com host=example.com hosting_site=docroot request_time=5640229

### Number of Requests

``` bash Number of requests in a given half hour in 10-minute increments
$ for i in {0..2} ; do echo "-- 10:$i""x UTC --" ; grep -c "2013:10:$i" access.log ; echo -e "\n" ; done
```


__Results:__
<pre>
-- 10:0x UTC --
494

-- 10:1x UTC --
458

-- 10:2x UTC --
446
</pre>
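
The loop above rereads the log once per window. As an alternative sketch (not the approach used in this post), a single `awk` pass can bucket every request by its ten-minute window. It assumes the timestamp is field 4 in the format shown earlier. The `requests_per_window` name and the three sample entries are made up for the demo, for which it prints `10:0x 2` and `10:1x 1`.

```bash
# Hypothetical sample data in the log format described above.
log=$(mktemp)
cat > "$log" <<'EOF'
1.1.1.1 - - [24/Aug/2013:10:05:30 +0000] "GET / HTTP/1.0" 200 2227 "-" "UA"
2.2.2.2 - - [24/Aug/2013:10:07:02 +0000] "GET /a HTTP/1.0" 302 120 "-" "UA"
3.3.3.3 - - [24/Aug/2013:10:14:41 +0000] "GET /b HTTP/1.0" 200 512 "-" "UA"
EOF

# Split "[24/Aug/2013:10:05:30" on ":" so t[2] is the hour and t[3] the minute.
# Keeping only the first minute digit yields buckets such as "10:0x".
requests_per_window() {
    awk '{
        split($4, t, ":")
        bucket = t[2] ":" substr(t[3], 1, 1) "x"
        count[bucket]++
    }
    END { for (b in count) print b, count[b] }' "$1" | sort
}

requests_per_window "$log"
```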

### HTTP Response Codes

``` bash Sort HTTP response codes in a given half hour in 10-minute increments
$ for i in {0..2} ; do echo "-- 10:$i""x UTC --" ; grep "2013:10:$i" access.log | cut -d\" -f3 | awk '{print $1}' | sort | uniq -c | sort -nr ; echo -e "\n" ; done
```

__Results:__
<pre>
-- 10:0x UTC --
 337 200
 108 302
  31 304
  11 301
   7 404

-- 10:1x UTC --
 283 200
 110 302
  39 301
  21 304
   5 403

-- 10:2x UTC --
 280 200
 116 302
  46 301
   2 404
   2 403
</pre>
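
To see what each stage of the status-code pipeline contributes, it can help to run a single entry through it. The sketch below uses the sample log line from earlier, trimmed after the user agent; the variable names are only for illustration.

```bash
line='1.1.1.1 - - [24/Aug/2013:12:55:30 +0000] "GET / HTTP/1.0" 200 2227 "-" "USER AGENT STRING"'

# Splitting on double quotes: field 1 is everything before the request,
# field 2 is the request itself, field 3 is " 200 2227 " (status and size).
after_request=$(printf '%s\n' "$line" | cut -d\" -f3)
echo "field 3:$after_request"

# awk splits on whitespace, so $1 is the status code.
status=$(printf '%s\n' "$after_request" | awk '{print $1}')
echo "status: $status"
```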

### Most Requested URIs

``` bash Most-requested URIs in a given half hour in 10-minute increments
for i in {0..2} ; do echo "-- 10:$i""x UTC --" ; grep "2013:10:$i" access.log | cut -d\" -f2 | awk '{print $2}' | cut -d\? -f1 | sort | uniq -c | sort -nr | head -n 5 ; echo -e "\n" ; done
```


__Results:__
<pre>
-- 10:0x UTC --
  64 /example/one
  48 /example/two
  32 /example/three
  16 /example/four
   9 /example/five

-- 10:1x UTC --
  62 /example/two
  55 /example/three
  30 /example/six
  27 /example/one
  20 /example/four

-- 10:2x UTC --
  58 /example/one
  34 /example/three
  33 /search
  31 /rss.xml
  27 /example/two
</pre>
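
The `cut -d\? -f1` stage is what keeps query strings from fragmenting the counts. A small sketch with made-up requests shows the effect: two different searches are tallied together as `/search`. The `uris` helper name and the sample entries are hypothetical.

```bash
# Hypothetical sample data: two searches with different query strings.
log=$(mktemp)
cat > "$log" <<'EOF'
1.1.1.1 - - [24/Aug/2013:10:01:00 +0000] "GET /search?q=apples HTTP/1.0" 200 10 "-" "UA"
1.1.1.1 - - [24/Aug/2013:10:02:00 +0000] "GET /search?q=pears HTTP/1.0" 200 10 "-" "UA"
2.2.2.2 - - [24/Aug/2013:10:03:00 +0000] "GET /rss.xml HTTP/1.0" 200 10 "-" "UA"
EOF

# Extract the request path from each entry, dropping anything after "?".
uris() {
    cut -d\" -f2 "$1" | awk '{print $2}' | cut -d\? -f1
}

# Tally the paths, most requested first.
uris "$log" | sort | uniq -c | sort -nr
```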

### Most Active IPs

``` bash Most active IPs in a given half hour in 10-minute increments
for i in {0..2} ; do echo "-- 10:$i""x UTC --" ; grep "2013:10:$i" access.log | awk '{print $1}' | sort | uniq -c | sort -nr | head -n 5 ; echo -e "\n" ; done
```

__Results:__
<pre>
-- 10:0x UTC --
 145 1.1.1.1
  73 2.2.2.2
  29 3.3.3.3
  25 4.4.4.4
  13 5.5.5.5

-- 10:1x UTC --
 153 1.1.1.1
  76 3.3.3.3
  32 5.5.5.5
  29 2.2.2.2
  18 4.4.4.4

-- 10:2x UTC --
 131 2.2.2.2
  61 4.4.4.4
  38 5.5.5.5
  34 3.3.3.3
  10 1.1.1.1
</pre>
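
All four loops share the same shape: pick a window, `grep` for it, then post-process the matches. If you run them often, that shape could be wrapped in a function. This is a sketch only. `per_window` and `top_ips` are hypothetical names, the sample entries are made up, and the `2013:HH:M` pattern is as tied to the year in the log as the original loops are.

```bash
# per_window LOGFILE HOUR COMMAND...
# For each of the first three 10-minute windows of HOUR, print a header and
# feed the matching log lines to COMMAND on stdin.
per_window() {
    logfile=$1 hour=$2
    shift 2
    for i in 0 1 2; do
        echo "-- $hour:${i}x UTC --"
        grep "2013:$hour:$i" "$logfile" | "$@"
        echo
    done
}

# Post-processing step: the five most active client IPs.
top_ips() { awk '{print $1}' | sort | uniq -c | sort -nr | head -n 5; }

# Hypothetical sample data in the log format described above.
log=$(mktemp)
cat > "$log" <<'EOF'
1.1.1.1 - - [24/Aug/2013:10:05:30 +0000] "GET / HTTP/1.0" 200 2227 "-" "UA"
1.1.1.1 - - [24/Aug/2013:10:06:10 +0000] "GET /a HTTP/1.0" 200 100 "-" "UA"
2.2.2.2 - - [24/Aug/2013:10:15:00 +0000] "GET /b HTTP/1.0" 302 50 "-" "UA"
EOF

per_window "$log" 10 top_ips
```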