I recently pieced together a bash wrapper script to create, add, and delete an ssh key for temporary use in performing remote tasks over ssh. The processes outlined here assume cURL is available, and that the remote service you wish to connect to has API methods for ssh key handling.

Automating the process of generating and deleting a local ssh key is the easy part. Here’s one way to create a key:

ssh-keygen -q -b 4096 -t rsa -N "" -f ./scripted.key

Options rundown:

  • -b bit strength -> higher than the default 2048
  • -f filename -> easier to subsequently delete the keypair
  • -N passphrase -> empty string
  • -q quiet mode -> no need to review output
  • -t key type -> specify rsa

And now to delete the newly created key-pair:

rm -r ./scripted.key*
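Since the key only needs to exist for the duration of the script, one option worth considering is a cleanup trap so the keypair is removed even if the script exits early. A minimal sketch, assuming the keypair lives at ./scripted.key:

# Remove the temporary keypair on exit, whether the script succeeds or fails
trap 'rm -f ./scripted.key ./scripted.key.pub' EXIT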

Next we’ll set up the API calls to register and delete a public key on remote services, in this case Github and Acquia.

Github

To automate ssh key deployment on Github, you’ll first want to generate a personal access token under ‘Account settings’ > ‘Applications’. We’ll set a variable to the value of the token for easy re-use:

TOKEN='your-token'

Per the docs, note that the DELETE operation we will eventually employ requires a special admin:public_key permission.

In addition to the token, we’ll set another variable to the value of the ssh public key as follows:

PUBKEY=`cat ./scripted.key.pub`

Now we can cURL the Github API using the TOKEN and PUBKEY variables. Since the whole flow runs as one procedure, and to keep the number of network requests down, we’ll capture the Github API response (which contains the key ID):

RESPONSE=`curl -s -H "Authorization: token ${TOKEN}" \
  -X POST --data-binary "{\"title\":\"nr@blackhole\",\"key\":\"${PUBKEY}\"}" \
  https://api.github.com/user/keys`

And now to extract the key ID:

KEYID=`echo $RESPONSE \
  | grep -o '\"id.*' \
  | grep -o "[0-9]*" \
  | grep -m 1 "[0-9]*"`

Note that the above is more than we really need to be able to parse the Github response. With the Acquia example (coming up next), we’ll see a good reason for setting up the extraction in this manner.

Only one step left, but you may want to add a 10-second sleep to the script to give an opportunity to verify that the key was added before it is deleted.
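If you’d also like the script to confirm that the key actually landed before removing it, a quick check against the same endpoint does the job. A rough sketch (the grep pattern assumes the pretty-printed response Github returns by default):

sleep 10

# Optional sanity check: list registered keys and look for the ID we just captured
curl -s -H "Authorization: token ${TOKEN}" https://api.github.com/user/keys \
  | grep -q "\"id\": ${KEYID}" && echo "Key ${KEYID} is registered"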

And now for the delete:

curl -s -H "Authorization: token ${TOKEN}" -X DELETE \
   https://api.github.com/user/keys/${KEYID} \
  -o /dev/null

Here we’re sending the result to /dev/null to ensure the script stays quiet.
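If you’d rather have a little feedback than total silence, cURL can report just the HTTP status code instead. Something like the following works (a successful key deletion should come back as 204 No Content):

# Print only the HTTP status code; 204 means the key was deleted
curl -s -H "Authorization: token ${TOKEN}" -X DELETE \
  -o /dev/null -w "%{http_code}\n" \
  https://api.github.com/user/keys/${KEYID}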

Acquia

Performing this task with Acquia’s Cloud API is much the same, but with a couple of notable differences.

First, we need to set a couple of additional variables:

CLOUDAPI_ID='id'
CLOUDAPI_KEY='key'
DOCROOT="docroot"
CREDS="${CLOUDAPI_ID}:${CLOUDAPI_KEY}"

Variables set, here’s the cURL command to add the key:

RESPONSE=`curl -s -u $CREDS \
  -X POST --data-binary "{\"ssh_pub_key\":\"${PUBKEY}\"}" \
  https://cloudapi.acquia.com/v1/sites/"${DOCROOT}"/sshkeys.json?nickname=script`

In this case, we’re going to extract 2 pieces of data from the response. We’ll need the task ID to track the status of adding the key, and we’ll also need the key ID (as with the Github example) so that we can delete the key:

TASKID=`echo $RESPONSE \
  | grep -o '\"id.*' \
  | grep -o "[0-9]*" \
  | grep -m 1 "[0-9]*"`

KEYID=`echo $RESPONSE \
  | grep -o "sshkeyid.*" \
  | grep -o "[0-9]*" \
  | grep -m 1 "[0-9]*"`

This is where the extra bash logic comes in handy: the Acquia response is condensed JSON, whereas the Github response is pretty-printed and readable. Since we don’t have things nicely separated into lines, and since we want to minimize dependencies (this is where I’d otherwise recommend jq), the above gives us what we need with fairly low overhead.
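For comparison, if jq were an acceptable dependency, each extraction collapses to a one-liner. A sketch, assuming the field names suggested by the greps above (‘id’ for the Github key ID and the Acquia task ID, ‘sshkeyid’ for the Acquia key ID):

# Github response: key ID
KEYID=`echo $RESPONSE | jq -r '.id'`

# Acquia response: task ID and key ID
TASKID=`echo $RESPONSE | jq -r '.id'`
KEYID=`echo $RESPONSE | jq -r '.sshkeyid'`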

Now to query the task ID so we know when we can start using our key:

STATUS='null'
until [[ $STATUS =~ ^(error|done)$ ]]; do
  STATUS=`curl -s -u $CREDS \
  https://cloudapi.acquia.com/v1/sites/"${DOCROOT}"/tasks/"${TASKID}".json \
  | grep -o 'state.*' \
  | grep -o '[a-z]*' \
  | sed -n 2p`
  echo "ADDING SSH KEY: ${STATUS}"
  sleep 5
done

And finally, here’s the delete:

curl -s -u $CREDS -X DELETE \
  https://cloudapi.acquia.com/v1/sites/"${DOCROOT}"/sshkeys/"${KEYID}".json \
  -o /dev/null

For reference, I set up a Gist that contains complete bash scripts for both services covered above.

Enjoy!

After building a bash script to automate Drupal module deployments, I figured it might be worthwhile to convert the script over to Ruby. I decided to spin up the new version as a Ruby gem leveraging the Thor CLI Framework.

Having already worked out many of the mechanics of deploying Drupal contrib modules in the previous bash script, I was able to dive right into coding. I started by fleshing out the command options and then moved into scripting the functionality. Thor makes it really easy to set up the command interface, though formatting long descriptions can be a little tricky.

In building the script, I wanted to keep as much of the logic in Ruby as possible. The result was many opportunities to explore Ruby and to make some interesting discoveries. The two areas where I was most tempted to shell out were identifying and downloading the “best” version of a contributed Drupal module (drush), and performing version control activities (Git).

In the first case, Nokogiri was an obvious choice for parsing Drupal contrib XML feeds. Fortunately, drupal.org exposes uniform project feeds in the following format:

http://updates.drupal.org/release-history/{project}/{core-version}
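For example, fetching the feed for the memcache project on Drupal 7 (an arbitrary choice, just for illustration) looks like this:

curl -s http://updates.drupal.org/release-history/memcache/7.x | head -n 40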

Reviewing several project feeds, it wasn’t immediately obvious how to parse a feed to select the “best” release, so I referenced the drush source code for pointers:

function updatexml_best_release_found($releases) {
  // If there are releases found, let's try first to fetch one with no
  // 'version_extra'. Otherwise, use all.

The above comment says it all. In the Ruby script, you can see this basic logic is reproduced in contrib.rb (dl method):

    def dl
      doc = Nokogiri::XML(open(@feed).read)
      releases = {}
      # Prefer releases with no 'version_extra' (skip -dev, -alpha, and the like)
      doc.xpath('//releases//release').each do |item|
        if !item.at_xpath('version_extra')
          releases[item.at_xpath('mdhash').content] = item.at_xpath('download_link').content
        end
      end
      # Fall back to every release if none qualified
      if releases.empty?
        doc.xpath('//releases//release').each do |item|
          releases[item.at_xpath('mdhash').content] = item.at_xpath('download_link').content
        end
      end
      return releases.first
    end

For downloads of both XML documents and project archives, I wanted to prevent getting myself (or others) blacklisted through unintentionally DOS’ing drupal.org with lots of requests. Here I decided to lean on a small OpenURI extension called open-uri-cached. The way this is implemented is a bit hacky, but it gets the job done for now. For locating cached project archives, you’ll see that I replicated a small bit of logic from open-uri-cached to find and extract archives:

uri = URI.parse(archive)
targz = "/tmp/open-uri-503" + [ @path, uri.host, Digest::SHA1.hexdigest(archive) ].join('/')

Addressing Git functionality was initially not so straightforward. Following the Git breadcrumbs from Ruby Toolbox, the most obvious place to start is Grit, which “is no longer maintained. Check out rugged.” Rugged was initially promising, but in the end failed to yield a working git push. That left ruby-git as the next logical choice. Fortunately, ruby-git did the trick without much fuss:

    def update
      prj_root = Pathname.new(docroot)
      workdir = prj_root.parent.to_s
      project = File.basename(path)

      g = Git.open(workdir)
      g.branch('master').checkout

      # Gather modified, deleted, and untracked paths
      changes = []
      g.status.changed.keys.each { |x| changes.push x }
      g.status.deleted.keys.each { |x| changes.push x }
      g.status.untracked.keys.each { |x| changes.push x }

      if changes.empty?
        puts "No changes to commit for #{project}"
      else
        g.add(path, :all=>true)
        g.commit("Adds #{project}")
        g.push
      end
    end

There are many improvements left to be made with this script, but so far I’m very happy with the result. Using classes and objects is a breath of fresh air compared to procedural bash, and having this rolled into a gem makes it very easy to share with the team.

Part of the process of migrating new customers to Acquia Hosting involves adding (or verifying the presence of) three Drupal modules: acquia_connector, fast_404, and memcache.

Manual?! Aw, shucks…

Verifying, adding, and committing these modules manually generally takes about five to ten minutes and can be error-prone. I don’t usually stand a site up for this task, but just clone the repo locally, download the modules and move them into place with rsync. This means I can’t lean on Drupal to make the right decisions for me. Mistakes are not a huge deal at this phase, but can add many minutes to an otherwise quick task (assuming we actually catch the mistake!). Mistakes might include adding D7 modules to a D6 site, putting modules in the wrong location, or adding a slightly older version of a module (perhaps with a known security flaw!). Once a mistake has been introduced, we now have to verify the mistake, maybe perform an interactive Git rebase on bad commits, and generally do more work.

In order to ease some of the human error factor of the above scenario, and since this is repetitive and script-worthy, I decided to cobble together a bash script to automate the process. Now the whole task is much less error-prone, and takes all of 5-10 seconds to complete!

The Brainstorm

Below is the basic plan I brainstormed for how I initially thought the script should operate:

get drupal version from prompt
check if acquia_connector, fast_404, memcache already exist in the repo
check contrib modules path - contrib|community
download modules that don't exist and move into place
git add and commit each downloaded module individually
git push to origin

You’ll notice that not all of the above was actually implemented/needed, but it gave a good starting point for setting up the basic mechanics of the script, and served as an anchor when I needed to reset my focus.

Gotta drush That

To simplify the process of downloading and adding the latest version of each module for the correct version of Drupal core, I decided to lean on drush, and particularly drush’s ability to build out a codebase via the make command.

A few important points:

  • in Drupal 6/7, shared contributed modules are generally located at ‘sites/all/modules[/contrib]’
  • drush make receives build instructions via a make file
  • since each project is evaluated individually, we need a make file for each project
  • since make files are static, we need a different set of make files for each version of Drupal, and for each contrib module path

Looking back through the repo history, you’ll see that my initial approach was to generate static make files for each Drupal version, project, and project path. You’ll also see that I included a secondary script to generate a new set of make files for those rare times when a codebase uses a contrib path such as ‘sites/all/modules/community’ or some other variant. Fortunately, there is a better way!

A Better Way

In bash, we can define dynamic make files as heredocs. By making this shift, I was able to trim the 12+ static make files, along with the secondary bash script, down to two heredocs:

function makefile() {
  if [[ $3 == 'modules' ]]; then
    cat <<EOF
core = $1.x
api = 2

; Modules
projects[] = $2
EOF
  else
    cat <<EOF
core = $1.x
api = 2

; Modules
projects[$2][subdir] = $3
EOF
  fi
}
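As a quick illustration, calling the function with a core version, a project, and a contrib path prints a complete make file to stdout (memcache on Drupal 7 with a ‘contrib’ subdirectory, picked arbitrarily here):

$ makefile 7 memcache contrib
core = 7.x
api = 2

; Modules
projects[memcache][subdir] = contrib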

In order to support the shift to heredocs, I also had to convert the drush command from referencing static files to processing make files via STDIN. Thanks to this comment, I ended up with the following:

for i in "${DIFF[@]}"; do
  makefile $VERSION $i $CONTRIB \
    | $DRUSH make --no-core -y php://stdin
done
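For context, $DIFF holds the modules that still need to be added, $VERSION the Drupal core major version, $CONTRIB the contrib path, and $DRUSH the drush executable. Here is a rough sketch of how the missing-module list might be assembled; the variable names mirror the loop above, but the check itself is illustrative rather than the script’s exact logic:

# Hypothetical: queue only the modules not already present in the codebase
MODULES=(acquia_connector fast_404 memcache)
DIFF=()
for m in "${MODULES[@]}"; do
  if ! find sites/all/modules -maxdepth 2 -type d -name "$m" | grep -q .; then
    DIFF+=("$m")
  fi
done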

And with that, we have a powerful and dynamic bash script that will save lots of time, and can be easily expanded or improved to handle additional modules and use cases. I also set the repo up to be a collection of helpful scripts, and I very much look forward to automating away additional complexities.

Below is a Github Gist with a map built from GeoJSON data, along with the bash command that was used to create it.

Github automatically generates maps for GeoJSON files in Gists and repositories. This is well documented at the following article: https://help.github.com/articles/mapping-geojson-files-on-github

The map below pulls data from the eBird API, transforms it into GeoJSON using jq, and then POSTs the result to a Github Gist.
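The exact command lives in the Gist itself; as a rough sketch of the shape of that pipeline (the eBird endpoint, query parameters, field names, output filename, and the personal access token in $TOKEN below are assumptions for illustration, not the command from the Gist):

# 1) Fetch recent observations near a point
curl -s "https://ebird.org/ws1.1/data/obs/geo/recent?lat=42.35&lng=-71.06&fmt=json" > obs.json

# 2) Reshape the records into a GeoJSON FeatureCollection with jq
jq '{type: "FeatureCollection",
     features: [.[] | {type: "Feature",
                       geometry: {type: "Point", coordinates: [.lng, .lat]},
                       properties: {title: .comName}}]}' obs.json > birds.geojson

# 3) POST the file to the Gist API so Github renders it as a map
jq -R -s '{public: true, files: {"birds.geojson": {content: .}}}' birds.geojson \
  | curl -s -H "Authorization: token ${TOKEN}" -X POST --data-binary @- https://api.github.com/gists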

Previously, I covered using git filter-branch to remove large media assets from a repository’s history.

Recently I had a new opportunity to perform this task on whole directories. Here is the command sequence I used to clean up the repo history and to shrink the pack file:

# Remove a directory from history in all branches and tags
$ git filter-branch --index-filter 'git rm -r --cached --ignore-unmatch path/to/dir' --tag-name-filter cat -- --all

# Shrink the local pack
$ rm -Rf .git/refs/original
$ rm -Rf .git/logs
$ git gc
$ git prune --expire now

# Push up changes to the remote
$ git push origin --force --all
$ git push origin --force --tags
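One caveat worth keeping in mind: after the force push, existing clones still carry the old history. Collaborators will want to either re-clone, or fetch and hard-reset their local branches onto the rewritten ones, something like:

# On each collaborator's clone, after the rewrite has been pushed
$ git fetch origin
$ git reset --hard origin/master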

Additionally, it turns out I only had a partial list of assets to remove from the repo. I was able to track down the rest with a couple of additional Git commands:

# Show the 20 largest items in the pack
$ git verify-pack -v .git/objects/pack/pack-{hash}.idx | sort -k 3 -n | tail -n 20

# View a pack object
$ git rev-list --objects --all | grep {hash}
{hash} path/to/pack/object.ext
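To avoid looking up each large object by hand, the two commands can also be stitched together. A quick sketch that maps the 20 largest pack objects directly to their paths:

# Join the largest pack objects with the paths they correspond to
$ join \
    <(git verify-pack -v .git/objects/pack/pack-{hash}.idx | sort -k 3 -n | tail -n 20 | awk '{print $1}' | sort) \
    <(git rev-list --objects --all | sort)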

Rinse and repeat.