More Fun with 'git filter-branch'

Reading time ~1 minute

Previously, I covered the use of the git filter-branch for removing large media assets from a repository’s history.

Recently I had a new opportunity to perform this task on whole directories. Here is the command sequence I used to clean up the repo history and to shrink the pack file:

# Remove a directory from history in all branches and tags
$ git filter-branch --index-filter 'git rm -r --cached --ignore-unmatch path/to/dir' --tag-name-filter cat -- --all

# Shrink the local pack
$ rm -Rf .git/refs/original
$ rm -Rf .git/logs
$ git gc
$ git prune --expire now

# Push up changes to the remote
$ git push origin --force --all
$ git push origin --force --tags

Additionally, it turns out I only had a partial list of assets to remove from the repo. I was able to track down the rest with a couple of additional Git commands:

# Show the 20 largest items in the pack
$ git verify-pack -v .git/objects/pack/pack-{hash}.idx | sort -k 3 -n | tail -n 20

# View a pack object
$ git rev-list --objects --all | grep {hash}
{hash} path/to/pack/object.ext

Rinse and repeat.

comments powered by Disqus