Git hooks: How to automate actions in your Git repo

Protect your Git repository from mistakes, automate manual processes, gather data about Git activity, and much more with Git hooks.

If you administer a Git server, you know that lots of unexpected tasks come up over the lifecycle of a repository. Contributors commit to the wrong branch, a project manager wants an approval process, developers need a specific review process, or you need certain actions to run after a successful push. Git provides lots of little convenience features for situations like these, but to take advantage of them, you need to learn about Git hooks.

Git hooks are executable scripts, most often shell scripts, found in the hidden .git/hooks directory of a Git repository. Git runs these scripts in response to specific events, so they can help you automate your development lifecycle.

Although you may never have noticed them, every Git repository includes a set of sample scripts (12 in the listing below; the exact number depends on your Git version). Because they're shell scripts, they're extremely flexible, and there is even some Git-specific data you have access to within a Git repository.

Create a Git repository

To get started, create a sample Git repository:

$ mkdir example
$ cd !$
$ git init

Take a look at the .git/hooks directory to see some default scripts:

$ ls -1 .git/hooks/
applypatch-msg.sample
commit-msg.sample
fsmonitor-watchman.sample
post-update.sample
pre-applypatch.sample
pre-commit.sample
pre-merge-commit.sample
prepare-commit-msg.sample
pre-push.sample
pre-rebase.sample
pre-receive.sample
update.sample

The sample Git hooks included in your new repo indicate common triggers available to you. A sample stays inactive until you enable it by removing the .sample extension. For instance, when enabled, the pre-commit script executes after you run git commit but before the commit is created, and the commit-msg script executes after a commit message has been submitted.
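One way to enable a sample is to copy it to a file without the .sample extension and mark it executable. The helper below is a minimal sketch of that idea. The function name enable_sample_hook is hypothetical and not part of Git.

```shell
# enable_sample_hook HOOKS_DIR NAME  (hypothetical helper, not a Git command)
# Copy NAME.sample to NAME and make it executable.
# Refuses to overwrite a hook that already exists.
enable_sample_hook() {
    sample="$1/$2.sample"
    target="$1/$2"
    if [ -e "${target}" ]; then
        echo "${target} already exists" >&2
        return 1
    fi
    cp "${sample}" "${target}" && chmod +x "${target}"
}
```

From the top of a repository, you might call it as enable_sample_hook .git/hooks commit-msg. Copying rather than renaming keeps the original sample around for reference.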

Write a simple Git hook

Do you want to put in guardrails to prevent mistakes when making commits to your Git repository? A simple Git hook trick is to prompt the user for confirmation before they commit something to a branch.

Create a new file named .git/hooks/pre-commit and open it in a text editor. Add the following text, which queries Git for a list of the files about to be committed and for the current branch name, and then enters a while loop until it gets a response from the user:

#!/bin/sh

echo "You are about to commit" $(git diff --cached --name-only --diff-filter=ACM)
echo "to" $(git branch --show-current)

while : ; do
    # read -p is not POSIX, so print the prompt separately
    printf "Do you really want to do this? [y/n] "
    # a hook has no terminal on stdin, so read from /dev/tty instead
    read RESPONSE < /dev/tty || exit 1
    case "${RESPONSE}" in
        [Yy]* ) exit 0;;
        [Nn]* ) exit 1;;
    esac
done

Mark the file executable:

$ chmod +x .git/hooks/pre-commit

And then try it out by creating, adding, and committing a file:

$ echo "hello git hooks" > hello.txt
$ git add hello.txt
$ git commit -m 'but warn me first'
You are about to commit hello.txt
to main
Do you really want to do this? [y/n] y
[main 125993f] but warn me first
 1 file changed, 1 insertion(+)
 create mode 100644 hello.txt

You can test it a second time to ensure that it lets you decline a commit.
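If you want to exercise the yes/no logic without sitting at a terminal, one option is to move it into a function that reads its answer from standard input. This is only a sketch. The name confirm is an assumption of mine, not something Git provides.

```shell
# confirm PROMPT  (hypothetical helper)
# Print PROMPT and read an answer from standard input.
# Returns 0 for yes, 1 for no or when input runs out.
confirm() {
    while :; do
        printf '%s [y/n] ' "$1"
        read -r answer || return 1
        case "${answer}" in
            [Yy]*) return 0 ;;
            [Nn]*) return 1 ;;
        esac
    done
}
```

Inside a hook, you would still redirect the terminal in, for example confirm "Do you really want to do this?" < /dev/tty || exit 1. In a test, you can pipe the answer in with printf 'n\n' | confirm "Continue?".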

Check commits for binary data

Some binary data in a repository is generally acceptable, but some projects have so much of it that committing it all would bloat the repository and slow down Git operations. I use Git-portal to help manage this, but I also have a pre-commit hook to double-check that no binary blobs are making it into a commit:

#!/usr/bin/env bash

declare -a FILES
declare -a HIT

echo "Git hook executing: pre-commit..."

# read staged filenames into the FILES array, one per line,
# so that names containing spaces stay intact
while IFS= read -r f; do
    FILES+=("${f}")
done < <(git diff --cached --name-only --diff-filter=ACM)

for i in "${FILES[@]}"; do
    # file --mime reports charset=binary for non-text data
    if file --mime "${i}" | grep -qi 'charset=binary'; then
        HIT+=("${i}")
        echo "${i} appears to be a binary blob."
    elif [[ "${i}" == *"blah" ]]; then
        true
        # do some stuff here
    else
        true
        # do some other stuff here
    fi
done

# reject the commit only after every staged file has been checked
if [ ${#HIT[@]} -gt 0 ]; then
    echo "  WARNING: Binary data found"
    exit 1
fi

The script I use is longer than this and has allowances for Git-portal integration, but this small sample is fully functional and gives you a good idea of the logic. The script reviews each staged file (obtained with a simple git diff command), uses the file command to determine whether it's binary or not, and then takes action accordingly.

An interesting quirk of empty files is that they are treated as binary objects:

$ touch quux
$ file --mime quux
quux: inode/x-empty; charset=binary

If you're testing this script, make sure you know the MIME type of what you're testing, because an empty placeholder file is reported as binary too.
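If you would rather let empty files through, one approach is to classify the MIME string yourself instead of grepping for the word binary. The function below is a sketch under that assumption. The name is_binary_mime is hypothetical, and it expects the output of file -b --mime (the -b option omits the filename).

```shell
# is_binary_mime MIME_STRING  (hypothetical helper)
# MIME_STRING is the output of: file -b --mime FILE
# Returns 0 when the string describes binary data,
# and 1 for text or for an empty file.
is_binary_mime() {
    case "$1" in
        *inode/x-empty*)  return 1 ;;  # empty file, not a real blob
        *charset=binary*) return 0 ;;
        *)                return 1 ;;
    esac
}
```

In a hook, the check could then read: if is_binary_mime "$(file -b --mime "${i}")"; then ... fi.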

Push hooks

Not everything has to happen at commit time. The act of pushing to a repository is also a valid trigger. A pre-push hook is called with parameters, so for some values, you don't have to query Git the way you might for a pre-commit hook. The parameters are:

  • $1 = The name of the remote being pushed to
  • $2 = The URL of the remote

If you push without using a named remote, those arguments are equal.

Information about the commits being pushed is provided as lines to the standard input in this format:

<local ref> <local sha1> <remote ref> <remote sha1>
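As a rough illustration of reading that input, the sketch below prints every remote ref that is receiving commits and skips branch deletions, which Git reports with a local ref of (delete). The function name pushed_refs is my own and is not part of Git.

```shell
# pushed_refs  (hypothetical helper)
# Read pre-push lines from standard input and print the remote ref
# of each one, except for refs that are being deleted.
pushed_refs() {
    while read -r local_ref local_sha remote_ref remote_sha; do
        if [ "${local_ref}" = "(delete)" ]; then
            continue
        fi
        printf '%s\n' "${remote_ref}"
    done
}
```

Because a single push can update several refs, a hook can loop over this output and react to each ref rather than to only the last line it read.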

I use a pre-push hook to synchronize offline storage with Git pushes. When a user pushes a commit, their local stash of binary blobs (3D models, 4K images, and other artifacts too large for the Git repo) is copied to the remote storage mirror. This script does that:

#!/usr/bin/env bash
shopt -s nullglob
declare -a HIT
declare -a STORAGE

# look up the URL of each remote named in HIT and store it in STORAGE
function populate() {
    for i in "${HIT[@]}"; do
        STORAGE+=("`git ls-remote --get-url "${i}"`")
    done
}

# copy the portal directory to each storage location
function portalsync() {
    for i in "${STORAGE[@]}"; do
        echo "Syncing _portal content to ${i}"
        rsync --rsh=ssh -av --exclude-from="${TOP}/.portalexclude" --progress "${TOP}/${PORTAL}" "${i}" || echo "rsync failed."
    done
}

# Git sends one line per ref being pushed; this keeps the last remote ref
while read local_ref local_sha remote_ref remote_sha
do
    REF=$remote_ref
done

if [ "${REF}" = "refs/heads/master" ]; then
    echo "master destination detected"

    TOP=`git rev-parse --show-toplevel || false`
    PORTAL=_img

    # collect the names of remotes containing "_portal" into the HIT array
    HIT=(`git remote | grep "_portal"`)

    if [ ${#HIT[@]} -gt 0 ]; then
        populate
        echo "Syncing _portal content..."
        portalsync
    fi
fi

Despite the apparent verbosity of the script, it's actually pretty simple because it relies on the information Git hands to pre-push, in this case the ref details that arrive on standard input. The one thing that Git doesn't provide automatically is the top level of the repo, which can make referring to file paths tricky. My hack for this is simple:

TOP=`git rev-parse --show-toplevel || false`

This sets the variable $TOP to the absolute path of the repository's top-level directory.

With it, you can build absolute paths starting at $TOP.

Git hooks

It's important to note that Git hooks aren't committed to a Git repository themselves. They're local, untracked files. When you write an important hook that you want to keep around, copy it into a directory managed by Git!
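One way to keep hooks under version control is to store them in a tracked directory and copy them into place after cloning. The sketch below assumes a tracked directory named hooks at the top of the repository. Both that directory name and the install_hooks function are hypothetical.

```shell
# install_hooks SRC DEST  (hypothetical helper)
# Copy every regular file in SRC into DEST and mark it executable.
install_hooks() {
    src=$1
    dest=$2
    for hook in "${src}"/*; do
        [ -f "${hook}" ] || continue
        cp "${hook}" "${dest}/" &&
            chmod +x "${dest}/$(basename "${hook}")"
    done
}
```

From the top of the repository, you might run install_hooks hooks .git/hooks. Git 2.9 and later also offer the core.hooksPath setting (git config core.hooksPath hooks), which tells Git to read hooks from the tracked directory directly, so no copying is needed.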

Git hooks are an important aspect of Git that is too often overlooked because they are hidden away. Although only a dozen or so are bundled as samples in a repository, there are many more kinds of hooks you can use, so run the man githooks command for details on the kinds of triggers available. Once you feel comfortable using Git hooks, they can protect your Git repository from silly mistakes, automate manual processes, gather data about Git activity, and do much more.

Seth Kenlon

Seth Kenlon is a UNIX geek and free software enthusiast.
