Clean Up Old Branches in GitHub

04-14-2020

Git and GitHub are tools I use a lot in my day-to-day work and in the training sessions I lead. I often let participants in those sessions create and push code to a new branch in GitHub, maybe for a code review, or to use a CI/CD pipeline I have set up. Pretty soon I have a lot of old branches that I need to clean up. Here is a bash script I built, with help from some friends on StackOverflow, to clean up any branches that are older than a given period of time, for example a month. The script finds all branches that meet the criteria and removes them from the remote server. This helps keep my remote nice and clean.

The script:

#!/bin/bash

# TO USE: ./cleanup-old-repos.sh --DRY_RUN < [true] | false > --DIRECTORY < relative path to directory with GIT repo and remote >
# Params:
# - --DRY_RUN, Boolean, test the results (true) or execute commands to cleanup old repos (false), default: true
# - --DIRECTORY, String, path to repo to cleanup, default: ""

# EX Use: ./cleanup-old-repos.sh --DRY_RUN false --DIRECTORY my-application/
DRY_RUN=${DRY_RUN:-true}
DIRECTORY=${DIRECTORY:-""}
DELAY_BEFORE="1 month ago"

# Set Default values
ECHO='echo '

# Assign arguments to their values
while [ $# -gt 0 ]; do

  # Only arguments that start with "--" are treated as parameter names
  if [[ $1 == --* ]]; then
    param="${1#--}"
    declare "$param=$2"
  fi

  shift
done

# Check if $DIRECTORY is assigned
if [ -z "$DIRECTORY" ]; then
  echo "No directory supplied"
  exit 1
fi

# Change to directory containing a .git directory and remote
cd "$DIRECTORY" || exit 1

# Perform Cleanup of branches if DRY_RUN = false
if [[ "$DRY_RUN" = "false" ]]; then
  ECHO=""
fi

for branch in $(git branch -a | sed 's/^\s*//' | sed 's/^remotes\///' | grep -v 'master$'); do
  # A branch is stale when it has no commits newer than $DELAY_BEFORE
  if [[ "$(git log "$branch" --since "$DELAY_BEFORE" | wc -l)" -eq 0 ]]; then
    # "remotes/" was already stripped above, so only "origin/" is left to remove
    local_branch_name="${branch#origin/}"
    echo "$local_branch_name"
    $ECHO git branch -d "$local_branch_name"
    $ECHO git push origin --delete "$local_branch_name"
  fi
done

Let’s break this down.

At the top of the file I’m defining the variables I will need throughout the script. Nothing complex here: I am setting default values for anything the user doesn’t define. You’ll notice the DRY_RUN variable. For any script that takes a destructive action, i.e. deleting something, I think it’s best practice to be able to see what the script is going to do before it actually does anything. We’ll use the DRY_RUN flag a little later.

DRY_RUN=${DRY_RUN:-true}
DIRECTORY=${DIRECTORY:-""}
DELAY_BEFORE="1 month ago" # Value provided by GIT

# Set Default values
ECHO='echo '
...
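
To make the defaults and the echo prefix a bit more concrete, here is a minimal sketch of how they work together. The `maybe_run` helper and the `demo-branch` name are hypothetical, introduced only for this illustration; they are not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper for illustration; not part of the original script.
# Usage: maybe_run <dry_run> <command> [args...]
maybe_run() {
  local dry_run="${1:-true}"   # empty or missing first argument falls back to "true"
  shift
  local prefix='echo '         # same idea as ECHO='echo ' in the script
  if [[ "$dry_run" = "false" ]]; then
    prefix=""                  # empty prefix: the command itself is executed
  fi
  $prefix "$@"
}

maybe_run ""      git branch -d demo-branch   # only prints the command
maybe_run "true"  git branch -d demo-branch   # only prints the command
maybe_run "false" echo "this really ran"      # executes the command
```

Because `DRY_RUN=${DRY_RUN:-true}` keeps any value that is already set, the flag can also come from the environment, for example `DRY_RUN=false ./cleanup-old-repos.sh --DIRECTORY my-application/`.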

Next, I’m assigning values to named parameters provided by the user.

...
# Assign arguments to their values
while [ $# -gt 0 ]; do

  # Only arguments that start with "--" are treated as parameter names
  if [[ $1 == --* ]]; then
    param="${1#--}"
    declare "$param=$2"
  fi

  shift
done
...

I like to do this for simplicity and a better experience for the developer using the script. The developer can pass named parameters in any order after the script name. For example, --DIRECTORY my-application/ sets the DIRECTORY variable to the local directory my-application/.
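
As a rough way to try the parsing on its own, the same loop can be wrapped in a function. This is only a sketch: `parse_args` is a hypothetical name, and it swaps `declare` for `printf -v`, because `declare` inside a function would create a local variable that disappears when the function returns.

```shell
#!/bin/bash
# parse_args is a hypothetical wrapper around the article's loop, for illustration only.
parse_args() {
  local param
  while [ $# -gt 0 ]; do
    if [[ $1 == --* ]]; then
      param="${1#--}"               # "--DIRECTORY" becomes "DIRECTORY"
      printf -v "$param" '%s' "$2"  # assign the next argument to that variable name
    fi
    shift
  done
}

# Parameters can be given in any order.
parse_args --DIRECTORY my-application/ --DRY_RUN false
echo "DIRECTORY=$DIRECTORY DRY_RUN=$DRY_RUN"
```

Note that this style accepts any name after `--`, so a typo such as `--DIRECTROY` is silently assigned to a variable the script never reads.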

After a conditional check to make sure we have the information we need, the bulk of the cleanup happens here:

...
# Perform Cleanup of branches if DRY_RUN = false
if [[ "$DRY_RUN" = "false" ]]; then
  ECHO=""
fi

for branch in $(git branch -a | sed 's/^\s*//' | sed 's/^remotes\///' | grep -v 'master$'); do
  # A branch is stale when it has no commits newer than $DELAY_BEFORE
  if [[ "$(git log "$branch" --since "$DELAY_BEFORE" | wc -l)" -eq 0 ]]; then
    # "remotes/" was already stripped above, so only "origin/" is left to remove
    local_branch_name="${branch#origin/}"
    echo "$local_branch_name"
    $ECHO git branch -d "$local_branch_name"
    $ECHO git push origin --delete "$local_branch_name"
  fi
done

You’ll notice the DRY_RUN flag is used here. If DRY_RUN is set to false (not the default value, you have to want it), the script loops through all branches other than “master” that have had no commits within the cutoff period, and removes each one from our local copy of the repository and from the remote repository. With the default of true, it only prints the commands it would run. We can extend the set of branch names that we want to “protect” from the script by adding them to the grep filter here:

...
for branch in $(git branch -a | sed 's/^\s*//' | sed 's/^remotes\///' | grep -v 'master$'); do
...
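
One possible way to protect several branches is to swap the single pattern for an extended regular expression. The sketch below assumes `main` and `develop` as extra protected names (they are examples, not names from my repositories), and `filter_protected` is a hypothetical helper so the filter can be tried without a repository.

```shell
#!/bin/bash
# filter_protected is a hypothetical helper; "main" and "develop" are example names.
# The (^|/) anchor matches whole branch names only, with or without an "origin/" prefix.
filter_protected() {
  grep -v -E '(^|/)(master|main|develop)$'
}

# Simulated branch list, one name per line, as the sed pipeline would produce it.
printf '%s\n' origin/master origin/main origin/develop origin/feature-x webmaster |
  filter_protected
```

Here only `origin/feature-x` and `webmaster` pass through the filter. The plain `master$` pattern in the script would also skip `webmaster`, since it matches any name that merely ends in "master".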

This script has become incredibly valuable for maintaining a clean set of repositories for all of my projects.