Permanently remove files and folders from a git repository

A few weeks ago I froze the gems in my blog project and ended up with a very large repository, so I wanted to clean up the mess and permanently remove the gems folder from it. "git rm" doesn't do the job: it only removes the folder from the working tree and the index, while the repository history still contains the folder's objects. After a quick search, I found that git filter-branch was the command I was looking for.

So, you can permanently remove a folder from a git repository with:

git filter-branch --tree-filter 'rm -rf vendor/gems' HEAD

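This walks through every commit in the history reachable from HEAD, checks out each one, runs the given command in the checked-out tree, and rewrites the commit with the resulting tree. Because a rewritten commit gets a new hash, all of its descendants are rewritten as well.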

We pass -r (recursive) to rm to remove the folder and everything under it, and -f (force) so that rm ignores nonexistent files, since the folder may not exist yet in some of the commits we are filtering.
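
To try this safely before touching a real project, here is a minimal sketch that builds a throwaway repository and runs the same filter on it. The temporary directory, the file names, the commit messages and the demo identity are all made up for illustration; FILTER_BRANCH_SQUELCH_WARNING only skips the deprecation pause that newer git versions print before filter-branch starts.

```shell
# Throwaway repository; every path and message below is illustrative.
export FILTER_BRANCH_SQUELCH_WARNING=1
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit 1: vendor/gems does not exist yet, which is why rm needs -f.
echo "hello" > README
git add README
git commit -q -m "initial commit"

# Commit 2: freeze a (fake) gem into the repository.
mkdir -p vendor/gems
echo "fake gem" > vendor/gems/demo.gem
git add vendor/gems
git commit -q -m "freeze gems"

# Rewrite every commit reachable from HEAD without the folder.
git filter-branch --tree-filter 'rm -rf vendor/gems' HEAD

# Count the commits in the rewritten history that still touch the folder.
remaining=$(git log --oneline HEAD -- vendor/gems | wc -l | tr -d ' ')
echo "commits touching vendor/gems: $remaining"
```

Note that the "freeze gems" commit is still in the history afterwards; it simply no longer changes anything.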

You can also restrict the filter to a range of commits:

git filter-branch --tree-filter 'rm -rf vendor/gems' 7b3072c..HEAD

The commit at the start of the range (7b3072c here) is itself not filtered; only the commits after it, up to HEAD, are rewritten.
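
A quick way to see which commits a range covers is git rev-list, which accepts the same range syntax. The sketch below is only an illustration on a throwaway repository: the three commits, the file names and the $start variable are assumptions of the example. It filters everything after the root commit and then checks that the root commit kept its hash.

```shell
# Throwaway repository with three commits; names are illustrative.
export FILTER_BRANCH_SQUELCH_WARNING=1
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
for n in 1 2 3; do
  echo "$n" > "file$n"
  git add "file$n"
  git commit -q -m "commit $n"
done

# Use the root commit as the start of the range.
start=$(git rev-list --max-parents=0 HEAD)

# Commits that the range selects: everything after $start, up to HEAD.
in_range=$(git rev-list --count "$start..HEAD")
echo "commits in range: $in_range"

# Filter only that range; $start itself is left alone.
git filter-branch --tree-filter 'rm -f file2' "$start..HEAD"

# The root commit should still have the same hash as before.
root_after=$(git rev-list --max-parents=0 HEAD)
echo "root unchanged: $([ "$root_after" = "$start" ] && echo yes || echo no)"
```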

If you run another branch filter afterwards, you have to pass the -f option to filter-branch to overwrite the backup in refs/original/, where git stores the original refs from the previous branch filter.

git filter-branch -f --tree-filter 'rm -rf vendor/gems' HEAD

You can also remove the original refs by hand, or back them up to another location first.

rm -rf .git/refs/original/
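
Removing the backup refs alone does not shrink the repository: the old commits are still referenced from the reflog, and their objects stay on disk until git garbage-collects them. The sketch below follows the steps the git filter-branch documentation lists for shrinking a repository (drop the backup refs, expire the reflog, run gc with --prune=now), again on a throwaway repository whose file names and $blob variable are made up for the example. Only do this once you are sure you no longer need the old history.

```shell
# Throwaway repository; every path and message below is illustrative.
export FILTER_BRANCH_SQUELCH_WARNING=1
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
echo "keep" > README
git add README
git commit -q -m "initial commit"
mkdir -p vendor/gems
echo "fake gem" > vendor/gems/demo.gem
git add vendor/gems
git commit -q -m "freeze gems"

# Remember the object id of the file we want gone for good.
blob=$(git rev-parse HEAD:vendor/gems/demo.gem)

git filter-branch --tree-filter 'rm -rf vendor/gems' HEAD

# 1. Drop the backup refs written by filter-branch.
rm -rf .git/refs/original/
# 2. Expire all reflog entries, which still point at the old commits.
git reflog expire --expire=now --all
# 3. Garbage-collect and prune unreachable objects immediately.
git gc -q --prune=now

# Check whether the old blob still exists in the object database.
if git cat-file -e "$blob" 2>/dev/null; then
  status="still present"
else
  status="gone"
fi
echo "old blob is $status"
```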

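Permanently removing a file from the repository works the same way as for a folder. Keep -f here too: without it, rm fails in commits where the file does not exist, and filter-branch aborts as soon as the filter command returns a non-zero exit status.

git filter-branch --tree-filter 'rm -f filename' HEAD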

There are several filter types (see the git filter-branch documentation); the one we use here, --tree-filter, rewrites the tree and its contents. You can also use --index-filter, which is similar to --tree-filter but does not check out the tree, so it runs much faster.

git filter-branch --index-filter 'git rm --cached --ignore-unmatch filename' HEAD

The --ignore-unmatch option makes git rm exit successfully even if the file does not exist in a given commit.
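
The same index filter works for a folder if you add -r to git rm. Here is a small sketch on a throwaway repository, reusing the vendor/gems path from above; the other file names and the demo identity are assumptions of the example.

```shell
# Throwaway repository; every path and message below is illustrative.
export FILTER_BRANCH_SQUELCH_WARNING=1
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
echo "hello" > README
git add README
git commit -q -m "initial commit"
mkdir -p vendor/gems
echo "fake gem" > vendor/gems/demo.gem
git add vendor/gems
git commit -q -m "freeze gems"

# -r removes the folder recursively from the index of every commit;
# --ignore-unmatch keeps git rm from failing where the folder is absent.
git filter-branch --index-filter \
  'git rm -r -q --cached --ignore-unmatch vendor/gems' HEAD

# Count the files under vendor/gems that are left in the rewritten HEAD.
left=$(git ls-tree -r --name-only HEAD | grep -c '^vendor/gems/' || true)
echo "files left under vendor/gems: $left"
```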

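Finally, don't forget to push the changes to the remote repository with --force: the rewritten history is no longer a fast-forward of what is on the remote, so a plain push would be rejected. Anyone else who has cloned the repository will have to re-clone it or rebase their work onto the new history.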

git push origin master --force