Introduction
BFG Repo-Cleaner is a specialized tool for removing unwanted data from git history. Written in Scala, it focuses on two common tasks: deleting large files that bloat your repository and scrubbing sensitive strings like passwords or API keys. It runs significantly faster than git-filter-branch while being simpler to use.
What BFG Does
- Removes files exceeding a specified size from every commit in history
- Replaces sensitive text (passwords, tokens) with ***REMOVED*** across all commits
- Deletes specific files by name from the entire history
- Protects the current HEAD commit by default so your latest code is never modified
- Produces a detailed report showing exactly what was changed and where
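The first three tasks above each correspond to a single BFG flag. The following is a minimal sketch, assuming the JAR has been saved as bfg.jar (a placeholder path) and that the cleaned repository is a local mirror clone; the wrapper function names are illustrative only and are not part of BFG.

```shell
# Hypothetical location of the downloaded JAR; override via the environment.
BFG_JAR="${BFG_JAR:-bfg.jar}"

# Remove every blob larger than the given size (e.g. 100M) from history.
strip_large_files() {
  java -jar "$BFG_JAR" --strip-blobs-bigger-than "$1" "$2"
}

# Delete files by name (or name glob such as '*.pem') from history.
delete_files_by_name() {
  java -jar "$BFG_JAR" --delete-files "$1" "$2"
}

# Replace every expression listed in a text file with ***REMOVED***.
scrub_text() {
  java -jar "$BFG_JAR" --replace-text "$1" "$2"
}
```

For example, strip_large_files 100M my-repo.git would target blobs over 100 MB in the mirror clone my-repo.git (both values are placeholders).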
Architecture Overview
BFG reads the git object database directly using JGit (a pure Java git implementation) rather than checking out each commit. It processes objects in parallel, which makes it 10-720x faster than git-filter-branch on typical repositories. The tool outputs a new set of cleaned git objects and rewrites the branch references to point to them.
Self-Hosting & Configuration
- Requires Java 8 or later; download the single JAR file from Maven Central
- Always operate on a bare mirror clone to preserve the original as a backup
- Create a text file with one secret per line for --replace-text operations
- Run git gc after cleaning to physically remove the old objects
- Force push to update the remote, then have all collaborators re-clone
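The steps above can be sketched as one script. This is an illustration under stated assumptions, not a definitive procedure: the repository URL, JAR path, file names, and example secrets are all made up, and the reflog expiry before git gc follows common BFG usage rather than anything stated in this document. The sketch defaults to a dry run that only prints each command.

```shell
# Print each command; execute it only when DRY_RUN=0 (dry run is the default).
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Write an example expressions file, one expression per line (values are made up).
# A plain line is matched literally, '==>' sets a custom replacement,
# and the 'regex:' prefix enables a regular expression.
write_secrets_file() {
  cat > "$1" <<'EOF'
hunter2
AKIAEXAMPLEKEY==>REDACTED_KEY
regex:token=[0-9a-f]{32}
EOF
}

# Mirror-clone, scrub, garbage-collect, and push.
clean_secrets() {
  local url="$1" jar="$2" secrets="$3"
  local mirror
  mirror="$(basename "$url")"   # assumes the URL ends in .git
  run git clone --mirror "$url"
  run java -jar "$jar" --replace-text "$secrets" "$mirror"
  run git -C "$mirror" reflog expire --expire=now --all
  run git -C "$mirror" gc --prune=now --aggressive
  run git -C "$mirror" push     # a mirror clone pushes every ref
}
```

Calling clean_secrets with a URL, a JAR path, and an expressions file prints the five commands in order; setting DRY_RUN=0 runs them for real. Keeping a separate, untouched copy of the original clone remains the backup.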
Key Features
- Single JAR file with no installation step beyond having Java
- Protects your latest commit by default to prevent accidental data loss
- Handles repositories with millions of commits efficiently via parallel processing
- Can target specific branches or process the entire history at once
- Clear human-readable output showing every file and text replacement made
Comparison with Similar Tools
- git-filter-repo — broader rewriting capabilities (path renames, commit editing); BFG is simpler for the specific tasks of removing files and secrets
- git-filter-branch — the legacy built-in tool; dramatically slower and more error-prone
- GitHub secret scanning — detects secrets but does not remove them from history; BFG actually rewrites the commits
- TruffleHog / Gitleaks — find secrets in history; use BFG afterward to remove what they discover
FAQ
Q: Will BFG modify my latest commit?
A: No. By default it protects HEAD. Use --no-blob-protection to override this, but do so with caution.
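Because HEAD is protected, an unwanted file that still exists in the latest commit survives a default run. A hedged sketch of the two ways to handle that follows; the function names, paths, and commit message are hypothetical, and removing the file with an ordinary commit first is generally the safer route.

```shell
# Hypothetical location of the downloaded JAR; override via the environment.
BFG_JAR="${BFG_JAR:-bfg.jar}"

# Option 1: remove the file from the tip with a normal commit and push it,
# so that HEAD is already clean before BFG rewrites the older history.
fix_head_first() {
  local worktree="$1" file="$2"
  git -C "$worktree" rm --quiet "$file"
  git -C "$worktree" commit --quiet -m "Remove $file"
  git -C "$worktree" push
}

# Option 2: let BFG rewrite the contents of HEAD as well (use with caution).
clean_including_head() {
  local file="$1" mirror="$2"
  java -jar "$BFG_JAR" --delete-files "$file" --no-blob-protection "$mirror"
}
```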
Q: Can I undo a BFG run?
A: If you kept the original bare clone, yes. Otherwise, the old history is gone after git gc and force push.
Q: Does BFG work with Git LFS?
A: Yes. You can use BFG to migrate large files into LFS pointers across your entire history.
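A minimal sketch of such a migration, assuming a BFG release that includes the --convert-to-git-lfs flag and a working Git LFS installation. The wrapper name, JAR path, and glob are placeholders, and some BFG versions may expect the command to be run from inside the repository directory rather than given a path.

```shell
# Hypothetical location of the downloaded JAR; override via the environment.
BFG_JAR="${BFG_JAR:-bfg.jar}"

# Convert files matching a name glob (e.g. '*.mp4') into Git LFS pointers.
# --no-blob-protection is added so files in HEAD are converted too.
convert_to_lfs() {
  java -jar "$BFG_JAR" --convert-to-git-lfs "$1" --no-blob-protection "$2"
}
```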
Q: How fast is it compared to filter-branch?
A: On a real repository with 120,000 commits, BFG completed in 2 minutes versus 24 hours for filter-branch.