Introduction
Git Large File Storage (LFS) solves the problem of versioning binary and large files in git. Instead of storing full file contents in the repository history, LFS replaces them with lightweight pointer files and stores the actual data on a separate server. This keeps your repository fast to clone and pull.
What Git LFS Does
- Replaces large files in your working tree with small pointer files tracked by git
- Stores actual file content on a configurable LFS server (GitHub, GitLab, or self-hosted)
- Downloads only the large files you need for your current checkout, not the entire history
- Integrates transparently with git push, git pull, and git clone
- Supports file locking to prevent merge conflicts on binary assets
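To make the pointer idea concrete, here is a minimal Python sketch of what a pointer file contains for a given blob of content. The function names make_pointer and parse_pointer are hypothetical and used only for illustration; the real work is done by the git-lfs client, not by code like this.

```python
import hashlib

# Version URL that identifies the pointer format.
LFS_SPEC_VERSION = "https://git-lfs.github.com/spec/v1"


def make_pointer(content: bytes) -> str:
    """Return the pointer text that would stand in for `content` in git."""
    oid = hashlib.sha256(content).hexdigest()
    # The version line comes first; the remaining keys follow in
    # alphabetical order, one "key value" pair per line.
    return (
        f"version {LFS_SPEC_VERSION}\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


def parse_pointer(text: str) -> dict:
    """Split pointer text into a dict of its key/value pairs."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


if __name__ == "__main__":
    # Even a multi-megabyte asset collapses to three short lines.
    print(make_pointer(b"\x00" * (5 * 1024 * 1024)), end="")
```

The pointer stays about the same size no matter how large the asset is, which is why the git history remains small.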
Architecture Overview
LFS works as a pair of git clean and smudge filters. When a tracked file is staged, the clean filter replaces its content with a pointer containing the file's SHA-256 hash and size. On checkout, the smudge filter downloads the real content from the LFS server using the Git LFS API (an HTTP-based protocol). The LFS client binary handles all server communication and local caching.
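The HTTP side of this can be sketched as follows. The snippet only builds the URL, headers, and JSON body of a batch request, which is how a client asks the server where to download or upload objects; it sends nothing. The helper name build_batch_request and the example.com endpoint are assumptions made for illustration, and authentication is left out entirely.

```python
import json
from typing import List, Tuple

# Media type used for Git LFS API requests and responses.
LFS_MEDIA_TYPE = "application/vnd.git-lfs+json"


def build_batch_request(endpoint: str,
                        objects: List[Tuple[str, int]],
                        operation: str = "download"):
    """Return (url, headers, body) for a batch request; nothing is sent."""
    if operation not in ("download", "upload"):
        raise ValueError("operation must be 'download' or 'upload'")
    url = endpoint.rstrip("/") + "/objects/batch"
    headers = {"Accept": LFS_MEDIA_TYPE, "Content-Type": LFS_MEDIA_TYPE}
    payload = {
        "operation": operation,
        "transfers": ["basic"],
        # Each object is identified by its bare SHA-256 hex digest and size.
        "objects": [{"oid": oid, "size": size} for oid, size in objects],
    }
    return url, headers, json.dumps(payload).encode("utf-8")


if __name__ == "__main__":
    # Hypothetical endpoint and object id, for illustration only.
    url, headers, body = build_batch_request(
        "https://lfs.example.com/team/repo.git/info/lfs",
        [("a" * 64, 1048576)],
    )
    print(url)
    print(body.decode("utf-8"))
```

The server's reply to such a request lists, per object, the actions (such as a download URL) the client should perform next.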
Self-Hosting & Configuration
- Install the git-lfs binary via your OS package manager or from the official releases
- Run git lfs install once to configure global git filters
- Use .gitattributes to declare which file patterns LFS should manage
- Self-host an LFS server with projects like lfs-test-server, Gitea, or GitLab built-in LFS
- Configure the LFS endpoint per-repo via .lfsconfig or git config
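As a rough illustration of the two configuration files mentioned above, the Python sketch below produces the text they would typically contain. The helper names and the example.com URL are hypothetical; in practice git lfs track writes the .gitattributes entry for you. The sketch assumes patterns without spaces.

```python
def gitattributes_line(pattern: str) -> str:
    """Attribute line that routes files matching `pattern` through LFS.

    This mirrors the entry that `git lfs track <pattern>` records.
    Assumption: `pattern` contains no whitespace.
    """
    return f"{pattern} filter=lfs diff=lfs merge=lfs -text"


def lfsconfig_text(url: str) -> str:
    """Contents of a .lfsconfig that points the repository at `url`."""
    return f"[lfs]\n\turl = {url}\n"


if __name__ == "__main__":
    # Hypothetical patterns and endpoint, for illustration only.
    for pattern in ("*.psd", "*.mp4"):
        print(gitattributes_line(pattern))
    print(lfsconfig_text("https://lfs.example.com/team/repo"), end="")
```

Committing .lfsconfig shares the endpoint with everyone who clones the repository, whereas a value set with git config stays local to one clone.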
Key Features
- Keeps repository size small even with gigabytes of binary assets
- File locking prevents two people from editing the same binary simultaneously
- Supports partial clone and sparse checkout for large monorepos
- Works with GitHub, GitLab, Bitbucket, Gitea, and custom LFS servers
- Migrate existing large files into LFS with git lfs migrate
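Migration is usually a one-off command, but it can be scripted. The sketch below assembles the argument list for git lfs migrate in either direction and leaves running it to the caller. The function name migrate_command is hypothetical, and requiring at least one pattern is a safety choice of this sketch rather than a rule of the tool for imports. Because migration rewrites commits, try it on a fresh clone first.

```python
import subprocess


def migrate_command(direction, patterns, everything=False):
    """Build the argv for `git lfs migrate import` or `export`.

    direction:  "import" (into LFS) or "export" (back to regular git).
    patterns:   file patterns to convert, e.g. ["*.psd"].
    everything: if True, rewrite all local and remote refs.
    """
    if direction not in ("import", "export"):
        raise ValueError("direction must be 'import' or 'export'")
    if not patterns:
        raise ValueError("at least one pattern is required")
    # --include accepts a comma-separated list of patterns.
    cmd = ["git", "lfs", "migrate", direction,
           "--include=" + ",".join(patterns)]
    if everything:
        cmd.append("--everything")
    return cmd


def run_in_repo(cmd, repo_dir):
    """Run the command inside `repo_dir`, raising if it fails."""
    subprocess.run(cmd, cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Printed rather than executed, since migration rewrites history.
    print(" ".join(migrate_command("import", ["*.psd", "*.mp4"])))
```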
Comparison with Similar Tools
- git-annex — decentralized large file management with more flexibility but steeper learning curve
- DVC (Data Version Control) — designed for ML pipelines with dataset versioning; LFS is simpler for general binary assets
- Perforce Helix — proprietary VCS built for large files; LFS brings similar capability to git
- Git submodules — separate repos for large assets; LFS keeps everything in one repo
FAQ
Q: Is there a file size limit? A: Git LFS itself has no hard limit. Hosted services impose their own: GitHub caps individual LFS files by plan, starting at 2 GB on its free tier. Self-hosted servers set their own limits.
Q: Does LFS increase clone time? A: It typically reduces clone time because the git history holds only pointer files. Large file content is downloaded only for the commit being checked out, not for every past revision.
Q: Can I stop tracking a file with LFS? A: Yes. Use git lfs untrack "pattern" to stop tracking new changes, then git lfs migrate export to move existing files back into regular git. Note that migrate export rewrites history.
Q: What happens if the LFS server is down? A: You can still commit locally, because commits only record pointer files. Uploading and downloading the actual content will fail until the server is reachable again.
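If content could not be downloaded, or smudging was skipped (for example with GIT_LFS_SKIP_SMUDGE=1), files in the working tree may still hold pointer text instead of real data. The following Python sketch lists such files so they can be fetched later with git lfs pull. The function names are hypothetical, and the check assumes pointer files are under 1 KiB and begin with the standard version line.

```python
from pathlib import Path

# First line of a pointer file in the standard format.
POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1\n"
# Assumption: pointer files are always smaller than this many bytes.
MAX_POINTER_SIZE = 1024


def is_lfs_pointer(path) -> bool:
    """Return True if `path` looks like an unresolved LFS pointer file."""
    path = Path(path)
    if not path.is_file() or path.stat().st_size >= MAX_POINTER_SIZE:
        return False
    with path.open("rb") as handle:
        return handle.read(len(POINTER_PREFIX)) == POINTER_PREFIX


def unresolved_pointers(root):
    """Return sorted paths under `root` that are still pointer files."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        # Skip git's own metadata directory.
        if ".git" not in p.relative_to(root).parts and is_lfs_pointer(p)
    )


if __name__ == "__main__":
    for pointer in unresolved_pointers("."):
        print(pointer)
```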