Introduction
Czkawka (Polish for "hiccup") is a cross-platform utility that finds wasted disk space and problematic files. It provides both a GTK4 GUI and a CLI, and scans for duplicates, empty files, similar images, broken symlinks, and more. Its multi-threaded Rust implementation is the main reason it is typically faster than comparable tools.
What Czkawka Does
- Detects duplicate files using hash-based comparison with size pre-filtering for speed
- Finds visually similar images using perceptual hashing algorithms
- Identifies empty files, empty directories, and temporary files for cleanup
- Locates broken symbolic links and files whose extension does not match their content
- Scans for large files consuming disproportionate storage
Architecture Overview
Czkawka is built in Rust using a multi-threaded scanning pipeline. Files are first filtered by size, then grouped by partial hashes, and finally confirmed via full hash comparison. The image similarity module uses perceptual hash algorithms (Gradient, Mean, DCT) to detect near-duplicate photos. The project ships as separate crates: a core library, CLI binary, and GTK4 GUI application.
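The three-stage pipeline described above can be sketched in a few lines. This is a minimal illustration in Python, not Czkawka's actual Rust code: the function names, the use of SHA-256, and the 4096-byte prefix length are all assumptions made for the example.

```python
import hashlib
import os
from collections import defaultdict

PARTIAL_BYTES = 4096  # assumed prefix length for the cheap partial hash


def _hash_file(path, limit=None):
    """Return a SHA-256 hex digest of the file, or of its first `limit` bytes."""
    digest = hashlib.sha256()
    remaining = limit
    with open(path, "rb") as handle:
        while True:
            chunk_size = 65536 if remaining is None else min(65536, remaining)
            if chunk_size == 0:
                break  # the requested prefix has been read completely
            chunk = handle.read(chunk_size)
            if not chunk:
                break  # end of file
            digest.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return digest.hexdigest()


def _regroup(groups, key_func):
    """Split each candidate group by key_func, keeping only groups of 2+ files."""
    result = []
    for group in groups:
        buckets = defaultdict(list)
        for path in group:
            buckets[key_func(path)].append(path)
        result.extend(sorted(b) for b in buckets.values() if len(b) > 1)
    return result


def find_duplicates(paths):
    """Return sorted groups of paths whose full-content hashes match."""
    groups = [list(paths)]
    # Stage 1: files of different sizes cannot be duplicates (no reads needed).
    groups = _regroup(groups, os.path.getsize)
    # Stage 2: hash only the first bytes to discard most non-duplicates cheaply.
    groups = _regroup(groups, lambda p: _hash_file(p, PARTIAL_BYTES))
    # Stage 3: confirm the remaining candidates with a hash of the whole file.
    groups = _regroup(groups, _hash_file)
    return sorted(groups)
```

Each stage only reads what the previous stage could not rule out, so most files are eliminated by a size lookup or a single small read before any full-file hashing happens.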
Self-Hosting & Configuration
- Install via package managers: `sudo apt install czkawka` on Debian/Ubuntu, `flatpak install czkawka` elsewhere
- CLI supports scripting with machine-parseable output for automated cleanup pipelines
- Exclusion patterns let you skip directories like `.git`, `node_modules`, or system paths
- Reference directories allow marking certain folders as read-only sources for comparison
- Results can be exported to files for review before deletion
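To make the exclusion and reference-directory ideas concrete, here is a small Python sketch of how such rules could be applied to scan results. It is an illustration only: the helper names (`is_excluded`, `in_reference_dir`, `removable`) and the rule of keeping the first file when a group has no reference copy are hypothetical, and Czkawka's own matching logic may differ.

```python
from pathlib import PurePosixPath

EXCLUDED_NAMES = {".git", "node_modules"}  # example names taken from the list above


def is_excluded(path, excluded_names=EXCLUDED_NAMES):
    """True if any component of the path is an excluded directory name."""
    return any(part in excluded_names for part in PurePosixPath(path).parts)


def in_reference_dir(path, reference_dirs):
    """True if the path lies inside one of the read-only reference directories."""
    parents = PurePosixPath(path).parents
    return any(PurePosixPath(ref) in parents for ref in reference_dirs)


def removable(group, reference_dirs):
    """From one duplicate group, return the files that may be removed.

    Files inside reference directories are never returned. If the group has
    no reference file, the first path is kept so that one copy always survives
    (an assumption of this sketch).
    """
    protected = [p for p in group if in_reference_dir(p, reference_dirs)]
    others = [p for p in group if not in_reference_dir(p, reference_dirs)]
    return others if protected else others[1:]
```

For example, with `/home/u/originals` as a reference directory, a group containing `/home/u/originals/a.jpg` and `/home/u/copy/a.jpg` yields only the copy as removable.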
Key Features
- Written in Rust with multi-threaded scanning, which lets it use all CPU cores and typically outpace single-threaded tools
- Cross-platform support for Linux, macOS, and Windows with both GUI and CLI interfaces
- Perceptual image similarity detection finds near-duplicates even with different resolutions or formats
- Minimal dependencies and low resource usage compared to tools like FSlint or dupeGuru
- Active development with regular releases and community contributions
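The reason perceptual hashing can match images across resolutions is easiest to see with the simplest variant, a mean hash. The sketch below is a pure-Python illustration under stated assumptions, not Czkawka's implementation: images are plain lists of grayscale rows, dimensions are multiples of the hash size, and the default threshold of 5 differing bits is an arbitrary example value.

```python
def mean_hash(pixels, size=8):
    """Return a size*size-bit mean hash of a grayscale image.

    `pixels` is a list of rows of brightness values. Width and height are
    assumed to be multiples of `size` so block averaging divides evenly.
    """
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // size, width // size
    cells = []
    for row in range(size):
        for col in range(size):
            # Shrink the image by averaging each block into one cell.
            total = sum(
                pixels[y][x]
                for y in range(row * block_h, (row + 1) * block_h)
                for x in range(col * block_w, (col + 1) * block_w)
            )
            cells.append(total / (block_h * block_w))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # One bit per cell: is it brighter than the average cell?
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_similar(a, b, threshold=5):
    """Treat two hashes as similar if at most `threshold` bits differ."""
    return hamming_distance(a, b) <= threshold
```

Because the image is reduced to a small grid before hashing, a photo and its upscaled copy produce the same or nearly the same bits, while the threshold controls how many differing bits still count as a match.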
Comparison with Similar Tools
- fdupes: CLI-only, single-threaded, lacks image similarity; Czkawka is multi-threaded with more scan modes
- dupeGuru: Python-based GUI tool that is slower and heavier; Czkawka offers native performance
- rmlint: fast C-based deduplicator that is primarily command-line driven and has no image similarity features
- FSlint: unmaintained Python/GTK2 tool; Czkawka is a modern, actively maintained successor
FAQ
Q: Does Czkawka permanently delete files immediately? A: No. By default it moves files to the system trash. You can also choose to delete files permanently or to replace duplicates with hardlinks instead.
Q: How does image similarity work? A: Czkawka generates perceptual hashes of images and compares them. You can adjust the similarity threshold to control how strict matching is.
Q: Can I run Czkawka as part of an automated cleanup script? A: Yes. The CLI version supports all scan modes with flags for output format and automatic deletion, suitable for cron jobs.
Q: Is Czkawka safe to use on system directories? A: Use exclusion lists to protect critical system paths. The tool modifies files only when you explicitly request an action, such as confirming a deletion in the GUI or passing a deletion flag to the CLI.
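The hardlink option mentioned in the FAQ keeps both paths but stores the data only once. A minimal Python sketch of the general technique follows; the function name and the temporary suffix are hypothetical, and the code assumes the two files were already confirmed identical and live on the same filesystem.

```python
import os


def replace_with_hardlink(original, duplicate):
    """Replace `duplicate` with a hard link to `original`.

    The link is created under a temporary name first and then moved over the
    duplicate, so the duplicate path is never missing if linking fails.
    """
    temp_path = duplicate + ".hardlink-tmp"  # hypothetical temporary suffix
    os.link(original, temp_path)
    try:
        os.replace(temp_path, duplicate)
    except OSError:
        os.remove(temp_path)  # clean up the temporary link before re-raising
        raise
```

After the call, both paths refer to the same data on disk, so editing one changes the other. That is why this option suits read-mostly files better than documents still being edited.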