A Rust CLI tool that recursively discovers Git repositories, captures state changes, generates diffs, extracts code elements with full snippets, and produces security-focused reports for code review and audit workflows.
- Repository Discovery: Recursively scan directories for Git repos with configurable filters
- State Tracking: Capture pre/post-pull state with commit hashes, messages, and dirty detection
- Diff Generation: Automatic N vs N-1 and historical diff creation with file manifests
- Element Extraction: Parse diffs to identify functions, structs, classes, imports, and more across 10+ languages
- Code Snippets: Extract full before/after code with boundary detection and context windows
- Security Tagging: 18 built-in security patterns (crypto, auth, secrets, SQL injection, XSS, etc.)
- Multi-Format Reports: JSON, Markdown, text, and SARIF outputs with cross-repo security overview
- Branch-Diff Mode: Diff any two refs (branches, tags, commits) in a single repo, ideal for PR reviews
- Performance: Parallel processing with progress bars, LRU caching, and incremental mode
Why not just use bash?
A one-liner like

    ls | while read line; do git -C "$line" diff HEAD~1 HEAD || true; done

only shows raw diffs. DiffCatcher adds recursive discovery, code element extraction, security pattern detection, SARIF output for CI/CD, parallel processing, and cross-repo security aggregation. See the full comparison below.
- Installation
- Quick Start
- Usage
- Report Structure
- Configuration
- Architecture
- Testing
- Documentation
- Contributing

    git clone https://github.com/Teycir/DiffCatcher.git
    cd DiffCatcher
    cargo build --release
    ./target/release/diffcatcher --help

- Rust 1.70+
- Git 2.0+

    # Scan all repos in a directory (fetch-only, no modifications)
    diffcatcher ~/projects
    # Pull updates and generate security report
    diffcatcher ~/projects --pull -o ./report
    # Diff two branches in a single repo (PR review mode)
    diffcatcher ./my-repo --diff main..feature/auth -o ./pr-report
    # Generate SARIF output for GitHub Code Scanning
    diffcatcher ~/projects --summary-format sarif,json -o ./report
    # Dry run to see what would be scanned
    diffcatcher ~/projects --dry-run
    # Fast scan with 8 parallel workers
    diffcatcher ~/projects -j 8 --quiet

    # Scan with default settings (fetch-only)
    diffcatcher <ROOT_DIR>
    # Custom output directory
    diffcatcher ~/projects -o ./my-report
    # Include nested repos and follow symlinks
    diffcatcher ~/projects --nested --follow-symlinks
    # Skip hidden directories
    diffcatcher ~/projects --skip-hidden

    # Fetch only (default - no working tree changes)
    diffcatcher ~/projects
    # Actually pull changes
    diffcatcher ~/projects --pull
    # Force pull with stash/pop for dirty repos
    diffcatcher ~/projects --pull --force-pull
    # Use rebase strategy
    diffcatcher ~/projects --pull --pull-strategy rebase
    # Skip fetch/pull entirely (historical diffs only)
    diffcatcher ~/projects --no-pull

    # Skip element extraction (raw diffs only)
    diffcatcher ~/projects --no-summary-extraction
    # Extract elements but skip code snippets
    diffcatcher ~/projects --no-snippets
    # Adjust snippet context and limits
    diffcatcher ~/projects --snippet-context 10 --max-snippet-lines 300
    # Limit elements per diff
    diffcatcher ~/projects --max-elements 1000

    # Skip security tagging
    diffcatcher ~/projects --no-security-tags
    # Include test files in security analysis
    diffcatcher ~/projects --include-test-security
    # Use custom security patterns
    diffcatcher ~/projects --security-tags-file ./custom-patterns.json

DiffCatcher can auto-load project-local configuration from:
- `<ROOT_DIR>/.diffcatcher.toml` (default)
- a custom file via `--config <FILE>`
- disabled with `--no-config`
Example:

    output = "reports-local"
    no_pull = true
    history_depth = 2
    summary_formats = ["json", "txt"]
    no_security_tags = false

    [plugins]
    security_pattern_files = ["plugins/security-extra.json"]
    extractor_files = ["plugins/extractors.json"]

CLI flags still override config values when explicitly set.
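The precedence rule above (an explicitly set CLI flag wins over the config file, which wins over the built-in default) can be sketched with `Option` chaining. This is a minimal illustration, not DiffCatcher's actual code: the `PartialSettings` and `Settings` types and the `resolve` function are hypothetical, and only three keys from the example config are modeled.

```rust
/// Hypothetical subset of settings; field names mirror keys from the example config.
/// `None` means "not set at this layer".
#[derive(Debug, Default, Clone, PartialEq)]
struct PartialSettings {
    output: Option<String>,
    no_pull: Option<bool>,
    history_depth: Option<u32>,
}

/// Fully resolved settings after merging all layers.
#[derive(Debug, PartialEq)]
struct Settings {
    output: String,
    no_pull: bool,
    history_depth: u32,
}

/// Merge layers with precedence: CLI flag, then config file, then default.
fn resolve(cli: &PartialSettings, file: &PartialSettings) -> Settings {
    Settings {
        output: cli
            .output
            .clone()
            .or_else(|| file.output.clone())
            // Placeholder default; the real default directory includes a timestamp.
            .unwrap_or_else(|| "./reports/<timestamp>".to_string()),
        // Fetch-only is the documented default, so `no_pull` defaults to false.
        no_pull: cli.no_pull.or(file.no_pull).unwrap_or(false),
        // 2 is the documented default for --history-depth.
        history_depth: cli.history_depth.or(file.history_depth).unwrap_or(2),
    }
}
```

With the example config loaded and only `--history-depth 5` passed on the command line, `resolve` would keep `output = "reports-local"` and `no_pull = true` from the file and take `history_depth = 5` from the CLI.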
DiffCatcher supports two plugin types:
- Security pattern plugins via `--security-plugin-file <FILE>` (repeatable)
- Extractor plugins via `--extractor-plugin-file <FILE>` (repeatable)
Security plugin format matches --security-tags-file JSON (version, mode, tags).
Extractor plugin format:

    {
      "version": 1,
      "extractors": [
        {
          "name": "policy-rule",
          "kind": "Config",
          "regex": "^policy\\s+([A-Za-z_][A-Za-z0-9_]*)"
        }
      ]
    }

    # Diff two branches in a single repo
    diffcatcher ./my-repo --diff main..feature/auth
    # Diff specific commits
    diffcatcher ./my-repo --diff abc123..def456
    # Diff with SARIF output for CI integration
    diffcatcher ./my-repo --diff origin/main..HEAD --summary-format sarif -o ./pr-report

The `--diff BASE..HEAD` flag skips repository discovery and fetch/pull: it directly diffs two refs (branches, tags, or commit SHAs) and runs the full extraction and security tagging pipeline on the result.
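To make the `BASE..HEAD` syntax concrete, here is a small sketch of how such a spec could be split into its two refs. The function name `parse_diff_spec` is hypothetical and the validation rules are assumptions; DiffCatcher's own parsing may differ. Splitting on the first `..` is unambiguous because Git ref names cannot contain two consecutive dots.

```rust
/// Split a "BASE..HEAD" spec into (base, head).
/// Hypothetical helper for illustration only.
fn parse_diff_spec(spec: &str) -> Result<(&str, &str), String> {
    // Three-dot (merge-base) ranges are not described in the README,
    // so this sketch rejects them instead of guessing their meaning.
    if spec.contains("...") {
        return Err(format!("three-dot ranges are not covered by this sketch: {spec}"));
    }
    match spec.split_once("..") {
        // Both sides must be present; single dots inside a ref (e.g. "v1.0") are fine.
        Some((base, head)) if !base.is_empty() && !head.is_empty() => Ok((base, head)),
        _ => Err(format!("expected BASE..HEAD, got: {spec}")),
    }
}
```

Under these assumptions, `parse_diff_spec("origin/main..HEAD")` yields the pair `("origin/main", "HEAD")`, while `"main"` or `"..HEAD"` is rejected with an error message.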

    # Generate SARIF alongside other formats
    diffcatcher ~/projects --summary-format sarif,json,md
    # SARIF-only for CI/CD upload
    diffcatcher ~/projects --summary-format sarif -o ./report

When `sarif` is included in `--summary-format`, a `results.sarif` file is written to the report root. This file follows the SARIF 2.1.0 standard and integrates with GitHub Code Scanning, VS Code SARIF Viewer, Azure DevOps, and other SARIF-compatible tools.
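For readers unfamiliar with SARIF, the sketch below shows roughly what a single entry in the `results` array of a SARIF 2.1.0 run looks like. It is an illustration only: the function names are hypothetical, the mapping from a tag severity such as `High` to a SARIF `level` is an assumption, and the real `results.sarif` written by DiffCatcher may carry more fields.

```rust
/// Map a tag severity to a SARIF `level` ("error", "warning", "note", "none").
/// The mapping itself is an assumption, not DiffCatcher's documented behavior.
fn sarif_level(severity: &str) -> &'static str {
    match severity.to_ascii_lowercase().as_str() {
        "critical" | "high" => "error",
        "medium" => "warning",
        "low" => "note",
        _ => "none",
    }
}

/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render one SARIF `result` object as compact JSON.
fn sarif_result(rule_id: &str, severity: &str, message: &str, uri: &str, start_line: u32) -> String {
    format!(
        r#"{{"ruleId":"{}","level":"{}","message":{{"text":"{}"}},"locations":[{{"physicalLocation":{{"artifactLocation":{{"uri":"{}"}},"region":{{"startLine":{}}}}}}}]}}"#,
        json_escape(rule_id),
        sarif_level(severity),
        json_escape(message),
        json_escape(uri),
        start_line
    )
}
```

A production writer would more likely serialize typed structs with a JSON library; hand-built strings are used here only to keep the sketch free of external crates.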

    # Incremental mode (skip unchanged repos)
    diffcatcher ~/projects --incremental -o ./report
    # Filter by branch pattern
    diffcatcher ~/projects --branch-filter "main"
    # Adjust history depth
    diffcatcher ~/projects --history-depth 5
    # JSON output for CI/CD
    diffcatcher ~/projects --quiet --json > result.json
    # Verbose output with discovered paths
    diffcatcher ~/projects --verbose

    <report_dir>/
    ├── summary.json                      # Global summary
    ├── summary.md                        # Markdown summary
    ├── results.sarif                     # SARIF 2.1.0 output (when --summary-format sarif)
    ├── security_overview.json            # Cross-repo security aggregation
    ├── security_overview.md
    ├── <repo-name>/
    │   ├── status.json                   # Repo state
    │   ├── pull_log.txt
    │   └── diffs/
    │       ├── diff_N_vs_N-1.patch       # Raw unified diff
    │       ├── changes_N_vs_N-1.txt      # File manifest
    │       ├── summary_N_vs_N-1.json     # Element extraction
    │       ├── summary_N_vs_N-1.md
    │       └── snippets/
    │           ├── 001_validate_token_ADDED.rs
    │           ├── 002_check_permissions_BEFORE.rs
    │           ├── 002_check_permissions_AFTER.rs
    │           └── 002_check_permissions.diff
    └── ...

| Flag | Default | Description |
|---|---|---|
| `-o, --output` | `./reports/<timestamp>` | Report output directory |
| `-j, --parallel` | 4 | Concurrent repo processing |
| `-t, --timeout` | 120 | Git operation timeout (seconds) |
| `-d, --history-depth` | 2 | Historical commits to diff |
| `--snippet-context` | 5 | Context lines around changes |
| `--max-snippet-lines` | 200 | Max lines per snippet |
| `--max-elements` | 500 | Max elements per diff |
| `--diff` | (none) | Diff two refs in a single repo (`BASE..HEAD`) |
| `--summary-format` | json,md | Output formats: json, md, txt, sarif |
See diffcatcher --help for all options.
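As an illustration of how `--snippet-context` and `--max-snippet-lines` could interact, the sketch below computes which lines of a file a snippet would cover around a changed region. The function `snippet_window` is hypothetical, and the truncation policy (cut the tail when the window exceeds the limit) is an assumption; DiffCatcher also applies boundary detection, which is not modeled here.

```rust
/// Compute the inclusive, 1-based line range of a snippet around a change.
/// Hypothetical helper: `context` mirrors --snippet-context and
/// `max_lines` mirrors --max-snippet-lines.
fn snippet_window(
    change_start: usize,
    change_end: usize,
    file_len: usize,
    context: usize,
    max_lines: usize,
) -> Option<(usize, usize)> {
    // Reject inconsistent input instead of panicking on arithmetic underflow.
    if change_start == 0 || change_start > change_end || change_end > file_len || max_lines == 0 {
        return None;
    }
    // Extend by `context` lines on both sides, clamped to the file bounds.
    let start = change_start.saturating_sub(context).max(1);
    let mut end = change_end.saturating_add(context).min(file_len);
    // Assumed policy: if the window is too long, truncate from the end.
    if end - start + 1 > max_lines {
        end = start + max_lines - 1;
    }
    Some((start, end))
}
```

With the documented defaults (context 5, limit 200), a change on lines 40 to 42 of a 100-line file would produce the window 35 to 47 under these assumptions.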
Create a JSON file with custom patterns:

    {
      "version": 1,
      "mode": "extend",
      "tags": [
        {
          "tag": "pii-handling",
          "description": "PII data processing",
          "severity": "High",
          "patterns": ["ssn", "social_security", "passport"]
        }
      ]
    }

Use with `--security-tags-file ./patterns.json`
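To show what a pattern entry like the one above might do, here is a minimal sketch that tags a changed line when any pattern occurs in it. Everything here is an assumption for illustration: the `SecurityTag` struct and `matching_tags` function are hypothetical, and the README does not state whether patterns are matched as substrings or regular expressions, or whether matching is case-sensitive. The sketch uses case-insensitive substring matching.

```rust
/// Hypothetical in-memory form of one entry from the "tags" array.
#[derive(Debug, Clone)]
struct SecurityTag {
    tag: String,
    severity: String,
    patterns: Vec<String>,
}

/// Return (tag, severity) for every tag with at least one pattern found in the line.
/// Assumed semantics: case-insensitive substring match.
fn matching_tags<'a>(tags: &'a [SecurityTag], changed_line: &str) -> Vec<(&'a str, &'a str)> {
    let haystack = changed_line.to_lowercase();
    tags.iter()
        .filter(|t| {
            t.patterns
                .iter()
                .any(|p| haystack.contains(&p.to_lowercase()))
        })
        .map(|t| (t.tag.as_str(), t.severity.as_str()))
        .collect()
}
```

Plain substring matching over-matches: the pattern `ssn` also occurs inside an identifier such as `className`. A matcher used for real audits would probably need word boundaries or regular expressions to keep false positives down.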

    src/
    ├── cli.rs                  # Argument parsing
    ├── scanner.rs              # Repository discovery
    ├── git/                    # Git operations
    │   ├── commands.rs         # Git wrappers
    │   ├── state.rs            # State capture
    │   ├── diff.rs             # Diff generation
    │   └── file_retrieval.rs
    ├── extraction/             # Element extraction
    │   ├── parser.rs           # Unified diff parser
    │   ├── elements.rs         # Element detection
    │   ├── snippets.rs         # Code snippet extraction
    │   ├── boundary.rs         # Bracket/indentation tracking
    │   └── languages/          # Language-specific patterns
    ├── security/               # Security tagging
    │   ├── tagger.rs           # Pattern matching
    │   ├── patterns.rs         # Built-in patterns
    │   └── overview.rs         # Cross-repo aggregation
    └── report/                 # Report generation
        ├── writer.rs           # Directory structure
        ├── json.rs             # JSON serialization
        ├── sarif.rs            # SARIF 2.1.0 output
        ├── markdown.rs         # Markdown formatting
        └── snippet_writer.rs

A simple bash one-liner can list diffs:

    ls | while read line; do git -C "$line" diff HEAD~1 HEAD || true; done

This works for quick checks, but DiffCatcher adds significant capabilities:
| Capability | Bash One-Liner | DiffCatcher |
|---|---|---|
| Recursive discovery | Top-level items only | Nested repos, symlinks, filters |
| State tracking | None | Commit hashes, dirty detection, pull logs |
| Code understanding | Raw diff only | Extracts functions/structs/classes across 10+ languages |
| Code snippets | None | Full before/after with context windows |
| Security analysis | None | 18 built-in patterns (auth, crypto, secrets, SQLi, XSS) |
| Output formats | Terminal only | JSON, Markdown, SARIF (GitHub Code Scanning) |
| Cross-repo view | Per-repo only | Aggregated security report across all repos |
| Performance | Sequential | Parallel workers, LRU caching, incremental mode |
| CI/CD integration | None | SARIF upload to GitHub/Azure DevOps |
| Error handling | `\|\| true` | |
| Path handling | Breaks on newlines, backslashes, and leading/trailing whitespace in names | Handles all path names correctly |
| Historical context | Fixed HEAD~1 | Configurable depth, state tracking |
The bash one-liner is ~100 bytes. DiffCatcher is a security-focused audit tool with full code element extraction.
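To illustrate the "Recursive discovery" row of the comparison, the following sketch walks a directory tree and collects every directory that contains a `.git` entry. It is not the code in `src/scanner.rs`: the function `discover_repos` and its parameters are hypothetical, loosely modeled on the `--nested` and `--skip-hidden` flags, and symlinked directories are simply not followed.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Collect directories under `root` that contain a `.git` entry.
/// Hypothetical sketch: `nested` keeps descending into discovered repos,
/// `skip_hidden` ignores directories whose name starts with a dot.
fn discover_repos(root: &Path, nested: bool, skip_hidden: bool) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    // Explicit stack instead of recursion, so deep trees cannot overflow the call stack.
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        if dir.join(".git").exists() {
            found.push(dir.clone());
            if !nested {
                continue; // do not look for repos inside a repo
            }
        }
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let name = entry.file_name();
            let name = name.to_string_lossy();
            if name == ".git" || (skip_hidden && name.starts_with('.')) {
                continue;
            }
            // `file_type` does not follow symlinks, so symlinked directories are skipped.
            if entry.file_type()?.is_dir() {
                stack.push(entry.path());
            }
        }
    }
    found.sort();
    Ok(found)
}
```

Because paths are handled as `PathBuf` values rather than parsed from text, directory names containing spaces or other unusual characters need no special treatment. Note that this sketch aborts on the first unreadable directory; a real scanner would more likely log the error and continue.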

    # Run all tests
    cargo test
    # Run specific test suite
    cargo test security_tagger
    # Run with output
    cargo test -- --nocapture

Test coverage includes:
- Unit tests for extraction, security tagging, boundary detection
- Integration tests for state capture, diff generation, reports
- Golden-file tests for extraction accuracy
- Edge case tests (detached HEAD, bare repos, single-commit)

    # Compile benchmark binaries
    cargo bench --no-run
    # Run benchmark harness
    cargo bench --bench core_bench

Benchmark source lives in `benches/core_bench.rs` and tracks parser/extraction throughput.
GitHub Actions workflows are included:
- `.github/workflows/ci.yml`: format check, clippy, tests, bench build
- `.github/workflows/release.yml`: tag-based release packaging and GitHub release publishing
- Plan.md - Full specification (v1.2)
- Roadmap.md - Implementation roadmap and progress
- Security patterns reference (see `src/security/patterns.rs`)
All modules include comprehensive inline documentation. Key modules:
- `src/extraction/parser.rs` - Unified diff parser with hunk extraction
- `src/extraction/elements.rs` - Language-aware code element detection
- `src/extraction/snippets.rs` - Full code snippet extraction with boundary detection
- `src/security/tagger.rs` - Security pattern matching engine
- `src/git/commands.rs` - Git operation wrappers
Generate full API docs:

    cargo doc --open

Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure `cargo test` passes
- Submit a pull request
MIT License - see LICENSE file for details
- Author: Teycir Ben Soltane
- Email: teycir@pxdmail.net
- Website: teycirbensoltane.tn