
πŸ” GPU-Accelerated Hashcat Rule Extractor

Extract and chain Hashcat-compatible rules from wordlists using OpenCL GPU acceleration.



πŸ” Overview

This toolkit analyzes two wordlists β€” a base (source) wordlist and a target (dictionary) wordlist β€” and reverse-engineers the Hashcat rules that transform words from the base into words in the target. Rules are discovered via GPU-parallel transformation and validated for direct compatibility with Hashcat's GPU engine.

The result is a .rule file you can load directly into Hashcat (-r rules.txt), ordered by effectiveness (hit count).


πŸ“ Scripts

rulest.py β€” v1 (BFS, Legacy)

A first-generation implementation using a Breadth-First Search (BFS) chaining strategy executed on the GPU via a monolithic OpenCL kernel.

Approach:

  • Generates a static, hard-coded rule set (simple rules, T/D positional, s-substitution, Group A)
  • Chains rules across depths using temporary disk files to pass state between BFS layers
  • No rule validation against Hashcat's GPU compatibility specification
  • No Bloom filter β€” lookups performed directly against a Python set
  • Single device selection (first available platform/device)
  • No hit counting or frequency-based ranking
  • Fixed batch size; halves on MemoryError

When to use: Historical reference only. v2 is strictly superior in every dimension.


### `rulest_v2.py` — v2 (Recommended)

A complete redesign built around GPU efficiency, Hashcat compatibility, and intelligent search strategy.

Key capabilities:

- ✅ Full Hashcat GPU rule validation (max 255 ops, correct argument types)
- ✅ On-GPU Bloom filter for fast membership testing with configurable false-positive rate
- ✅ Two-phase extraction: single-rule sweep → informed chain generation
- ✅ Dynamic VRAM-aware batch and budget sizing (scales with available VRAM; baseline 8 GB)
- ✅ Hot-rule biased chain generation using Phase 1 results (60% hot-rule bias, configurable via `HOT_RULE_RATIO`)
- ✅ Seed-rule support to guide chain exploration (30% of the budget allocated to extending seeds)
- ✅ Per-depth chain budget overrides (depths 2–10)
- ✅ Unlimited result cap (no global ceiling)
- ✅ Full hit counting and frequency-ranked output
- ✅ Multi-device listing and explicit device selection by index or name substring
- ✅ Color-coded terminal output with live progress bars
- ✅ Configurable verbosity via the `VERBOSE` flag

## ⚡ Why v2 Supersedes v1

| Aspect | v1 (`rulest.py`) | v2 (`rulest_v2.py`) |
|---|---|---|
| Rule validation | None — invalid rules passed to Hashcat | Full `HashcatRuleValidator` against GPU spec (max 255 ops) |
| Rule set size | ~2,700 static rules | 5,000+ GPU-validated Hashcat single rules across 9 categories |
| Search strategy | Naive BFS — every rule applied blindly | Phase 1 single-rule sweep → Phase 2 hot-biased chain generation |
| Target lookup | Python `set` (host RAM, per-result) | 16–64 MB Bloom filter uploaded once to GPU VRAM (FNV-1a, 4 hash functions) |
| Chain state | Temp `.tmp` files on disk per depth | In-memory, GPU buffer-based with proper release and `gc.collect()` |
| Memory management | Halve batch on OOM, no VRAM awareness | Dynamic sizing based on actual free-VRAM estimate with a 55% usage safety factor |
| Hit counting | ❌ Not implemented | ✅ Full `Counter`-based frequency tracking, sorted output |
| Device selection | First platform, first device | `--list-devices`, `--device` by index or name substring |
| Seed rules | ❌ Not supported | ✅ `--seed-rules` file; seeds used to extend chains to deeper depths |
| Per-depth budget | ❌ Not supported | ✅ `--depth2-chains` through `--depth10-chains` overrides |
| Output | Unsorted, no metadata | Sorted by frequency; header with total hits and rule count |
| Rule categories | Simple, `T`/`D`, `s`, Group A | Adds `i o x * O e 3 p y Y z Z L R + - . , ' E k K { } [ ] q` |

### BFS vs. Informed Chain Generation

The core algorithmic difference matters at scale:

**v1 BFS:** every word × every rule at each depth level. At depth 2 with 2,700 rules and 100,000 base words: 270 million combinations per depth, with no prioritization. State must be written to disk between depths, creating an I/O bottleneck. Rules that never produce hits are retried at every depth.

**v2 informed generation:** Phase 1 identifies which individual rules ("hot rules") actually hit the target dictionary. Phase 2 then generates chains biased 60% toward hot rules (configurable via `HOT_RULE_RATIO`). An additional 30% of the budget extends known-good seed chains. This dramatically reduces wasted GPU cycles and finds effective multi-rule sequences far faster than exhaustive BFS.
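In code terms, the three-way split can be sketched in plain Python. The function name, parameters, and tuple-of-strings chain representation below are illustrative assumptions, not the script's actual API:

```python
import random

def generate_chains(hot_rules, all_rules, seed_chains, budget,
                    hot_ratio=0.6, ext_ratio=0.3):
    """Three-way Phase 2 budget split: hot-rule pairs, seed extensions,
    and random exploration (hypothetical helper; depth-2 chains only)."""
    chains = []
    for _ in range(int(budget * hot_ratio)):        # 60%: bias toward hot rules
        chains.append((random.choice(hot_rules), random.choice(all_rules)))
    if seed_chains:
        for _ in range(int(budget * ext_ratio)):    # 30%: extend known-good seeds
            chains.append(random.choice(seed_chains) + (random.choice(all_rules),))
    while len(chains) < budget:                     # remainder: random exploration
        chains.append((random.choice(all_rules), random.choice(all_rules)))
    return chains
```

Because the hot-rule pool is small and already proven, the bulk of GPU work lands on chains with a much higher prior probability of hitting the target dictionary than uniformly random pairs.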


## 📦 Requirements

```
Python >= 3.8
numpy
pyopencl
tqdm
```

An OpenCL-capable GPU (NVIDIA, AMD, or Intel) is required. CPU fallback via OpenCL is supported but will be slow.


## 🛠 Installation

```bash
# Clone the repository
git clone https://github.com/A113L/rulest.git
cd rulest

# Install dependencies
pip install numpy pyopencl tqdm

# Verify OpenCL is available
python -c "import pyopencl; print(pyopencl.get_platforms())"
```

Windows users: Install the appropriate OpenCL runtime for your GPU vendor. NVIDIA users typically have this via the CUDA toolkit or standard driver. AMD users should install ROCm or the AMD APP SDK.


## 🚀 Usage

### `rulest_v2.py` — Full Reference

```
usage: rulest_v2.py [options] base_wordlist target_wordlist
```

#### Positional Arguments

| Argument | Description |
|---|---|
| `base_wordlist` | Source wordlist — words to transform from |
| `target_wordlist` | Target dictionary — words to transform to |

#### Optional Arguments

| Flag | Default | Description |
|---|---|---|
| `-d`, `--max-depth` | `2` | Maximum rule chain depth (1–12; values above 12 trigger a warning) |
| `-o`, `--output` | `rulest_output.txt` | Output file path |
| `--max-chains` | unlimited | Hard cap on total chains generated |
| `--target-hours` | `0.5` | Time budget in hours; controls the chain generation budget |
| `--seed-rules` | `None` | File with known-good rules/chains to use as generation seeds |
| `--list-devices` | — | Print all available OpenCL devices and exit |
| `--device` | best GPU | Device index (e.g. `0`) or name substring (e.g. `NVIDIA`) |
| `--depth2-chains` | dynamic | Override the chain generation limit for depth 2 |
| `--depth3-chains` | dynamic | Override the chain generation limit for depth 3 |
| `--depth4-chains` through `--depth10-chains` | dynamic | Per-depth overrides up to depth 10 |
| `--bloom-mb` | dynamic | Override Bloom filter size (MB); `0` = auto-scale |
| `--sig-words` | `21` | Number of probe words used for rule deduplication via signature |
| `--min-word-len` | `10` | Minimum length of probe words used in signature computation |
| `--allow-reject-rules` | off | Include rejection rules (normally excluded as GPU-incompatible) |
| `--debug` | off | Enable verbose output (sets `VERBOSE = True` at runtime) |

### Legacy v1 Reference

```
usage: rulest.py -w WORDLIST [-b BASE_WORDLIST] [-d CHAIN_DEPTH] [--batch-size N] [-o OUTPUT] [-r RULES_FILE]
```

πŸ— Architecture (v2)

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ GPUExtractor β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚ β”‚ Rules β”‚ β”‚ Dynamic Parameters β”‚ β”‚ β”‚ β”‚ Generator │────▢│ (VRAM-aware sizing) β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚ β”‚ β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚ β”‚ GPUEngine β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚ β”‚ β”‚ β”‚ Bloom Filterβ”‚ β”‚ OpenCL Kernel β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ (16–64 MB β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ VRAM) β”‚ β”‚ β”‚find_single_rulesβ”‚ β”‚ β”‚ β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚find_rule_chains β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ Phase 1 ────────▢ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚ β”‚ β”‚ β”‚ (all words Γ— β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚ β”‚ β”‚ single rules) β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ Phase 2 ────────▢ Informed chain generation β”‚ β”‚ β”‚ β”‚ (hot-biased, + seed extension β”‚ β”‚ β”‚ β”‚ VRAM-budgeted) β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ 
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β–Ό HashcatRuleValidator β†’ GPU-safe output (.rule file) 

### Two-Phase Processing

**Phase 1 — Single-Rule Sweep.** All base words are processed against every GPU-compatible single rule in parallel. The Bloom filter (built from the entire target wordlist and uploaded once) allows near-zero-cost hit detection on-device using FNV-1a hashing with 4 independent hash functions. Results feed a `Counter` of rule → hit frequency.

**Phase 2 — Informed Chain Generation.** Using Phase 1 hit data, chains are generated with a bias toward rules that already demonstrated effectiveness:

- 60% of generated chains use hot rules from Phase 1 (`HOT_RULE_RATIO = 0.6`)
- 30% of the budget extends known-good seed chains (`EXTENSION_RATIO = 0.3`)
- 10% is allocated to random exploration

The remaining time budget (total `--target-hours` minus the Phase 1 duration) is split evenly across the requested depths. Seed rules from `--seed-rules` are extended to deeper depths automatically.
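The even per-depth split reduces to a few lines of arithmetic. The helper name below is hypothetical, and the throughput and safety defaults mirror the configuration constants documented later in this README:

```python
def phase2_depth_budgets(target_hours, phase1_seconds, max_depth,
                         combos_per_sec=120_000_000, safety=0.9):
    """Split the remaining time budget evenly across depths 2..max_depth
    and convert each slice into a chain-count budget (sketch)."""
    remaining = max(target_hours * 3600.0 - phase1_seconds, 0.0)
    depths = range(2, max_depth + 1)
    if not depths:                     # max_depth == 1: no chaining phase
        return {}
    per_depth_seconds = remaining / len(depths)
    return {d: int(per_depth_seconds * combos_per_sec * safety) for d in depths}
```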

### Bloom Filter

The on-GPU Bloom filter uses FNV-1a hashing with two seeds (`0xDEADBEEF` and `0xCAFEBABE`) and 4 hash functions, sized between 16 MB (low-VRAM devices with < 4 GB) and 64 MB (default max). Size scales logarithmically with the combined wordlist size.
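A CPU-side sketch of the same scheme follows. Deriving the 4 bit indices from the two seeded FNV-1a hashes via double hashing is an assumption about the kernel's internals; the FNV-1a constants themselves are the standard 32-bit prime with the seeds above replacing the usual offset basis:

```python
def fnv1a(data: bytes, seed: int) -> int:
    """32-bit FNV-1a with a custom seed as the offset basis."""
    h = seed & 0xFFFFFFFF
    for b in data:
        h = ((h ^ b) * 0x01000193) & 0xFFFFFFFF   # FNV-1a 32-bit prime
    return h

class BloomSketch:
    """Host-side model of the on-GPU filter (double-hashing index scheme
    is an assumption; the real filter lives in VRAM)."""
    def __init__(self, size_bits: int):
        self.size = size_bits
        self.bits = bytearray(size_bits // 8 + 1)

    def _indices(self, word: bytes):
        h1 = fnv1a(word, 0xDEADBEEF)
        h2 = fnv1a(word, 0xCAFEBABE)
        return [(h1 + i * h2) % self.size for i in range(4)]

    def add(self, word: bytes):
        for i in self._indices(word):
            self.bits[i // 8] |= 1 << (i % 8)

    def __contains__(self, word: bytes):
        return all(self.bits[i // 8] & (1 << (i % 8)) for i in self._indices(word))
```

A Bloom filter can report false positives (hence the configurable false-positive rate) but never false negatives, so every real hit survives on-device filtering.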

### VRAM Management

Free VRAM is estimated as 55% of total global memory (`VRAM_USAGE_FACTOR = 0.55`). All batch sizes, the Bloom filter allocation, and chain budgets scale proportionally based on this estimate relative to an 8 GB baseline. Devices with fewer than 4 GB cap the Bloom filter at 32 MB; the batch floor prevents starvation on very constrained hardware.
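The sizing logic reduces to something like the following (the helper name is hypothetical, and the cap thresholds follow the VRAM scaling reference table later in this README; treat this as a sketch, not the script's literal code):

```python
def vram_scale(total_vram_bytes, usage_factor=0.55, baseline_gb=8):
    """Estimate usable VRAM and derive the global scale factor plus the
    Bloom filter cap in MB (hypothetical helper)."""
    usable = total_vram_bytes * usage_factor
    baseline = baseline_gb * 1024**3 * usage_factor
    scale = min(usable / baseline, 1.0)              # 8 GB+ devices run at 1.0x
    bloom_cap_mb = 32 if total_vram_bytes < 4 * 1024**3 else 64
    return scale, bloom_cap_mb
```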


## 📚 Rule Categories

`GPUCompatibleRulesGenerator` generates rules across 9 categories, all pre-validated by `HashcatRuleValidator`:

| # | Category | Commands | Notes |
|---|---|---|---|
| 1 | Simple rules | `l u c C t r d f p z Z q E { } [ ] k K :` | No arguments |
| 2 | Position-based (single digit) | `T D L R + - . , ' z Z y Y` | Digit 0–9 |
| 3 | Position-based (two digits) | `x * O` | Two digits 0–9 each |
| 4 | Prefix / Suffix / Delete-char | `^ $ @` | Full printable ASCII (chars 32–126) |
| 5 | Substitutions | `s` | Leet-speak plus alpha→digit/punctuation cross-product |
| 6 | Insertion / Overwrite | `i o` | Positions 0–9 × printable character set |
| 7 | Extraction / Swap | `x * O` | Two-digit combos (non-equal positions for swaps) |
| 8 | Duplication | `p y Y z Z` | With digit 1–9; word/char repetition variants |
| 9 | Title case with separator | `e` | Separator-triggered title casing |

The identity rule (`:`) is always included and written first in the output for Hashcat compatibility.
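For illustration, category 5 rules take the form `s<src><dst>` (substitute every occurrence of `src` with `dst`). A minimal sketch of the leet-speak subset follows; the mapping below is an assumed sample, not the generator's full cross-product:

```python
# assumed sample of leet-speak pairs; the real generator also emits the
# much larger alpha -> digit/punctuation cross-product
LEET_PAIRS = {"a": "@", "e": "3", "i": "1", "o": "0", "s": "$"}

def leet_substitution_rules():
    """Emit Hashcat 's' rules such as 'sa@' (replace every 'a' with '@')."""
    return [f"s{src}{dst}" for src, dst in LEET_PAIRS.items()]
```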


## 🚫 GPU Command Support

The following commands are not supported by Hashcat's GPU engine and are automatically excluded during validation:

| Command(s) | Reason |
|---|---|
| `X` `4` `6` `M` | Memory operations — not available on GPU |
| `v` | Three-character command — not supported on GPU |
| `Q` | Quit rule — not GPU-compatible |
| `<` `>` `!` `/` `(` `)` `=` `%` `?` | Rejection rules — not GPU-compatible |
| `_` | Reject-if-length — not GPU-compatible |

Any rule exceeding 255 operations is also rejected, regardless of individual command validity.
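A simplified sketch of this policy is below. The real `HashcatRuleValidator` also checks per-command argument counts and types; this helper only screens the command list and the operation count:

```python
# commands excluded from Hashcat's GPU engine, per the table above
GPU_UNSUPPORTED = set("X46MvQ<>!/()=%?_")

def is_gpu_safe(rule: str, max_ops: int = 255) -> bool:
    """Return False if the rule uses a GPU-incompatible command or
    exceeds the 255-operation limit (simplified sketch)."""
    ops = [op for op in rule.split(" ") if op]   # ops are space-separated
    if len(ops) > max_ops:
        return False
    return all(op[0] not in GPU_UNSUPPORTED for op in ops)
```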


βš™οΈ Configuration Constants

These constants are defined at the top of rulest_v2.py and can be tuned for advanced use:

Constant Default Description
VERBOSE False Print per-rule validation messages and category counts; set at runtime via --debug
VRAM_USAGE_FACTOR 0.55 Fraction of device global memory to treat as free VRAM
BLOOM_HASH_FUNCTIONS 4 Number of FNV-1a hash functions in Bloom filter
BLOOM_FILTER_MAX_MB 256 Maximum Bloom filter allocation (MB); override at runtime with --bloom-mb
HOT_RULE_RATIO 0.6 Fraction of Phase 2 chains biased toward hot rules
EXTENSION_RATIO 0.3 Fraction of Phase 2 budget allocated to seed extension
TIME_SAFETY_FACTOR 0.9 Multiplier applied to time-budget combo estimates
MAX_GPU_RULES 255 Maximum operations allowed per rule chain
BASELINE_COMBOS_PER_SEC 120,000,000 Estimated throughput on a capable GPU
LOW_END_COMBOS_PER_SEC 40,000,000 Throughput fallback for devices with < 20 compute units
MAX_WORD_LEN 256 Maximum word length accepted from wordlists
MAX_RULE_LEN 16 Maximum single rule string length in GPU buffers
MAX_OUTPUT_LEN 512 Maximum transformed word output length in GPU buffers
MAX_CHAIN_STRING_LEN 128 Maximum chained rule string length in GPU buffers
MAX_HASHCAT_CHAIN 31 Maximum number of rules in a single Hashcat chain

## 📄 Output Format

`rulest_output.txt` (or your specified `-o` path):

```
# Generated by GPU rules engine (full kernel)
# Total unique rules: 4821
# Total hits: 2193047
:
c
$1
u
l
$1 c
$! sa@ $0
...
```

- The identity rule (`:`) is always written first for Hashcat compatibility
- Rules are sorted by hit frequency (descending), then alphabetically
- All rules are guaranteed GPU-valid (max 255 ops, correct argument syntax)
- Encoding is latin-1 to preserve byte-level fidelity with Hashcat's expected input

## 🎛 Performance Tuning

| Goal | Recommendation |
|---|---|
| Maximize coverage in fixed time | Increase `--target-hours` |
| Reduce VRAM pressure | Lower `--max-chains` or use `--depth2-chains` / `--depth3-chains` |
| Force deep chain exploration | Set `--depth4-chains 50000 --depth5-chains 10000` explicitly |
| Use a specific GPU | `--device 1` or `--device "RTX 4090"` |
| Bootstrap from prior results | Pass previous output to `--seed-rules` for iterative refinement |
| Limit total combinations | `--max-chains 500000` to cap generation before scaling |
| Reduce terminal noise | Set `VERBOSE = False` in the script header or omit `--debug` |
| Increase hot-rule aggressiveness | Raise `HOT_RULE_RATIO` toward 1.0 (reduces random exploration) |

### VRAM Scaling Reference

| Available VRAM | Scale Factor | Bloom Filter Cap |
|---|---|---|
| < 4 GB | 0.25–0.5× | 32 MB |
| 4–8 GB | 0.5–1.0× | 64 MB |
| 8 GB+ | 1.0× (full) | 64 MB |

## 💡 Examples

Basic single-depth extraction:

```bash
python rulest_v2.py rockyou.txt target_hashes_plain.txt -d 1 -o single_rules.txt
```

Deep chain search with a 2-hour budget:

```bash
python rulest_v2.py rockyou.txt target.txt -d 4 --target-hours 2.0 -o chains_deep.txt
```

Use a specific GPU and seed from a previous run:

```bash
python rulest_v2.py base.txt target.txt \
    --device "RTX 3080" \
    --seed-rules single_rules.txt \
    -d 3 --target-hours 1.0 \
    -o refined_chains.txt
```

List available OpenCL devices:

```bash
python rulest_v2.py --list-devices
```

Override chain budgets for specific depths:

```bash
python rulest_v2.py base.txt target.txt -d 5 \
    --depth2-chains 200000 \
    --depth3-chains 100000 \
    --depth4-chains 30000 \
    --depth5-chains 5000 \
    -o custom_budget.txt
```

Iterative refinement workflow:

```bash
# Pass 1 — fast sweep for single rules
python rulest_v2.py rockyou.txt target.txt -d 1 --target-hours 0.25 -o pass1.txt

# Pass 2 — chain from pass 1 results
python rulest_v2.py rockyou.txt target.txt -d 3 --target-hours 1.0 \
    --seed-rules pass1.txt -o pass2.txt

# Pass 3 — deep dive seeded from pass 2
python rulest_v2.py rockyou.txt target.txt -d 5 --target-hours 4.0 \
    --seed-rules pass2.txt -o pass3_final.txt
```

πŸ“ License

MIT

Credits

https://github.com/synacktiv/rulesfinder
