ZeroWidthStego: Invisible data. Unforgettable power. A full-spectrum zero-width steganography engine for cyber defense, CTF teams, and digital minimalists.

ridpath/ZeroWidthStego

ZeroWidthStego

Invisible Unicode Steganography Toolkit

Python 3.7+ | MIT License | Platform: Cross-Platform | Covert Data Channel

Python Compatibility:

  • Requires Python 3.7+
  • License: MIT

Overview: ZeroWidthStego is a modular steganography and detection tool written in Python. It focuses on the abuse and analysis of zero-width Unicode characters and homoglyph-based text obfuscation, and it supports multiple encoding schemes for covert data embedding and extraction.

This tool provides:

  • Encoding and decoding logic
  • Auto-detection of steganographic schemes
  • Static and semantic analysis
  • High-density data hiding techniques
  • Integration into CTFs, CI/CD pipelines, and source code workflows

Built For:

  • Security researchers and threat analysts
  • CTF players and tool developers
  • Red and blue teams
  • Source code integrity auditors

Key Features

  • 12+ Encoding Schemes: Binary, Quaternary, Octal, UTF-8, UTF-16, Directional, Homoglyph, and more
  • ZWSP Spacing Mode with full decode support
  • Homoglyph substitution with carrier-aware injection
  • AES Encryption Layer (optional symmetric encryption)
  • Threshold-based Encoding/Decoding
  • Advanced Brute-Force Mode with base switching and heuristics
  • Static and Dynamic Stego Detection Engine
  • Entropy & Flag Pattern Analysis (flag{}, picoctf{}, etc.)
  • Carrier file injection and extraction with cleanup control

Installation

    git clone https://github.com/ridpath/ZeroWidthStego.git
    cd ZeroWidthStego
    # No dependencies required - pure Python 3

Basic Usage

Encoding

    python zero_width_tool.py encode "secret message" -o hidden.txt
    python zero_width_tool.py encode -i secret.txt -o hidden.txt
    python zero_width_tool.py encode "flag{hidden}" -o output.txt --carrier normal_text.txt

Decoding

    python zero_width_tool.py decode -i hidden.txt
    python zero_width_tool.py decode -i hidden.txt -o decoded.txt
    python zero_width_tool.py decode -i hidden.txt --scheme simple_8bit

Analysis

    python zero_width_tool.py analyze -i suspicious.txt
    python zero_width_tool.py analyze -i document.txt

Advanced Usage

Different Encoding Schemes

    python zero_width_tool.py encode "message" -o output.txt --scheme simple_8bit
    python zero_width_tool.py encode "message" -o output.txt --scheme basic_utf16
    python zero_width_tool.py encode "message" -o output.txt --scheme quaternary_utf8
    python zero_width_tool.py encode "secret" -o output.txt --scheme homoglyph_binary_utf8 --carrier carrier.txt
    python zero_width_tool.py encode "message" -o output.txt --scheme zwsp_spacing --carrier carrier.txt

Pipeline Examples

    echo "secret message" \
      | python zero_width_tool.py encode -i - --scheme simple_8bit \
      | python zero_width_tool.py decode -i -

    for file in *.txt; do
        echo "=== $file ==="
        python zero_width_tool.py analyze -i "$file"
    done

Common Use Cases

CTF Challenges:

    python zero_width_tool.py decode -i challenge.txt --scheme basic_utf16
    python zero_width_tool.py decode -i challenge.txt --scheme quaternary_utf8

Security Analysis:

find . -name "*.txt" -exec python zero_width_tool.py analyze -i {} \;

Steganography Operations:

python zero_width_tool.py decode -i memo.txt -o extracted.txt

Encoding Scheme Table

| Scheme Name             | Bits/Symbol | Description                                        |
|-------------------------|-------------|----------------------------------------------------|
| simple_8bit             | 1           | ZWSP=0, ZWNJ=1 (exact decoder logic)               |
| basic_utf8              | 1           | Standard UTF-8 encoding with ZWSP/ZWNJ             |
| basic_utf16             | 1           | UTF-16 with 16-bit encoding units                  |
| quaternary_utf8         | 2           | 2 bits per symbol using ZWSP, ZWNJ, ZWJ, WJ        |
| octal_utf8              | 3           | 3 bits per symbol using 8 zero-width characters    |
| binary_directional_utf8 | 1           | Directional marks: LRM=0, RLM=1                    |
| homoglyph_binary_utf8   | 1           | Homoglyph substitution (Cyrillic & Latin)          |
| zwsp_spacing            | Pattern     | Letters separated by ZWSP                          |
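To make the 2-bit row concrete, here is a minimal sketch of a quaternary encoder and decoder. The symbol order (ZWSP=0, ZWNJ=1, ZWJ=2, WJ=3) and the most-significant-pair-first byte layout are assumptions made for illustration; the tool's actual `quaternary_utf8` mapping may differ.

```python
# Assumed symbol order; the real quaternary_utf8 mapping may differ.
SYMBOLS = ["\u200b", "\u200c", "\u200d", "\u2060"]  # ZWSP, ZWNJ, ZWJ, WJ
INDEX = {ch: i for i, ch in enumerate(SYMBOLS)}


def encode_quaternary(message: str) -> str:
    """Emit four invisible symbols per UTF-8 byte, two bits per symbol."""
    out = []
    for byte in message.encode("utf-8"):
        for shift in (6, 4, 2, 0):
            out.append(SYMBOLS[(byte >> shift) & 0b11])
    return "".join(out)


def decode_quaternary(text: str) -> str:
    """Ignore visible characters, then rebuild bytes from groups of four symbols."""
    digits = [INDEX[ch] for ch in text if ch in INDEX]
    usable = len(digits) - len(digits) % 4  # drop an incomplete trailing group
    data = bytearray()
    for i in range(0, usable, 4):
        a, b, c, d = digits[i:i + 4]
        data.append((a << 6) | (b << 4) | (c << 2) | d)
    return data.decode("utf-8", errors="replace")


payload = encode_quaternary("flag{hidden}")
print(len(payload))  # 48: four symbols for each of the 12 bytes
print(decode_quaternary("cover" + payload + " text"))  # flag{hidden}
```

Compared with a 1-bit scheme, this halves the number of invisible characters needed for the same payload, at the cost of using two additional code points that a filter might flag.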

MITRE ATT&CK Mapping

| Adversary Behavior           | Category                | Mapping ID |
|------------------------------|-------------------------|------------|
| Invisible Covert Channel     | Command & Control       | T1001.002  |
| Payload Injection in Text    | Defense Evasion         | T1027      |
| Source Code Backdooring      | Supply Chain Compromise | T1195      |
| Data Exfiltration via Unicode| Exfiltration            | T1048      |
| C2 Signaling in Comments     | Defense Evasion         | T1564.004  |
| Content Obfuscation          | Impact                  | T1498      |

Troubleshooting

    python zerowidthstego.py decode -i file.txt --scheme basic_utf16
    python zerowidthstego.py decode -i file.txt --scheme zwsp_spacing
    python zerowidthstego.py decode -i file.txt --force
    python zerowidthstego.py analyze -i file.txt

Quick Reference Card

    python zerowidthstego.py encode "msg" -o hidden.txt
    python zerowidthstego.py decode -i hidden.txt
    python zerowidthstego.py analyze -i file.txt
    python zerowidthstego.py encode "msg" -o out.txt --carrier text.txt --scheme homoglyph_binary_utf8
    python zerowidthstego.py decode -i file.txt --scheme basic_utf16 --force

Developer Integration

Python:

    from zwstego import ZeroWidthEncoder, EncodingScheme

    encoder = ZeroWidthEncoder(EncodingScheme.SIMPLE_8BIT)
    stego = encoder.encode("flag{hidden_data}")
    decoded = encoder.decode(stego)

To add a scheme, extend the SCHEMES dictionary and the EncodingScheme enum.


Zero Width Steganography

Summary: Zero-width character steganography is a covert encoding method that uses invisible Unicode control characters. These characters are invisible to human readers and are not rendered by most editors, yet they persist through storage, transmission, and copy/paste.
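As a rough illustration of the idea, the sketch below hides a UTF-8 message as a run of ZWSP (bit 0) and ZWNJ (bit 1) characters inside a visible carrier string. The bit mapping follows the `simple_8bit` row of the scheme table; the function names and the choice to place the payload after the first carrier character are assumptions for this example, not the tool's actual implementation.

```python
ZWSP = "\u200b"  # zero width space      -> bit 0
ZWNJ = "\u200c"  # zero width non-joiner -> bit 1


def encode_bits(message: str) -> str:
    """Turn each UTF-8 byte of the message into eight invisible characters."""
    bits = "".join(f"{byte:08b}" for byte in message.encode("utf-8"))
    return "".join(ZWNJ if bit == "1" else ZWSP for bit in bits)


def hide(carrier: str, message: str) -> str:
    """Insert the invisible payload after the first character of the carrier."""
    return carrier[:1] + encode_bits(message) + carrier[1:]


def reveal(text: str) -> str:
    """Collect ZWSP/ZWNJ characters, rebuild the bytes, and decode them."""
    bits = "".join("1" if ch == ZWNJ else "0" for ch in text if ch in (ZWSP, ZWNJ))
    usable = len(bits) - len(bits) % 8  # ignore an incomplete trailing byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")


carrier = "Hello, world!"
stego = hide(carrier, "flag{hidden}")
print(len(carrier), len(stego))  # 13 109
print(reveal(stego))             # flag{hidden}
```

The stego string renders like the carrier in most viewers, but its length gives it away: 12 payload bytes add 96 invisible characters.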

Key Properties:

  • Invisible in most platforms and applications
  • Persists through copy/paste and through re-encoding between Unicode encodings
  • Bypasses naive filters that inspect only visible text
  • Typically undetected by antivirus tools and by SIEM/DLP products

Encoding Techniques:

  • Binary (1-bit) using ZWSP/ZWNJ
  • Quaternary (2-bit) schemes that add ZWJ and WJ
  • Octal (3-bit) using eight zero-width characters
  • Homoglyph substitution
  • Compressed binary → zero-width hybrid
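Homoglyph substitution works differently from the zero-width techniques because it changes visible characters rather than inserting invisible ones. The following is a minimal sketch under stated assumptions: a small hand-picked Latin-to-Cyrillic map, Latin meaning bit 0 and the Cyrillic look-alike meaning bit 1, and a carrier that contains no Cyrillic text to begin with. The names `embed` and `extract` are hypothetical and do not reflect the tool's `homoglyph_binary_utf8` internals.

```python
# Latin letter -> visually similar Cyrillic letter (illustrative subset only).
HOMOGLYPHS = {"a": "\u0430", "c": "\u0441", "e": "\u0435", "o": "\u043e", "p": "\u0440"}
REVERSE = {cyr: lat for lat, cyr in HOMOGLYPHS.items()}


def embed(carrier: str, bits: str) -> str:
    """Write one bit per substitutable letter: keep Latin for 0, swap for 1."""
    capacity = sum(ch in HOMOGLYPHS for ch in carrier)
    if len(bits) > capacity:
        raise ValueError(f"carrier holds {capacity} bits, need {len(bits)}")
    queue = iter(bits)
    out = []
    for ch in carrier:
        if ch in HOMOGLYPHS:
            bit = next(queue, "0")  # unused slots are padded with 0
            out.append(HOMOGLYPHS[ch] if bit == "1" else ch)
        else:
            out.append(ch)
    return "".join(out)


def extract(text: str) -> str:
    """Read one bit from every slot that is either a Latin or a Cyrillic form."""
    return "".join(
        "1" if ch in REVERSE else "0"
        for ch in text
        if ch in HOMOGLYPHS or ch in REVERSE
    )


carrier = "a copper ocean scope"
stego = embed(carrier, "1011")
print(len(stego) == len(carrier))  # True: length is unchanged
print(extract(stego)[:4])          # 1011
```

Capacity is bounded by the number of substitutable letters in the carrier (14 in this example), which is why a carrier-aware encoder has to check the carrier before embedding.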

Insertion Strategies:

  • HTML/JS comments
  • JSON/YAML/Markdown
  • Software documentation & Git commits
  • Natural language stego in text

Malicious Use:

  • Supply chain attacks
  • Payload concealment & exfiltration
  • Hidden C2 signaling
  • Code integrity compromise

Legitimate Use:

  • Watermarking and attribution
  • Anti-scraping defense
  • Secure provenance verification

Detection:

  • Regex: [\u200B-\u200F\u202A-\u202E\u2060\uFEFF] (covers ZWSP, ZWNJ, ZWJ, LRM, RLM, the bidirectional embedding and override controls, WJ, and BOM)
  • Display control characters in editors
  • Printable vs. invisible entropy analysis
  • Hunt for CTF markers: flag{}, ctf{}, picoctf{}
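The detection ideas above can be combined into a small scanner. This sketch uses a character class spanning U+200B to U+200F (so LRM and RLM are included), U+202A to U+202E, U+2060, and U+FEFF. It reports where invisible characters occur, computes the Shannon entropy of the invisible stream as a crude signal, and looks for CTF-style markers in the visible text. The `scan` function and its report keys are hypothetical; they are not the tool's `analyze` output.

```python
import math
import re
import unicodedata
from collections import Counter

INVISIBLE = re.compile("[\u200b-\u200f\u202a-\u202e\u2060\ufeff]")
FLAG = re.compile(r"(?:flag|ctf|picoctf)\{[^}]*\}", re.IGNORECASE)


def shannon_entropy(chars: str) -> float:
    """Entropy in bits per symbol of the given character stream."""
    if not chars:
        return 0.0
    total = len(chars)
    return -sum(n / total * math.log2(n / total) for n in Counter(chars).values())


def scan(text: str) -> dict:
    """Summarize invisible characters and flag-like patterns found in text."""
    hits = [
        (m.start(), f"U+{ord(m.group()):04X}", unicodedata.name(m.group(), "UNKNOWN"))
        for m in INVISIBLE.finditer(text)
    ]
    invisible = "".join(INVISIBLE.findall(text))
    visible = INVISIBLE.sub("", text)
    return {
        "invisible_count": len(hits),
        "invisible_ratio": len(hits) / len(text) if text else 0.0,
        "invisible_entropy": shannon_entropy(invisible),
        "first_hits": hits[:5],
        "flags": FLAG.findall(visible),
    }


sample = "No\u200bthing to see\u200c here, flag{demo}"
report = scan(sample)
print(report["invisible_count"])  # 2
print(report["first_hits"][0])    # (2, 'U+200B', 'ZERO WIDTH SPACE')
print(report["flags"])            # ['flag{demo}']
```

A single BOM at the start of a file or a ZWJ inside an emoji sequence is usually benign, so a count or ratio threshold tends to produce fewer false positives than flagging any match.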

Countermeasures:

  • Strip zero width characters by default
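  • Unicode normalization (NFKC/NFKD) to fold compatibility characters; normalization alone does not remove zero-width characters, so pair it with explicit stripping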
  • CI checks for supply chain integrity
  • SOC awareness and incident playbooks
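A minimal sanitizer for the first two countermeasures might look like the sketch below. The set of code points to delete is an assumption chosen for this example (zero-width characters, directional marks, bidirectional controls, word joiner, and BOM); adjust it to your own policy.

```python
import unicodedata

# Code points to delete; translate() removes every key that maps to None.
STRIP = dict.fromkeys(map(ord, (
    "\u200b\u200c\u200d"            # ZWSP, ZWNJ, ZWJ
    "\u200e\u200f"                  # LRM, RLM
    "\u202a\u202b\u202c\u202d\u202e"  # bidirectional embeddings and overrides
    "\u2060\ufeff"                  # word joiner, BOM / zero width no-break space
)))


def sanitize(text: str) -> str:
    """Apply NFKC, then delete zero-width and directional control characters."""
    return unicodedata.normalize("NFKC", text).translate(STRIP)


dirty = "pass\u200bword\u200c"
print(unicodedata.normalize("NFKC", dirty) == dirty)  # True: NFKC leaves them in place
print(sanitize(dirty))                                # password
```

Blanket stripping is a trade-off: ZWJ and ZWNJ carry meaning in emoji sequences and in scripts such as Persian and several Indic scripts, so content that legitimately uses them may need an allowlist instead.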

License

MIT License. Attribution appreciated.


Acknowledgments

Based on:

  • Steganography and Unicode security research
  • CTF adversarial tradecraft
  • Red team stealth and evasion methodologies
  • Industry cases of text-based payload delivery

‌‌‌‌‌‌‌‌‌​‌​​‌​‌‌‌‌‌‌‌‌‌‌​​‌‌​‌​‌‌‌‌‌‌‌‌‌​​​‌‌​‌‌‌‌‌‌‌‌‌‌​​‌​​​​‌‌‌‌‌‌‌‌‌‌​‌‌‌‌‌‌‌‌‌‌‌‌‌‌​‌​‌​​​‌‌‌‌‌‌‌‌‌​​‌​‌‌​‌‌‌‌‌‌‌‌‌​​‌‌​‌‌‌‌‌‌‌‌‌‌‌​​​‌​‌‌‌‌‌‌‌‌‌‌‌​​‌​‌‌‌‌‌‌‌‌‌‌‌‌‌​‌‌‌‌‌‌‌‌‌‌‌‌‌‌​‌​‌‌​​‌‌‌‌‌‌‌‌‌​​​‌​‌‌‌‌‌‌‌‌‌‌‌​​‌‌​‌​‌‌‌‌‌‌‌‌‌​​‌‌​​​‌‌‌‌‌‌‌‌‌​​‌​​​​‌‌‌‌‌‌‌‌‌‌‌‌​‌​‌‌‌‌‌‌‌‌‌‌​‌​‌​‌‌‌‌‌‌‌‌‌‌‌​​‌​‌‌‌‌‌‌‌‌‌‌‌‌​​‌‌​‌​‌‌‌‌‌‌‌‌‌‌​‌‌‌‌‌‌‌‌‌‌‌‌‌‌​​​‌​​​‌‌‌‌‌‌‌‌‌​​‌​​​​‌‌‌‌‌‌‌‌‌​​​‌‌​‌‌‌‌‌‌‌‌‌‌​​‌‌​‌‌‌‌‌‌‌‌‌‌‌​​​‌‌​​‌‌‌‌‌‌‌‌‌‌​‌‌‌‌‌‌‌‌‌‌‌‌‌‌​​‌‌‌​‌‌‌‌‌‌‌‌‌‌​​‌‌​‌​‌‌‌‌‌‌‌‌‌​​​‌​‌‌‌‌‌‌‌‌‌‌‌​​​‌​​​‌‌‌‌‌‌‌‌‌​​‌‌​‌​‌‌‌‌‌‌‌‌‌​​‌‌​‌​‌‌‌‌‌‌‌‌‌​​‌​​​‌‌‌‌‌‌‌‌‌‌‌​‌‌‌‌‌‌‌‌‌‌‌‌‌‌​​​‌​​​‌‌‌‌‌‌‌‌‌​​‌​​​​‌‌‌‌‌‌‌‌‌​​​‌‌​‌‌‌‌‌‌‌‌‌‌​​‌‌​‌‌‌‌‌‌‌‌‌‌‌​​​‌‌​​‌‌‌‌‌‌‌‌‌‌​‌​​​‌‌‌‌‌‌‌‌‌‌‌‌‌​‌​‌‌‌‌‌‌‌‌‌‌​‌​‌​‌‌‌‌‌‌‌‌‌‌‌​​‌​‌‌‌‌‌‌‌‌‌‌‌‌​​‌‌​‌​‌‌‌‌‌‌‌‌‌‌​‌‌‌‌‌‌‌‌‌‌‌‌‌‌​​​‌‌​​‌‌‌‌‌‌‌‌‌​​‌‌​‌​‌‌‌‌‌‌‌‌‌​​‌‌​‌​‌‌‌‌‌‌‌‌‌​​​‌‌​‌‌‌‌‌‌‌‌‌‌‌​‌‌​​​‌‌‌‌‌‌‌‌‌​​​‌‌​​‌‌‌‌‌‌‌‌‌‌​‌‌‌‌‌‌‌‌‌‌‌‌‌‌​​​‌‌​​‌‌‌‌‌‌‌‌‌​​‌​‌‌​‌‌‌‌‌‌‌‌‌​​‌‌​​​‌‌‌‌‌‌‌‌‌​​‌​​​‌‌‌‌‌‌‌‌‌‌​​‌‌‌‌​‌‌‌‌‌‌‌‌‌​​​‌​‌‌‌‌‌‌‌‌‌‌‌​​​‌​‌​‌‌‌‌‌‌‌‌‌​​​‌‌​‌‌‌‌‌‌‌‌‌‌​​‌‌​‌​‌‌‌‌‌‌‌‌‌‌​‌‌‌‌‌‌‌‌‌‌‌‌‌‌​​‌​​‌‌‌‌‌‌‌‌‌‌‌​​‌​‌‌​‌‌‌‌‌‌‌‌‌​​​‌​​‌‌‌‌‌‌‌‌‌‌​​‌‌​‌​‌‌‌‌‌‌‌‌‌​​​‌‌​​‌‌‌‌‌‌‌‌‌‌​‌‌‌‌‌‌‌‌‌‌‌‌‌‌​​‌​‌‌‌‌‌‌‌‌‌‌‌‌​​‌‌​‌​‌‌‌‌‌‌‌‌‌​​​‌‌​‌‌‌‌‌‌‌‌‌‌​​‌‌​‌​‌‌‌‌‌‌‌‌‌‌​‌​​​‌‌‌‌‌‌‌‌‌‌‌‌‌​‌​‌‌‌‌‌‌‌‌‌‌​​‌​‌‌‌‌‌‌‌‌‌‌‌‌​​​‌​‌‌‌‌‌‌‌‌‌‌‌​​​‌​‌‌‌‌‌‌‌‌‌‌‌​​​‌‌‌‌‌‌‌‌‌‌‌‌‌​​​‌‌​​‌‌‌‌‌‌‌‌‌‌​​​‌​‌‌‌‌‌‌‌‌‌‌‌​‌​​​​‌‌‌‌‌‌‌‌‌‌​‌​​​​‌‌‌‌‌‌‌‌‌​​‌‌​​​‌‌‌‌‌‌‌‌‌​​‌​‌‌​‌‌‌‌‌‌‌‌‌​​​‌​‌‌‌‌‌‌‌‌‌‌‌​​‌​‌‌‌‌‌‌‌‌‌‌‌‌​​​‌​‌​‌‌‌‌‌‌‌‌‌​​‌‌‌​‌‌‌‌‌‌‌‌‌‌‌​‌​​​‌‌‌‌‌‌‌‌‌‌​​‌‌‌​​‌‌‌‌‌‌‌‌‌​​‌​​​​‌‌‌‌‌‌‌‌‌​​‌​​‌​‌‌‌‌‌‌‌‌‌‌​‌​​​​‌‌‌‌‌‌‌‌‌​​​‌‌​‌‌‌‌‌‌‌‌‌‌​​‌​‌‌​‌‌‌‌‌‌‌‌‌​​‌‌​‌‌‌‌‌‌‌‌‌‌‌​​​‌‌‌‌‌‌‌‌‌‌‌‌‌​​‌‌‌‌​‌‌‌‌‌‌‌‌‌​​​‌​‌‌‌‌‌‌‌‌‌‌‌​​‌​‌‌‌ "Not all things need to be encrypted to be hidden. And not all who hide deceive."