Voice-to-prompt pipeline: wake word detection → streaming STT → Claude refinement.
```mermaid
flowchart LR
    subgraph Input
        M[🎙️ Microphone]
    end
    subgraph Pipeline
        W[Wake Word\nDetection]
        T[Streaming\nTranscription]
        C[Claude\nRefinement]
    end
    subgraph Output
        CB[📋 Clipboard]
    end
    M --> W --> T --> C --> CB
```

- Wake word activation — Say "Hey Jarvis" to start (hands-free)
- Real-time streaming — See transcription as you speak
- Best-in-class accuracy — Parakeet TDT 0.6B v3 (#1 on HuggingFace Open ASR Leaderboard)
- Auto endpoint detection — Recording stops automatically when you stop speaking
- Claude refinement — Transform raw dictation into polished LLM prompts
- GPU-accelerated — <100ms inference on RTX 4090
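The end-to-end flow above can be sketched as a chain of async stages. Everything below is an illustrative stub: the function names and the string-based "audio" are stand-ins, not barkdown's real API.

```python
import asyncio

async def mic(samples):
    # Stand-in for a live microphone stream
    for s in samples:
        yield s

async def detect_wake_word(chunks):
    # Consume chunks until the (stubbed) wake phrase appears
    async for chunk in chunks:
        if chunk == "hey jarvis":
            return

async def transcribe(chunks):
    # Collect the remaining chunks into a raw transcript
    return " ".join([c async for c in chunks])

async def refine(text):
    # Stand-in for the Claude refinement call
    return f"Refined: {text}"

async def run():
    stream = mic(["noise", "hey jarvis", "write", "a", "haiku"])
    await detect_wake_word(stream)   # blocks until activation
    raw = await transcribe(stream)   # resumes where the wake stage stopped
    return await refine(raw)         # polished prompt, ready for the clipboard

print(asyncio.run(run()))  # Refined: write a haiku
```

Note that the wake-word and transcription stages share one stream: breaking out of `async for` leaves the generator suspended, so the next stage picks up exactly where the previous one stopped.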
The project uses a Nix flake with devenv for reproducible development environments with CUDA support.
Prerequisites:
- Nix with flakes enabled
- direnv with nix-direnv
- NVIDIA GPU with drivers installed
```bash
# Clone and enter the directory
git clone https://github.com/lokaltog/barkdown.git
cd barkdown

# Allow direnv (loads the flake automatically)
direnv allow

# The environment provides:
# - Python 3.13 with uv
# - CUDA 12.x + cuDNN
# - PortAudio (connects to PulseAudio/PipeWire)
# - Audio tools (ffmpeg, sox)
```

On first load, direnv activates the flake and displays:
```text
🎤 Parakeet ASR Development Environment
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Python: Python 3.13.x
CUDA:   12.x.x
cuDNN:  9.x.x
GPU:    NVIDIA GeForce RTX 4090
```

```bash
# Requires Python 3.13+, CUDA 12.x, and system audio libraries
pip install -e .

# or with uv
uv sync
```

```bash
barkdown                # Default loop mode
barkdown --auto-refine  # Skip confirmation, auto-send to Claude
barkdown --studio       # Studio preset (aggressive VAD)
barkdown --long-form    # Long recording preset
barkdown --no-loop      # Single transcription, then exit
```

```mermaid
flowchart TB
    subgraph TUI["Terminal UI (Textual)"]
        APP[BarkdownApp]
        PC[PipelineController]
        SP[Spectrogram]
        HP[History Panel]
        PP[Preview Panel]
    end
    subgraph Pipeline["Pipeline (Async)"]
        ORCH[PipelineOrchestrator]
        WWD[WakeWordDetector\nOpenWakeWord/CPU]
        STR[StreamingTranscriber\nParakeet TDT/GPU]
        REF[ClaudeRefiner\nclaude-agent-sdk]
        VAD[Silero VAD]
    end
    subgraph Persistence
        HIST[(history.json)]
    end
    APP --> PC
    PC <-->|events| ORCH
    ORCH --> WWD
    ORCH --> STR
    ORCH --> REF
    STR --> VAD
    APP --> SP
    APP --> HP
    APP --> PP
    HP <--> HIST
```

| Component | Technology | Purpose |
|---|---|---|
| Wake Word | OpenWakeWord (CPU) | Detects "Hey Jarvis" activation phrase |
| ASR | NVIDIA Parakeet TDT 0.6B (GPU) | Streaming speech-to-text |
| VAD | Silero VAD | Voice activity detection for auto-stop |
| Refinement | Claude via claude-agent-sdk | Transforms dictation into LLM prompts |
| TUI | Textual | Async terminal interface |
| Persistence | JSON | History storage at ~/.local/share/barkdown/history.json |
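History persistence is a plain JSON file; appending a record can be sketched as below. The entry schema (`raw`/`refined` keys) is hypothetical, not barkdown's actual format.

```python
import json
from pathlib import Path

# Default location from the table above
HISTORY_PATH = Path.home() / ".local/share/barkdown/history.json"

def append_history(raw: str, refined: str, path: Path = HISTORY_PATH) -> None:
    # Read the existing list (or start fresh), append, and rewrite atomically enough
    # for a single-user TUI; a real implementation might lock or use a temp file.
    path.parent.mkdir(parents=True, exist_ok=True)
    entries = json.loads(path.read_text()) if path.exists() else []
    entries.append({"raw": raw, "refined": refined})
    path.write_text(json.dumps(entries, indent=2))
```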
The pipeline operates as an event-driven state machine:
```mermaid
stateDiagram-v2
    [*] --> IDLE
    IDLE --> LOADING: StartListening
    LOADING --> LISTENING_WAKE: ModelsLoaded
    LISTENING_WAKE --> RECORDING: WakeWordDetected
    LISTENING_WAKE --> IDLE: AbortOperation
    RECORDING --> TRANSCRIBING: SilenceDetected
    RECORDING --> IDLE: AbortOperation
    TRANSCRIBING --> AWAITING_CONFIRM: !auto_refine
    TRANSCRIBING --> REFINING: auto_refine
    TRANSCRIBING --> IDLE: AbortOperation
    AWAITING_CONFIRM --> REFINING: ConfirmRefinement(yes)
    AWAITING_CONFIRM --> COMPLETE: ConfirmRefinement(no)
    AWAITING_CONFIRM --> IDLE: AbortOperation
    REFINING --> COMPLETE: RefinementDone
    REFINING --> IDLE: AbortOperation
    COMPLETE --> LISTENING_WAKE: ContinueLoop
    COMPLETE --> IDLE: !loop_mode
    COMPLETE --> REFINING: RetryRefinement
    ERROR --> IDLE: ResetFromError

    note right of LISTENING_WAKE: CPU-only\n(OpenWakeWord)
    note right of RECORDING: VAD monitors\nfor silence
    note right of TRANSCRIBING: GPU inference\n(Parakeet)
    note right of REFINING: Claude API call
```

| State | Description |
|---|---|
| IDLE | Waiting for user to initiate |
| LOADING | Loading ML models (ASR, wake word) |
| LISTENING_WAKE | Listening for wake word on CPU |
| RECORDING | Recording audio while VAD monitors for silence |
| TRANSCRIBING | Processing audio through Parakeet ASR |
| AWAITING_CONFIRM | Showing transcription, waiting for user to confirm refinement |
| REFINING | Claude is transforming raw text into a polished prompt |
| COMPLETE | Output ready, copied to clipboard |
| ERROR | Error state; can reset to IDLE |
User Actions:
- `StartListening` — Begin the pipeline
- `AbortOperation` — Cancel the current operation (works from any interruptible state)
- `ConfirmRefinement` — Confirm or skip Claude refinement
- `RetryRefinement` — Re-run refinement from the COMPLETE state
- `ResetFromError` — Clear the error and return to IDLE
Internal Events:
- `ModelsLoaded` — Models ready, proceed to listening
- `WakeWordDetected` — Wake phrase detected, start recording
- `SilenceDetected` — Silence threshold reached, stop recording
- `TranscriptionComplete` — ASR finished
- `RefinementDone` — Claude refinement complete
- `ContinueLoop` — Continue to the next iteration (loop mode)
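The states and events above can be modeled as a simple transition table. This is an illustrative toy, not barkdown's implementation: guarded transitions (`auto_refine`, `loop_mode`) are folded into distinct event names for simplicity.

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    LOADING = auto()
    LISTENING_WAKE = auto()
    RECORDING = auto()
    TRANSCRIBING = auto()
    AWAITING_CONFIRM = auto()
    REFINING = auto()
    COMPLETE = auto()
    ERROR = auto()

S = State
TRANSITIONS: dict[tuple[State, str], State] = {
    (S.IDLE, "StartListening"): S.LOADING,
    (S.LOADING, "ModelsLoaded"): S.LISTENING_WAKE,
    (S.LISTENING_WAKE, "WakeWordDetected"): S.RECORDING,
    (S.RECORDING, "SilenceDetected"): S.TRANSCRIBING,
    # Assuming auto_refine is off; with it on, this would go to REFINING
    (S.TRANSCRIBING, "TranscriptionComplete"): S.AWAITING_CONFIRM,
    (S.AWAITING_CONFIRM, "ConfirmRefinement:yes"): S.REFINING,
    (S.AWAITING_CONFIRM, "ConfirmRefinement:no"): S.COMPLETE,
    (S.REFINING, "RefinementDone"): S.COMPLETE,
    (S.COMPLETE, "ContinueLoop"): S.LISTENING_WAKE,
    (S.COMPLETE, "RetryRefinement"): S.REFINING,
    (S.ERROR, "ResetFromError"): S.IDLE,
}
# AbortOperation returns to IDLE from any interruptible state
for s in (S.LISTENING_WAKE, S.RECORDING, S.TRANSCRIBING,
          S.AWAITING_CONFIRM, S.REFINING):
    TRANSITIONS[(s, "AbortOperation")] = S.IDLE

def step(state: State, event: str) -> State:
    # Unknown (state, event) pairs are rejected rather than silently ignored
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.name} on {event}") from None
```

Keeping transitions in an explicit table makes illegal moves (say, refining before transcription has finished) fail loudly instead of corrupting pipeline state.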
```mermaid
sequenceDiagram
    autonumber
    participant User
    participant TUI as BarkdownApp
    participant Orch as PipelineOrchestrator
    participant Wake as WakeWordDetector
    participant ASR as StreamingTranscriber
    participant Claude as ClaudeRefiner
    participant Clip as Clipboard

    User->>TUI: Press Enter
    TUI->>Orch: StartListening
    Orch->>Orch: Load models (if needed)
    Orch->>Wake: Start detection
    User->>Wake: "Hey Jarvis"
    Wake-->>Orch: WakeWordDetected
    Orch->>ASR: Start streaming
    loop While speaking
        User->>ASR: Audio chunks
        ASR-->>TUI: Partial transcription
        ASR->>ASR: VAD monitoring
    end
    ASR-->>Orch: SilenceDetected
    Orch->>ASR: Finalize
    ASR-->>Orch: TranscriptionComplete
    alt auto_refine = false
        Orch-->>TUI: AWAITING_CONFIRM
        TUI->>User: Show transcription
        User->>TUI: Confirm (Y)
        TUI->>Orch: ConfirmRefinement(yes)
    end
    Orch->>Claude: Refine raw text
    Claude-->>Orch: RefinementDone
    Orch-->>TUI: COMPLETE
    TUI->>Clip: Copy refined text
    TUI->>User: Show result
```

Configuration via `~/.config/barkdown/config.toml` or environment variables (`BARKDOWN_*`):
```toml
# Audio
sample_rate = 16000
channels = 1

# Wake word
wake_word_model = "hey_jarvis"
wake_word_threshold = 0.5

# ASR
parakeet_model = "nvidia/parakeet-tdt-0.6b-v3"

# VAD / Recording
vad_threshold = 0.5
silence_threshold_sec = 1.5
max_recording_sec = 60.0

# Claude
claude_model = "claude-opus-4-5-20251101"
auto_refine = false

# Audio processing
normalize_audio = false
strip_silence = true
```

Studio (`--studio`) — For high-quality studio equipment:
- Higher VAD threshold (0.7)
- Faster endpoint detection (1.0s silence)
- LUFS normalization enabled
Long-form (`--long-form`) — For meetings and lectures:
- Lower VAD threshold (0.3)
- Longer silence tolerance (3.0s)
- 1 hour max recording
- Auto-chunking for memory efficiency
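The presets above amount to overriding a few recording defaults. A sketch using a frozen dataclass follows; the dataclass itself is illustrative, but the values come from the config and preset sections above.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class RecordingConfig:
    # Defaults match the config.toml sample above
    vad_threshold: float = 0.5
    silence_threshold_sec: float = 1.5
    max_recording_sec: float = 60.0
    normalize_audio: bool = False

DEFAULT = RecordingConfig()

# --studio: stricter VAD, faster endpointing, normalization on
STUDIO = replace(DEFAULT, vad_threshold=0.7, silence_threshold_sec=1.0,
                 normalize_audio=True)

# --long-form: permissive VAD, long silences tolerated, 1 hour cap
LONG_FORM = replace(DEFAULT, vad_threshold=0.3, silence_threshold_sec=3.0,
                    max_recording_sec=3600.0)
```

`replace` keeps presets as deltas over one set of defaults, so a new default value automatically flows into every preset that doesn't override it.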
| Key | Action |
|---|---|
| Enter | Start listening / Confirm |
| Y | Confirm refinement |
| N | Skip refinement |
| Escape | Abort current operation |
| R | Retry refinement |
| C | Copy refined text |
| Shift+C | Copy raw transcription |
| ? | Show help |
| Q | Quit |
Hardware:
- NVIDIA GPU with CUDA 12.x support
- ~4GB VRAM (Parakeet TDT 0.6B)
- Microphone
Software (handled by Nix flake):
- Python 3.13+
- CUDA 12.x + cuDNN
- PulseAudio/PipeWire + PortAudio
API Keys:
- `ANTHROPIC_API_KEY` — for Claude refinement
```bash
# direnv auto-loads the environment on cd
cd barkdown

# Install Python dependencies (auto-runs via devenv)
uv sync --extra dev

# Run the app
barkdown

# Run tests (use cuda-uv for GPU tests)
cuda-uv run pytest

# Type checking
pyright

# Linting
ruff check .
```

Ensure CUDA 12.x, cuDNN, and PortAudio are installed system-wide, then:
```bash
uv sync --extra dev
uv run barkdown
```

- Async-first — All pipeline components use async/await, with thread executors for blocking I/O
- Process isolation — The pipeline runs in a subprocess, because PyTorch/CUDA calls can block the asyncio event loop even when wrapped in executors
- Event-driven — State machine with explicit transitions and events
- Lazy loading — Heavy dependencies (NeMo, OpenWakeWord) imported on first use for fast startup
- Non-blocking TUI — Textual message passing keeps UI responsive
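The lazy-loading and executor principles above can be illustrated with a stub model; in the real app the deferred work would be the NeMo/torch import and GPU inference, not the toy below.

```python
import asyncio
import time
from functools import cache

@cache
def load_model():
    # Heavy initialization deferred until first call, cached afterwards,
    # so startup stays fast when transcription is never triggered
    time.sleep(0.05)  # stand-in for a slow model load
    return lambda audio: audio.upper()

async def transcribe(audio: str) -> str:
    loop = asyncio.get_running_loop()
    # Both the (lazy) load and the blocking inference run off the event loop
    model = await loop.run_in_executor(None, load_model)
    return await loop.run_in_executor(None, model, audio)

print(asyncio.run(transcribe("hello")))  # HELLO
```

`functools.cache` gives the one-time load for free, and `run_in_executor` keeps the Textual event loop responsive while the blocking call runs on a worker thread.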
MIT