Click here to see the v1 launch video on YouTube.
Platform Support: macOS (Apple Silicon & Intel) with full MCP agent functionality.
⚠️ Windows/Linux: MCP tools are not currently supported; see v0.2.2 for dictation-only builds.
Voice Recording:
- Hold `Ctrl` (macOS/Linux) or `Ctrl+/` (Windows) to start recording
- Release to stop recording and transcribe
- Text is automatically inserted into your active application
MCP Agent Mode (macOS only):
- Hold `Ctrl+Alt` to start recording for agent mode
- Release `Ctrl+Alt` to process with MCP tools
- Watch real-time progress as the agent executes tools
- Results are automatically inserted or displayed
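
The hold-and-release flow described above can be pictured as a small state machine. The following is a minimal sketch, not SpeakMCP's actual implementation: the `Mode`, `State`, `Action`, and `step` names are hypothetical, and it assumes the mode is decided by which modifiers are held at the moment recording starts.

```typescript
// Hypothetical sketch of hold-to-record; all names here are illustrative only.
type Mode = "dictation" | "agent";
type State = { kind: "idle" } | { kind: "recording"; mode: Mode };
type Action =
  | { kind: "none" }
  | { kind: "start"; mode: Mode }
  | { kind: "finish"; mode: Mode };

interface HeldKeys {
  ctrl: boolean;
  alt: boolean;
}

// Given the current state and the modifiers held after a key event,
// return the next state and the action the app should perform.
function step(state: State, keys: HeldKeys): { state: State; action: Action } {
  if (state.kind === "idle") {
    if (keys.ctrl) {
      // Assumption: Ctrl+Alt selects agent mode, Ctrl alone selects dictation.
      const mode: Mode = keys.alt ? "agent" : "dictation";
      return { state: { kind: "recording", mode }, action: { kind: "start", mode } };
    }
    return { state, action: { kind: "none" } };
  }
  // While recording, keep going only as long as the required keys stay held.
  const stillHeld = state.mode === "agent" ? keys.ctrl && keys.alt : keys.ctrl;
  if (stillHeld) {
    return { state, action: { kind: "none" } };
  }
  // Releasing the shortcut ends the recording and hands it off for processing.
  return { state: { kind: "idle" }, action: { kind: "finish", mode: state.mode } };
}
```

A real implementation would likely also need a minimum hold duration so that ordinary `Ctrl` shortcuts do not trigger recording; that detail is left out here.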
Text Input:
- `Ctrl+T` (macOS/Linux) or `Ctrl+Shift+T` (Windows) for direct typing
| Category | Capabilities |
|---|---|
| 🎤 Voice | Hold-to-record, 30+ languages, Fn toggle mode, auto-insert to any app |
| 🔊 TTS | 50+ AI voices via OpenAI, Groq, and Gemini with auto-play |
| 🤖 MCP Agent | Tool execution, OAuth 2.1 auth, real-time progress, conversation context |
| 📊 Observability | Langfuse integration for LLM tracing, token usage, and debugging |
| 🛠️ Platform | macOS/Windows/Linux, rate limit handling, multi-provider AI |
| 🎨 UX | Dark/light themes, resizable panels, kill switch, conversation history |

    git clone https://github.com/aj47/SpeakMCP.git && cd SpeakMCP
    pnpm install && pnpm build-rs && pnpm dev

See DEVELOPMENT.md for full setup, build commands, troubleshooting, and architecture details.
AI Providers (configure in settings):
- OpenAI, Groq, or Google Gemini API keys
- Model selection per provider
- Custom base URLs (optional)
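
To illustrate how an optional custom base URL might interact with per-provider defaults, here is a minimal sketch. The `ProviderId`, `ProviderSettings`, and `resolveBaseUrl` names are hypothetical and do not reflect SpeakMCP's real settings schema; the default endpoints are the providers' commonly documented ones and should be checked against each provider's documentation.

```typescript
// Hypothetical settings shape; SpeakMCP's actual schema may differ.
type ProviderId = "openai" | "groq" | "gemini";

interface ProviderSettings {
  apiKey: string;
  model: string;
  baseUrl?: string; // optional override, e.g. a proxy or self-hosted gateway
}

// Assumed defaults; verify against each provider's documentation.
const DEFAULT_BASE_URLS: Record<ProviderId, string> = {
  openai: "https://api.openai.com/v1",
  groq: "https://api.groq.com/openai/v1",
  gemini: "https://generativelanguage.googleapis.com/v1beta",
};

// Prefer a non-empty custom URL, otherwise fall back to the provider default.
// Trailing slashes are stripped so request paths can be appended consistently.
function resolveBaseUrl(id: ProviderId, settings: ProviderSettings): string {
  const custom = settings.baseUrl?.trim();
  const url = custom ? custom : DEFAULT_BASE_URLS[id];
  return url.replace(/\/+$/, "");
}
```

Leaving `baseUrl` unset or blank falls through to the default, which matches the "optional" wording in the list above.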
MCP Servers (add tools in `mcpServers` JSON format):

    {
      "mcpServers": {
        "filesystem": {
          "command": "npx",
          "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path"]
        }
      }
    }

Keyboard Shortcuts:

| Shortcut | Action |
|---|---|
| Hold `Ctrl` / `Ctrl+/` (Win) | Voice recording |
| `Fn` | Toggle dictation on/off |
| Hold `Ctrl+Alt` | MCP agent mode (macOS) |
| `Ctrl+T` / `Ctrl+Shift+T` (Win) | Text input |
| `Ctrl+Shift+Escape` | Kill switch |
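
Returning to the `mcpServers` JSON format shown in the configuration section, it can help to validate a config before pasting it into the app. The sketch below is illustrative only: `parseMcpConfig` and `describeServers` are hypothetical helpers, not part of SpeakMCP, and they check only the `command` and `args` fields that appear in the example.

```typescript
// Hypothetical validator for the mcpServers format; not SpeakMCP's own code.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Parse a JSON string and check that every server has a command and string args.
function parseMcpConfig(json: string): McpConfig {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("config must be a JSON object");
  }
  const servers = (data as { mcpServers?: unknown }).mcpServers;
  if (typeof servers !== "object" || servers === null || Array.isArray(servers)) {
    throw new Error('config must contain an "mcpServers" object');
  }
  const result: Record<string, McpServerEntry> = {};
  for (const [name, raw] of Object.entries(servers as Record<string, unknown>)) {
    if (typeof raw !== "object" || raw === null) {
      throw new Error(`server "${name}" must be an object`);
    }
    const { command, args } = raw as { command?: unknown; args?: unknown };
    if (typeof command !== "string" || command.length === 0) {
      throw new Error(`server "${name}" needs a non-empty "command"`);
    }
    if (args !== undefined && !(Array.isArray(args) && args.every((a) => typeof a === "string"))) {
      throw new Error(`server "${name}" has invalid "args"`);
    }
    // Missing args are treated as an empty list.
    result[name] = { command, args: (args as string[] | undefined) ?? [] };
  }
  return { mcpServers: result };
}

// Render each server as the command line it would launch, for a quick visual check.
function describeServers(config: McpConfig): string[] {
  return Object.entries(config.mcpServers).map(
    ([name, entry]) => `${name}: ${[entry.command, ...entry.args].join(" ")}`
  );
}
```

For the `filesystem` example, `describeServers` yields a single line showing the `npx` invocation with its three arguments, which makes a mistyped path or package name easy to spot.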
We welcome contributions! Fork the repo, create a feature branch, and open a Pull Request.
💬 Get help on Discord | 🌐 More info at techfren.net
This project is licensed under the AGPL-3.0 License.
Built on Whispo • Powered by OpenAI, Anthropic, Groq, Google • MCP • Electron • React • Rust
Made with ❤️ by the SpeakMCP team
