Skip to content

aj47/SpeakMCP

Β 
Β 

Repository files navigation

License: AGPL-3.0 Electron TypeScript React

🎬 Preview

v1.4 image

Click here to see v1 launch video on youtube image

speakmcp-vid.mp4

πŸš€ Quick Start

Download

πŸ“₯ Download Latest Release

Platform Support: macOS (Apple Silicon & Intel) with full MCP agent functionality. ⚠️ Windows/Linux: MCP tools not currently supported β€” see v0.2.2 for dictation-only builds.

Basic Usage

Voice Recording:

  1. Hold Ctrl (macOS/Linux) or Ctrl+/ (Windows) to start recording
  2. Release to stop recording and transcribe
  3. Text is automatically inserted into your active application

MCP Agent Mode (macOS only):

  1. Hold Ctrl+Alt to start recording for agent mode
  2. Release Ctrl+Alt to process with MCP tools
  3. Watch real-time progress as the agent executes tools
  4. Results are automatically inserted or displayed

Text Input:

  • Ctrl+T (macOS/Linux) or Ctrl+Shift+T (Windows) for direct typing

✨ Features

Category Capabilities
🎀 Voice Hold-to-record, 30+ languages, Fn toggle mode, auto-insert to any app
πŸ”Š TTS 50+ AI voices via OpenAI, Groq, and Gemini with auto-play
πŸ€– MCP Agent Tool execution, OAuth 2.1 auth, real-time progress, conversation context
πŸ“Š Observability Langfuse integration for LLM tracing, token usage, and debugging
πŸ› οΈ Platform macOS/Windows/Linux, rate limit handling, multi-provider AI
🎨 UX Dark/light themes, resizable panels, kill switch, conversation history

πŸ› οΈ Development

git clone https://github.com/aj47/SpeakMCP.git && cd SpeakMCP pnpm install && pnpm build-rs && pnpm dev

See DEVELOPMENT.md for full setup, build commands, troubleshooting, and architecture details.

βš™οΈ Configuration

AI Providers β€” Configure in settings:

  • OpenAI, Groq, or Google Gemini API keys
  • Model selection per provider
  • Custom base URLs (optional)

MCP Servers β€” Add tools in mcpServers JSON format:

{ "mcpServers": { "filesystem": { "command": "npx", "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path"] } } }

Keyboard Shortcuts:

Shortcut Action
Hold Ctrl / Ctrl+/ (Win) Voice recording
Fn Toggle dictation on/off
Hold Ctrl+Alt MCP agent mode (macOS)
Ctrl+T / Ctrl+Shift+T (Win) Text input
Ctrl+Shift+Escape Kill switch

🀝 Contributing

We welcome contributions! Fork the repo, create a feature branch, and open a Pull Request.

πŸ’¬ Get help on Discord | 🌐 More info at techfren.net

πŸ“„ License

This project is licensed under the AGPL-3.0 License.

πŸ™ Acknowledgments

Built on Whispo β€’ Powered by OpenAI, Anthropic, Groq, Google β€’ MCP β€’ Electron β€’ React β€’ Rust


Made with ❀️ by the SpeakMCP team

About

Spawn agents anywhere in one keypress

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages

  • TypeScript 97.3%
  • JavaScript 0.9%
  • Rust 0.5%
  • CSS 0.4%
  • Shell 0.4%
  • PowerShell 0.3%
  • Other 0.2%