GitHub - aj47/SpeakMCP: Spawn agents anywhere in one keypress

🎬 Preview

v1.4

Watch the v1 launch video on YouTube.


🚀 Quick Start

Download

Platform Support: macOS (Apple Silicon and Intel) has full MCP agent functionality.
⚠️ Windows/Linux: MCP tools are not currently supported; see v0.2.2 for dictation-only builds.

Basic Usage

Voice Recording:

Hold Ctrl (macOS/Linux) or Ctrl+/ (Windows) to start recording
Release to stop recording and transcribe
Text is automatically inserted into your active application

MCP Agent Mode (macOS only):

Hold Ctrl+Alt to start recording for agent mode
Release Ctrl+Alt to process with MCP tools
Watch real-time progress as the agent executes tools
Results are automatically inserted or displayed
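
The hold-to-record behavior described above can be pictured as a small state machine. The following is a minimal sketch, not SpeakMCP's actual implementation: the `State`, `KeyEvent`, `Action`, and `step` names are all hypothetical, and the sketch assumes that each key event reports the modifier state after the event and that releasing any key of the chord ends the recording.

```typescript
// Hypothetical sketch: none of these names come from the SpeakMCP codebase.
type Mode = "dictation" | "agent";
type State = { recording: false } | { recording: true; mode: Mode };
// ctrl/alt describe which modifiers are held AFTER the event (assumption).
type KeyEvent = { type: "down" | "up"; ctrl: boolean; alt: boolean };
type Action = "start" | "transcribe-and-insert" | "run-agent" | "none";

// Decide what a key event should do, given the current recorder state.
function step(state: State, ev: KeyEvent): { state: State; action: Action } {
  if (state.recording === false) {
    // A press with Ctrl held starts a recording; Alt selects agent mode.
    if (ev.type === "down" && ev.ctrl) {
      const mode: Mode = ev.alt ? "agent" : "dictation";
      return { state: { recording: true, mode }, action: "start" };
    }
    return { state, action: "none" };
  }
  if (ev.type === "down") {
    // Ctrl was pressed first and Alt joined later: switch to agent mode.
    if (state.mode === "dictation" && ev.ctrl && ev.alt) {
      return { state: { recording: true, mode: "agent" }, action: "none" };
    }
    return { state, action: "none" };
  }
  // Releasing a key stops the recording and hands off to the right pipeline.
  const action: Action =
    state.mode === "agent" ? "run-agent" : "transcribe-and-insert";
  return { state: { recording: false }, action };
}
```

Keeping the transition logic in a pure function like this makes it easy to test without a real keyboard hook; the platform-specific key listener would only need to translate native events into `KeyEvent` values.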

Text Input:

Ctrl+T (macOS/Linux) or Ctrl+Shift+T (Windows) for direct typing

✨ Features

Category	Capabilities
🎤 Voice	Hold-to-record, 30+ languages, Fn toggle mode, auto-insert to any app
🔊 TTS	50+ AI voices via OpenAI, Groq, and Gemini with auto-play
🤖 MCP Agent	Tool execution, OAuth 2.1 auth, real-time progress, conversation context
📊 Observability	Langfuse integration for LLM tracing, token usage, and debugging
🛠️ Platform	macOS/Windows/Linux, rate limit handling, multi-provider AI
🎨 UX	Dark/light themes, resizable panels, kill switch, conversation history

🛠️ Development

git clone https://github.com/aj47/SpeakMCP.git && cd SpeakMCP
pnpm install && pnpm build-rs && pnpm dev

See DEVELOPMENT.md for full setup, build commands, troubleshooting, and architecture details.

⚙️ Configuration

AI Providers — Configure in settings:

OpenAI, Groq, or Google Gemini API keys
Model selection per provider
Custom base URLs (optional)
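
To illustrate how a per-provider model and an optional custom base URL might fit together, here is a minimal sketch. The `ProviderSettings` shape, the `resolveBaseUrl` helper, and the field names are hypothetical and not taken from SpeakMCP; the default URLs are the providers' publicly documented API endpoints, and whether the app uses exactly these is an assumption.

```typescript
// Hypothetical settings shape; field names are illustrative, not SpeakMCP's.
type ProviderId = "openai" | "groq" | "gemini";

interface ProviderSettings {
  apiKey: string;
  model: string; // model chosen for this provider
  baseUrl?: string; // optional override, e.g. a proxy or self-hosted gateway
}

// Assumed defaults: each provider's publicly documented endpoint.
const DEFAULT_BASE_URLS: Record<ProviderId, string> = {
  openai: "https://api.openai.com/v1",
  groq: "https://api.groq.com/openai/v1",
  gemini: "https://generativelanguage.googleapis.com/v1beta",
};

// Use the custom base URL when one is set, otherwise fall back to the default.
// Trailing slashes are stripped so request paths can be appended uniformly.
function resolveBaseUrl(id: ProviderId, settings: ProviderSettings): string {
  const url = settings.baseUrl?.trim() || DEFAULT_BASE_URLS[id];
  return url.replace(/\/+$/, "");
}
```

Treating an empty or whitespace-only `baseUrl` the same as an unset one avoids a common failure where a cleared settings field silently produces requests to an empty host.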

MCP Servers — Add tools in mcpServers JSON format:

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path"]
    }
  }
}
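
Before pasting a config like the one above into settings, it can help to check its shape. The sketch below is one way to do that and is not part of SpeakMCP: `parseMcpServers` and `McpServerEntry` are hypothetical names, and the validation only covers the `command` and `args` fields shown in the example (real configs may carry other fields that this sketch ignores).

```typescript
// Hypothetical helper: validates only the fields shown in the README example.
interface McpServerEntry {
  command: string;
  args: string[];
}

function parseMcpServers(json: string): Record<string, McpServerEntry> {
  const parsed: unknown = JSON.parse(json); // throws SyntaxError on invalid JSON
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("config must be a JSON object");
  }
  const servers = (parsed as Record<string, unknown>)["mcpServers"];
  if (typeof servers !== "object" || servers === null || Array.isArray(servers)) {
    throw new Error('config must contain an "mcpServers" object');
  }
  const result: Record<string, McpServerEntry> = {};
  for (const [name, value] of Object.entries(servers as Record<string, unknown>)) {
    if (typeof value !== "object" || value === null) {
      throw new Error(`server "${name}" must be an object`);
    }
    const entry = value as { command?: unknown; args?: unknown };
    const command = entry.command;
    if (typeof command !== "string") {
      throw new Error(`server "${name}" needs a string "command"`);
    }
    // Treat a missing "args" as an empty argument list.
    const args = entry.args ?? [];
    if (!Array.isArray(args) || !args.every((a) => typeof a === "string")) {
      throw new Error(`server "${name}" needs "args" to be an array of strings`);
    }
    result[name] = { command, args: args as string[] };
  }
  return result;
}

// Usage with the filesystem server from the example config.
const exampleServers = parseMcpServers(`{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path"]
    }
  }
}`);
console.log(Object.keys(exampleServers));
```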

Keyboard Shortcuts:

Shortcut	Action
Hold `Ctrl` / `Ctrl+/` (Win)	Voice recording
`Fn`	Toggle dictation on/off
Hold `Ctrl+Alt`	MCP agent mode (macOS)
`Ctrl+T` / `Ctrl+Shift+T` (Win)	Text input
`Ctrl+Shift+Escape`	Kill switch

🤝 Contributing

We welcome contributions! Fork the repo, create a feature branch, and open a Pull Request.

💬 Get help on Discord | 🌐 More info at techfren.net

📄 License

This project is licensed under the AGPL-3.0 License.

🙏 Acknowledgments

Built on Whispo • Powered by OpenAI, Anthropic, Groq, Google • MCP • Electron • React • Rust

Made with ❤️ by the SpeakMCP team
