A local LLM caching reverse proxy server designed to emulate major LLM providers with advanced testing capabilities.
Rubberduck provides caching, failure simulation, rate limiting, per-user proxy instances, and detailed logging for testing and development of LLM-powered applications.
- Supports OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, and Google Vertex AI
- Perfect request/response compatibility with official SDKs
- Transparent header and authentication passthrough
Caching:

- SHA-256 cache keys based on normalized request bodies (see the sketch after this list)
- Only successful responses (2xx) are cached
- Manual cache invalidation per proxy instance
- Respects upstream provider caching headers
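As an illustration, a normalized-body SHA-256 key can be computed like the minimal sketch below. This is not Rubberduck's exact normalization logic, just the general technique:

```python
import hashlib
import json

def cache_key(body: dict) -> str:
    # Normalize: sorted keys and compact separators, so requests that differ
    # only in key order or whitespace hash to the same key.
    normalized = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

print(cache_key({"model": "gpt-4", "messages": [{"role": "user", "content": "Hello!"}]}))
```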
Failure simulation:

- Timeouts: Fixed delays or indefinite hangs
- Error Injection: Configurable HTTP error codes (429, 500, 400), each with its own injection rate (sketched after this list)
- IP Filtering: Allow/block lists with CIDR and wildcard support
- Rate Limiting: Requests-per-minute caps that mimic real provider behavior
- Response Delay: Simulate realistic LLM response times for cached responses (configurable 0-30s range)
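A rough sketch of how per-code injection rates can be applied, purely illustrative; the variable and function names are assumptions, not Rubberduck's internals:

```python
import random

# Hypothetical per-status-code injection rates (10% 429s, 5% 500s, 2% 400s).
ERROR_RATES = {429: 0.10, 500: 0.05, 400: 0.02}

def maybe_inject_error() -> int | None:
    """Return an HTTP status code to inject, or None to forward the request."""
    for status, rate in ERROR_RATES.items():
        if random.random() < rate:
            return status
    return None
```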
Logging and metrics:

- Real-time request logging with metadata
- Exportable logs (CSV/JSON)
- Rolling metrics aggregation
- Cost tracking and token usage
Web interface:

- Dashboard: Live system stats and proxy monitoring
- Proxy Management: Full lifecycle control with visual status indicators
- Logs: Real-time streaming with advanced filtering
- Settings: Global configuration and security controls
- Stripe-inspired UI: Clean, modern, responsive design
Authentication:

- Email/password + social login (Google/GitHub)
- JWT-based authentication
- Per-user proxy isolation
- Email verification and password reset
Prerequisites:

- Python 3.11+
- Node.js 18+
- Git
```bash
git clone https://github.com/your-username/rubberduck.git
cd rubberduck

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Initialize database
./scripts/fresh_install.sh
```

```bash
# Development server
python run.py

# Or with custom host/port
python run.py --host 0.0.0.0 --port 8000
```

The backend will be available at:
- API: http://localhost:9000
- Documentation: http://localhost:9000/docs
- Health Check: http://localhost:9000/healthz
```bash
cd frontend

# Install dependencies
npm install

# Start development server
npm run dev
```

The frontend will be available at: http://localhost:5173
- Open http://localhost:5173 in your browser
- Click "create a new account"
- Register with email and password
- Start creating LLM proxies!
- Web Interface: Use the "Create Proxy" button in the dashboard
- Configure: Set name, provider (OpenAI/Anthropic/etc.), model name
- Optional: Add description, tags, custom port
- Start: Click start to begin proxy on assigned port
Once a proxy is running, use it with any official LLM SDK by changing the base URL:
```python
# OpenAI SDK Example
import openai

client = openai.OpenAI(
    api_key="your-openai-key",
    base_url="http://localhost:8001"  # Your proxy port
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

```javascript
// JavaScript SDK Example
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your-openai-key',
  baseURL: 'http://localhost:8001'  // Your proxy port
});

const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello!' }]
});
```

Configure failure simulation per proxy:
- Timeouts: Add artificial delays to test timeout handling (see the example after this list)
- Error Rates: Inject 429 (rate limit), 500 (server error), 400 (bad request)
- IP Filtering: Test geographic restrictions
- Rate Limiting: Simulate provider rate limits
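For example, with a proxy configured for a fixed delay longer than your client timeout, you can verify your application's timeout handling. The port and API key below are placeholders:

```python
import openai

client = openai.OpenAI(
    api_key="your-openai-key",
    base_url="http://localhost:8001",  # proxy with a simulated delay
    timeout=5.0,                       # client timeout shorter than the delay
)

try:
    client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except openai.APITimeoutError:
    print("Timeout handled as expected")
```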
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    Frontend     │     │     Backend     │     │    Database     │
│   React + TS    │────▶│     FastAPI     │────▶│     SQLite      │
│   Port 5173     │     │    Port 8000    │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                 │
                                 ▼
                        ┌─────────────────┐
                        │ Proxy Instances │
                        │   Ports 8001+   │
                        └─────────────────┘
                                 │
                                 ▼
                        ┌─────────────────┐
                        │  LLM Providers  │
                        │  OpenAI, etc.   │
                        └─────────────────┘
```

```
rubberduck/
├── src/rubberduck/      # Python backend
│   ├── auth/            # FastAPI Users authentication
│   ├── cache/           # Response caching system
│   ├── database/        # SQLite + SQLAlchemy
│   ├── failure/         # Failure simulation engine
│   ├── logging/         # Request logging middleware
│   ├── models/          # Database models
│   ├── providers/       # LLM provider modules
│   └── proxy/           # Reverse proxy engine
├── frontend/            # React frontend
│   ├── src/components/  # Reusable UI components
│   ├── src/pages/       # Application pages
│   ├── src/contexts/    # React contexts
│   └── src/utils/       # API client and utilities
├── tests/               # Test suites
├── docs/                # Documentation
└── data/                # SQLite database files
```

```bash
# Run all backend tests
python -m pytest

# Run specific test categories
python -m pytest tests/unit/
python -m pytest tests/integration/

# Test with coverage
python -m pytest --cov=src/rubberduck
```

```bash
cd frontend

# Run tests in watch mode
npm run test

# Run tests once
npm run test:run

# Run tests with UI
npm run test:ui
```

- Proxy Lifecycle: Create, start, stop, configure proxies
- Authentication: Register, login, logout flows
- Failure Simulation: Test timeout, error injection, rate limiting
- Caching: Verify cache hits/misses (a sketch follows this list)
- Logging: Check request logging and export
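One way to check cache behavior is to send the same request twice and compare responses. This is a sketch, assuming an OpenAI-style proxy on port 8001:

```python
import requests

proxy_url = "http://localhost:8001"
body = {"model": "gpt-4", "messages": [{"role": "user", "content": "Hi"}]}
headers = {"Authorization": "Bearer your-openai-key"}

first = requests.post(f"{proxy_url}/chat/completions", json=body, headers=headers)
second = requests.post(f"{proxy_url}/chat/completions", json=body, headers=headers)

# Identical normalized bodies should hit the cache on the second call.
assert first.json() == second.json()
```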
```bash
# Install development dependencies
pip install -r requirements-dev.txt

# Run with auto-reload
uvicorn src.rubberduck.main:app --reload --host 0.0.0.0 --port 8000

# Format code
black src/
isort src/

# Type checking
mypy src/
```

```bash
cd frontend

# Install dependencies
npm install

# Start dev server with hot reload
npm run dev

# Lint code
npm run lint

# Build for production
npm run build
```

To add a new provider:

- Create a new provider module in `src/rubberduck/providers/`
- Implement the base provider interface (a hypothetical sketch follows this list)
- Add it to the provider registry
- Update the frontend provider options
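A hypothetical skeleton of a provider module; the actual base interface in `src/rubberduck/providers/` may differ, and all names here are assumptions:

```python
from dataclasses import dataclass

@dataclass
class ExampleProvider:
    """Minimal provider shape: where to forward requests and how to map paths."""
    name: str = "example"
    upstream_base_url: str = "https://api.example.com/v1"

    def map_path(self, path: str) -> str:
        # Translate an incoming proxy path to the upstream endpoint.
        return f"{self.upstream_base_url}{path}"
```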
Check service health with:

```bash
curl http://localhost:8000/healthz
```

The dashboard tracks:

- Proxy Status: Running/stopped count
- Cache Performance: Hit rates and response times
- Error Rates: Failed requests and error types
- Cost Tracking: Token usage and estimated costs
- Request Volume: RPM across all proxies
All requests are logged with the following metadata (an example record follows the list):
- Timestamp and proxy ID
- Client IP and request hash
- Response status and latency
- Cache hit/miss status
- Token usage and cost (when available)
- Failure simulation details
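For illustration, one exported record might look like the Python dict below. The field names are inferred from the list above; the real export schema may differ:

```python
log_entry = {
    "timestamp": "2024-01-01T12:00:00Z",
    "proxy_id": 3,
    "client_ip": "203.0.113.7",
    "request_hash": "sha256:...",   # hash of the normalized request body
    "status": 200,
    "latency_ms": 842,
    "cache_hit": True,
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "cost_usd": 0.00063,
    "failure_simulation": None,     # or details of the injected failure
}
```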
- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Make your changes and add tests
- Ensure all tests pass: `npm run test && python -m pytest`
- Commit your changes: `git commit -m 'Add amazing feature'`
- Push to the branch: `git push origin feature/amazing-feature`
- Open a Pull Request
- Backend: Follow PEP 8, use Black for formatting
- Frontend: Follow React best practices, use TypeScript strictly
- Testing: Write tests for all new features
- Documentation: Update README and code comments
This project is licensed under the MIT License - see the LICENSE file for details.
- FastAPI for the excellent Python web framework
- React and Tailwind CSS for the frontend
- FastAPI Users for authentication
- All the amazing LLM providers that make this possible
