The RAGE API module provides intelligent context enrichment for LibreChat conversations by integrating with Vectorize.io to retrieve relevant information from organizational knowledge bases stored in Qdrant vector databases.
- Seamless Integration: Transparent context enrichment without user intervention
- High Performance: Sub-500ms response times with 5-second timeout protection
- Fault Tolerance: Graceful degradation when external services are unavailable
- Zero Configuration: Works out-of-the-box with environment variables
- Enterprise Ready: JWT authentication and audit logging
- LibreChat instance running
- Vectorize.io account with API access
- Qdrant vector database configured
- JWT bearer token for authentication
# Required - Core Settings RAGE_ENABLED=true # Required - Vectorize.io API Configuration RAGE_VECTORIZE_URI=https://api.vectorize.io/v1 RAGE_VECTORIZE_ORGANIZATION_ID=your_org_id RAGE_VECTORIZE_PIPELINE_ID=your_pipeline_id RAGE_VECTORIZE_API_KEY=your_jwt_token # Alternative environment variable names (legacy support) VECTORIZE_API_URL=https://api.vectorize.io/v1 VECTORIZE_ORG_ID=your_org_id VECTORIZE_PIPELINE_ID=your_pipeline_id VECTORIZE_JWT_TOKEN=your_jwt_token # Optional - Retrieval Settings RAGE_NUM_RESULTS=5 RAGE_RERANK=true RAGE_MIN_RELEVANCE_SCORE=0.7 # Optional - Performance Settings RAGE_TIMEOUT_MS=5000 RAGE_RETRY_ATTEMPTS=2 RAGE_RETRY_DELAY_MS=1000 RAGE_CACHE_TTL=300 # Optional - Debug and Logging RAGE_LOG_LEVEL=info RAGE_DEBUG=false # Optional - Advanced Settings RAGE_METRICS_ENABLED=true RAGE_CORRELATION_ID_PREFIX=rage RAGE_USER_AGENT=LibreChat-RAGE/1.0 # Optional - Feature Flags RAGE_ENABLE_CACHING=true RAGE_ENABLE_METRICS=true RAGE_ENABLE_AUDIT_LOG=falseThe RAGE Interceptor is automatically loaded when environment variables are configured. No additional setup required.
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ LibreChat │────│ RAGE Interceptor │────│ Vectorize.io │ │ BaseClient │ │ │ │ API │ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ │ ┌─────────────────┐ │ Qdrant │ │ Vector DB │ └─────────────────┘ Once configured, RAGE automatically enriches every conversation with relevant context:
- User sends a message
- RAGE intercepts the message
- Retrieves relevant documents from vector database
- Formats context for optimal LLM consumption
- Enriched message is processed by LLM
- Average Response Time: <500ms
- Timeout Protection: 5 seconds maximum
- Concurrent Requests: 100+ simultaneous
- Memory Usage: <50MB per instance
RAGE is designed to fail gracefully:
- Network timeouts → Continue without context
- API errors → Log and proceed normally
- Authentication failures → Retry with exponential backoff
- Rate limiting → Intelligent throttling
Built-in logging and metrics collection:
- Response time percentiles
- Success/failure rates
- Context relevance scores
- Error categorization
| Variable | Required | Default | Description |
|---|---|---|---|
| Core Settings | |||
RAGE_ENABLED | Yes | false | Enable/disable RAGE functionality |
| Vectorize.io API | |||
RAGE_VECTORIZE_URI | Yes | - | Vectorize.io API base URL |
RAGE_VECTORIZE_ORGANIZATION_ID | Yes | - | Organization GUID |
RAGE_VECTORIZE_PIPELINE_ID | Yes | - | Pipeline GUID |
RAGE_VECTORIZE_API_KEY | Yes | - | JWT authentication token |
| Retrieval Settings | |||
RAGE_NUM_RESULTS | No | 5 | Maximum documents to retrieve (1-20) |
RAGE_RERANK | No | true | Enable result reranking for relevance |
RAGE_MIN_RELEVANCE_SCORE | No | 0.7 | Minimum relevance threshold (0.0-1.0) |
| Performance Settings | |||
RAGE_TIMEOUT_MS | No | 5000 | API request timeout in milliseconds |
RAGE_RETRY_ATTEMPTS | No | 2 | Number of retry attempts (0-5) |
RAGE_RETRY_DELAY_MS | No | 1000 | Base delay between retries |
RAGE_CACHE_TTL | No | 300 | Cache TTL in seconds (0=disabled) |
| Debug & Logging | |||
RAGE_LOG_LEVEL | No | info | Log level (error,warn,info,debug,verbose) |
RAGE_DEBUG | No | false | Enable debug mode |
| Advanced Settings | |||
RAGE_METRICS_ENABLED | No | true | Enable metrics collection |
RAGE_CORRELATION_ID_PREFIX | No | rage | Correlation ID prefix |
RAGE_USER_AGENT | No | LibreChat-RAGE/1.0 | API request user agent |
| Feature Flags | |||
RAGE_ENABLE_CACHING | No | true | Enable response caching |
RAGE_ENABLE_METRICS | No | true | Enable performance metrics |
RAGE_ENABLE_AUDIT_LOG | No | false | Enable audit logging |
# Instant disable/enable RAGE_ENABLED=false # Development mode with verbose logging RAGE_DEBUG=true- JWT tokens stored in environment variables only
- HTTPS-only communication with external APIs
- Input sanitization and validation
- No sensitive data in logs
- Audit trail for all API calls
RAGE not working
- Check
RAGE_ENABLED=true - Verify all required environment variables
- Test JWT token validity
Slow responses
- Check network connectivity to Vectorize.io
- Monitor API response times
- Adjust
RAGE_TIMEOUTif needed
Empty context
- Verify Qdrant database has indexed documents
- Check relevance score threshold
- Review search query formatting
Enable detailed logging:
RAGE_DEBUG=trueTest RAGE connectivity:
curl -H "Authorization: Bearer $RAGE_VECTORIZE_API_KEY" \ "$RAGE_VECTORIZE_URI/org/$RAGE_VECTORIZE_ORGANIZATION_ID/pipelines/$RAGE_VECTORIZE_PIPELINE_ID/health"Test RAGE queries directly without LibreChat:
# Basic query test npm run rage:query "What is the company policy?" # JSON output format npm run rage:query "Employee handbook" --format json # Debug mode with verbose output npm run rage:query "Benefits info" --debug --format verbose # Test with mock data (no API call required) npm run rage:query "test query" --mockAvailable options:
--format- Output format:json,pretty,verbose--debug- Enable debug mode with detailed request/response--mock- Use mock responses for testing--timeout <ms>- Request timeout in milliseconds--max-results <n>- Maximum results to return--help- Display help and usage examples
See Tools Documentation for complete usage guide.
The RAGE API is organized into the following modules:
rageapi/ ├── config/ # Configuration management │ ├── index.js # Main configuration manager │ ├── schema.js # Configuration schema and validation rules │ ├── validator.js # Configuration validation logic │ └── defaults.js # Default values and profiles ├── interceptors/ # Core interceptor implementation │ └── RageInterceptor.js # Main RAGE interceptor class ├── utils/ # Utility modules │ └── vectorizeClient.js # Vectorize.io API client ├── logging/ # Logging and monitoring │ ├── logger.js # RAGE-specific logger │ └── metrics.js # Performance metrics collection ├── resilience/ # Error handling and resilience │ ├── circuitBreaker.js # Circuit breaker pattern │ ├── retryHandler.js # Retry logic with backoff │ └── timeoutHandler.js # Request timeout management ├── enrichment/ # Context enrichment logic │ ├── contextProcessor.js # Context processing and formatting │ ├── relevanceScorer.js # Relevance scoring algorithms │ └── resultFormatter.js # Result formatting for LLM consumption ├── errors/ # Custom error types │ └── rageErrors.js # RAGE-specific error definitions ├── tests/ # Test suite │ ├── *.test.js # Unit tests │ ├── *.integration.test.js # Integration tests │ └── fixtures/ # Test data and mocks ├── tools/ # Development and testing tools │ ├── rage-query.js # CLI query testing tool │ └── lib/ # Tool support libraries └── docs/ # Additional documentation └── CONFIGURATION.md # Detailed configuration guide ✅ Completed Features:
- Core RAGE Interceptor with LibreChat integration
- Configuration management with environment variable validation
- Vectorize.io API client with JWT authentication
- Error handling with circuit breaker and retry patterns
- Context enrichment with relevance scoring
- Logging and metrics collection with correlation IDs
- Comprehensive test coverage (unit and integration)
- CLI testing tool for development and debugging
🚀 Ready for Production:
- All core functionality implemented and tested
- Enterprise-grade error handling and monitoring
- Security best practices with credential protection
- Performance optimized with caching and timeouts
- Comprehensive documentation and troubleshooting guides
# Unit tests npm test -- rageapi/tests/ # Integration tests with LibreChat NODE_ENV=test npm test -- rageapi/tests/*.integration.test.js # Test RAGE interceptor specifically NODE_ENV=test npm test -- rageapi/tests/RageInterceptor.test.js # Enhanced integration tests NODE_ENV=test npx jest rageapi/tests/RageInterceptor.enhanced.test.js --no-coverage# Test RAGE query performance npm run rage:query "performance test query" --debug --format verbose # Resilience testing NODE_ENV=test npm test -- rageapi/tests/resilience.test.js# Test queries interactively npm run rage:query "What is our company policy?" --debug # Validate configuration node -e "console.log(require('./rageapi/config').configManager.initialize())" # Check RAGE status node -e "console.log(require('./rageapi/config').configManager.getSummary())"Part of LibreChat project. See main LICENSE file.