Name	Name	Last commit message	Last commit date
parent directory ..
config	config
docs	docs
enrichment	enrichment
errors	errors
interceptors	interceptors
logging	logging
resilience	resilience
tests	tests
tools	tools
utils	utils
README.md	README.md

RAGE API - Retrieval Augmented Generation Enhancement

Overview

The RAGE API module provides intelligent context enrichment for LibreChat conversations by integrating with Vectorize.io to retrieve relevant information from organizational knowledge bases stored in Qdrant vector databases.

Features

Seamless Integration: Transparent context enrichment without user intervention
High Performance: Sub-500ms response times with 5-second timeout protection
Fault Tolerance: Graceful degradation when external services are unavailable
Zero Configuration: Works out-of-the-box with environment variables
Enterprise Ready: JWT authentication and audit logging

Quick Start

Prerequisites

LibreChat instance running
Vectorize.io account with API access
Qdrant vector database configured
JWT bearer token for authentication

Environment Variables

# Required - Core Settings RAGE_ENABLED=true # Required - Vectorize.io API Configuration RAGE_VECTORIZE_URI=https://api.vectorize.io/v1 RAGE_VECTORIZE_ORGANIZATION_ID=your_org_id RAGE_VECTORIZE_PIPELINE_ID=your_pipeline_id RAGE_VECTORIZE_API_KEY=your_jwt_token # Alternative environment variable names (legacy support) VECTORIZE_API_URL=https://api.vectorize.io/v1 VECTORIZE_ORG_ID=your_org_id VECTORIZE_PIPELINE_ID=your_pipeline_id VECTORIZE_JWT_TOKEN=your_jwt_token # Optional - Retrieval Settings RAGE_NUM_RESULTS=5 RAGE_RERANK=true RAGE_MIN_RELEVANCE_SCORE=0.7 # Optional - Performance Settings RAGE_TIMEOUT_MS=5000 RAGE_RETRY_ATTEMPTS=2 RAGE_RETRY_DELAY_MS=1000 RAGE_CACHE_TTL=300 # Optional - Debug and Logging RAGE_LOG_LEVEL=info RAGE_DEBUG=false # Optional - Advanced Settings RAGE_METRICS_ENABLED=true RAGE_CORRELATION_ID_PREFIX=rage RAGE_USER_AGENT=LibreChat-RAGE/1.0 # Optional - Feature Flags RAGE_ENABLE_CACHING=true RAGE_ENABLE_METRICS=true RAGE_ENABLE_AUDIT_LOG=false

Installation

The RAGE Interceptor is automatically loaded when environment variables are configured. No additional setup required.

Architecture

┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ LibreChat │────│ RAGE Interceptor │────│ Vectorize.io │ │ BaseClient │ │ │ │ API │ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ │ ┌─────────────────┐ │ Qdrant │ │ Vector DB │ └─────────────────┘

Usage

Once configured, RAGE automatically enriches every conversation with relevant context:

User sends a message
RAGE intercepts the message
Retrieves relevant documents from vector database
Formats context for optimal LLM consumption
Enriched message is processed by LLM

Performance

Average Response Time: <500ms
Timeout Protection: 5 seconds maximum
Concurrent Requests: 100+ simultaneous
Memory Usage: <50MB per instance

Error Handling

RAGE is designed to fail gracefully:

Network timeouts → Continue without context
API errors → Log and proceed normally
Authentication failures → Retry with exponential backoff
Rate limiting → Intelligent throttling

Monitoring

Built-in logging and metrics collection:

Response time percentiles
Success/failure rates
Context relevance scores
Error categorization

Configuration

Configuration Reference

Variable	Required	Default	Description
Core Settings
`RAGE_ENABLED`	Yes	`false`	Enable/disable RAGE functionality
Vectorize.io API
`RAGE_VECTORIZE_URI`	Yes	-	Vectorize.io API base URL
`RAGE_VECTORIZE_ORGANIZATION_ID`	Yes	-	Organization GUID
`RAGE_VECTORIZE_PIPELINE_ID`	Yes	-	Pipeline GUID
`RAGE_VECTORIZE_API_KEY`	Yes	-	JWT authentication token
Retrieval Settings
`RAGE_NUM_RESULTS`	No	`5`	Maximum documents to retrieve (1-20)
`RAGE_RERANK`	No	`true`	Enable result reranking for relevance
`RAGE_MIN_RELEVANCE_SCORE`	No	`0.7`	Minimum relevance threshold (0.0-1.0)
Performance Settings
`RAGE_TIMEOUT_MS`	No	`5000`	API request timeout in milliseconds
`RAGE_RETRY_ATTEMPTS`	No	`2`	Number of retry attempts (0-5)
`RAGE_RETRY_DELAY_MS`	No	`1000`	Base delay between retries
`RAGE_CACHE_TTL`	No	`300`	Cache TTL in seconds (0=disabled)
Debug & Logging
`RAGE_LOG_LEVEL`	No	`info`	Log level (error,warn,info,debug,verbose)
`RAGE_DEBUG`	No	`false`	Enable debug mode
Advanced Settings
`RAGE_METRICS_ENABLED`	No	`true`	Enable metrics collection
`RAGE_CORRELATION_ID_PREFIX`	No	`rage`	Correlation ID prefix
`RAGE_USER_AGENT`	No	`LibreChat-RAGE/1.0`	API request user agent
Feature Flags
`RAGE_ENABLE_CACHING`	No	`true`	Enable response caching
`RAGE_ENABLE_METRICS`	No	`true`	Enable performance metrics
`RAGE_ENABLE_AUDIT_LOG`	No	`false`	Enable audit logging

Feature Flags

# Instant disable/enable RAGE_ENABLED=false # Development mode with verbose logging RAGE_DEBUG=true

Security

JWT tokens stored in environment variables only
HTTPS-only communication with external APIs
Input sanitization and validation
No sensitive data in logs
Audit trail for all API calls

Troubleshooting

Common Issues

RAGE not working

Check RAGE_ENABLED=true
Verify all required environment variables
Test JWT token validity

Slow responses

Check network connectivity to Vectorize.io
Monitor API response times
Adjust RAGE_TIMEOUT if needed

Empty context

Verify Qdrant database has indexed documents
Check relevance score threshold
Review search query formatting

Debug Mode

Enable detailed logging:

RAGE_DEBUG=true

Health Check

Test RAGE connectivity:

curl -H "Authorization: Bearer $RAGE_VECTORIZE_API_KEY" \ "$RAGE_VECTORIZE_URI/org/$RAGE_VECTORIZE_ORGANIZATION_ID/pipelines/$RAGE_VECTORIZE_PIPELINE_ID/health"

RAGE Query Tool

Test RAGE queries directly without LibreChat:

# Basic query test npm run rage:query "What is the company policy?" # JSON output format npm run rage:query "Employee handbook" --format json # Debug mode with verbose output  npm run rage:query "Benefits info" --debug --format verbose # Test with mock data (no API call required) npm run rage:query "test query" --mock

Available options:

--format - Output format: json, pretty, verbose
--debug - Enable debug mode with detailed request/response
--mock - Use mock responses for testing
--timeout <ms> - Request timeout in milliseconds
--max-results <n> - Maximum results to return
--help - Display help and usage examples

See Tools Documentation for complete usage guide.

Module Structure

The RAGE API is organized into the following modules:

rageapi/ ├── config/ # Configuration management │ ├── index.js # Main configuration manager │ ├── schema.js # Configuration schema and validation rules │ ├── validator.js # Configuration validation logic │ └── defaults.js # Default values and profiles ├── interceptors/ # Core interceptor implementation │ └── RageInterceptor.js # Main RAGE interceptor class ├── utils/ # Utility modules │ └── vectorizeClient.js # Vectorize.io API client ├── logging/ # Logging and monitoring │ ├── logger.js # RAGE-specific logger │ └── metrics.js # Performance metrics collection ├── resilience/ # Error handling and resilience │ ├── circuitBreaker.js # Circuit breaker pattern │ ├── retryHandler.js # Retry logic with backoff │ └── timeoutHandler.js # Request timeout management ├── enrichment/ # Context enrichment logic │ ├── contextProcessor.js # Context processing and formatting │ ├── relevanceScorer.js # Relevance scoring algorithms │ └── resultFormatter.js # Result formatting for LLM consumption ├── errors/ # Custom error types │ └── rageErrors.js # RAGE-specific error definitions ├── tests/ # Test suite │ ├── *.test.js # Unit tests │ ├── *.integration.test.js # Integration tests │ └── fixtures/ # Test data and mocks ├── tools/ # Development and testing tools │ ├── rage-query.js # CLI query testing tool │ └── lib/ # Tool support libraries └── docs/ # Additional documentation └── CONFIGURATION.md # Detailed configuration guide

Implementation Status

✅ Completed Features:

Core RAGE Interceptor with LibreChat integration
Configuration management with environment variable validation
Vectorize.io API client with JWT authentication
Error handling with circuit breaker and retry patterns
Context enrichment with relevance scoring
Logging and metrics collection with correlation IDs
Comprehensive test coverage (unit and integration)
CLI testing tool for development and debugging

🚀 Ready for Production:

All core functionality implemented and tested
Enterprise-grade error handling and monitoring
Security best practices with credential protection
Performance optimized with caching and timeouts
Comprehensive documentation and troubleshooting guides

Development

Testing

# Unit tests npm test -- rageapi/tests/ # Integration tests with LibreChat NODE_ENV=test npm test -- rageapi/tests/*.integration.test.js # Test RAGE interceptor specifically NODE_ENV=test npm test -- rageapi/tests/RageInterceptor.test.js # Enhanced integration tests NODE_ENV=test npx jest rageapi/tests/RageInterceptor.enhanced.test.js --no-coverage

Performance Testing

# Test RAGE query performance npm run rage:query "performance test query" --debug --format verbose # Resilience testing  NODE_ENV=test npm test -- rageapi/tests/resilience.test.js

Development Tools

# Test queries interactively npm run rage:query "What is our company policy?" --debug # Validate configuration node -e "console.log(require('./rageapi/config').configManager.initialize())" # Check RAGE status node -e "console.log(require('./rageapi/config').configManager.getSummary())"

License

Part of LibreChat project. See main LICENSE file.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

RAGE API - Retrieval Augmented Generation Enhancement

Overview

Features

Quick Start

Prerequisites

Environment Variables

Installation

Architecture

Usage

Performance

Error Handling

Monitoring

Configuration

Configuration Reference

Feature Flags

Security

Troubleshooting

Common Issues

Debug Mode

Health Check

RAGE Query Tool

Module Structure

Implementation Status

Development

Testing

Performance Testing

Development Tools

License

FilesExpand file tree

rageapi

Directory actions

More options

Directory actions

More options

Latest commit

History

rageapi

Folders and files

parent directory

README.md

RAGE API - Retrieval Augmented Generation Enhancement

Overview

Features

Quick Start

Prerequisites

Environment Variables

Installation

Architecture

Usage

Performance

Error Handling

Monitoring

Configuration

Configuration Reference

Feature Flags

Security

Troubleshooting

Common Issues

Debug Mode

Health Check

RAGE Query Tool

Module Structure

Implementation Status

Development

Testing

Performance Testing

Development Tools

License