Add time-based caching (30-minute TTL) for tool schema token counts using the existing Keyv/Redis infrastructure. The cache is keyed by provider and a lightweight fingerprint (sorted tool names + count), so agents sharing the same tool set share the cached value.

A new utility module (`toolTokens.ts`) provides reusable functions:

- `getToolFingerprint`: stable fingerprint from tool names
- `computeToolSchemaTokens`: mirrors `AgentContext.calculateInstructionTokens`
- `getOrComputeToolTokens`: cache lookup with compute-on-miss

In `createRun`, `buildAgentContext` is now async with `Promise.all` for parallel cache lookups in multi-agent runs. Pre-computed tokens are passed via `AgentInputs.toolSchemaTokens`, skipping `calculateInstructionTokens` in `@librechat/agents` entirely on a cache hit.
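A minimal sketch of the fingerprint scheme described above (hypothetical implementation; the actual `getToolFingerprint` in `toolTokens.ts` may format the key differently):

```typescript
// Sketch of a stable tool fingerprint: sorted names plus count.
// The real getToolFingerprint in toolTokens.ts may use a different format.
function getToolFingerprint(toolNames: string[]): string {
  // Sorting makes the fingerprint order-insensitive, so agents that
  // register the same tools in a different order share a cache entry.
  const sorted = [...toolNames].sort();
  return `${sorted.join(',')}:${sorted.length}`;
}

// Both orderings produce the same fingerprint.
const a = getToolFingerprint(['web_search', 'calculator']);
const b = getToolFingerprint(['calculator', 'web_search']);
```

Because the fingerprint ignores tool ordering, any agents configured with the same tool set hit the same cache entry regardless of registration order.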
Pull request overview
This PR introduces caching for tool schema token counts in the agents run pipeline to avoid repeated expensive token counting across runs/agents, using the existing Keyv/Redis cache infrastructure.
Changes:
- Add a new cache namespace (`CacheKeys.TOOL_TOKENS`) with a 30-minute TTL.
- Introduce `packages/api/src/agents/toolTokens.ts` to fingerprint tools, compute tool-schema token counts, and read/write cached values.
- Make `buildAgentContext` async and resolve multi-agent inputs in parallel via `Promise.all`, passing `toolSchemaTokens` into `AgentInputs`.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| packages/data-provider/src/config.ts | Adds CacheKeys.TOOL_TOKENS constant for the new cache namespace. |
| packages/api/src/agents/toolTokens.ts | New utility for tool fingerprinting, token computation, and Keyv-backed caching. |
| packages/api/src/agents/run.ts | Computes (or fetches) toolSchemaTokens during agent context building; parallelizes context building. |
| api/cache/getLogStores.js | Registers a TOOL_TOKENS cache store in backend cache namespaces with 30-min TTL. |
```ts
const cacheKey = `${provider}:${fingerprint}`;
const cache = getCache();

const cached = (await cache.get(cacheKey)) as number | undefined;
if (cached != null && cached > 0) {
```
Cache key only includes {provider}:{fingerprint}, but the computed token count also depends on getToolTokenMultiplier() which can vary based on clientOptions.model (e.g., OpenRouter Claude vs OpenRouter non-Claude). This can cause incorrect cache hits across different models under the same provider. Include the effective multiplier (or an isAnthropic/model discriminator) in the cache key so cached values can’t be reused with a different multiplier.
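One way to address this (a sketch only; `getToolTokenMultiplier`'s signature and the illustrative multiplier values are assumptions based on the review comment, not the real helper):

```typescript
// Sketch: fold the effective multiplier into the cache key so a cached
// count computed for one model cannot be reused with a different multiplier.
// getToolTokenMultiplier here is a stand-in for the real helper.
function getToolTokenMultiplier(model?: string): number {
  return model?.includes('claude') ? 1.15 : 1.0; // illustrative values only
}

function buildToolTokensCacheKey(provider: string, fingerprint: string, model?: string): string {
  const multiplier = getToolTokenMultiplier(model);
  return `${provider}:${multiplier}:${fingerprint}`;
}

const claudeKey = buildToolTokensCacheKey('openrouter', 'fp', 'claude-3');
const otherKey = buildToolTokensCacheKey('openrouter', 'fp', 'gpt-4');
```

With the multiplier in the key, two models under the same provider can no longer collide on a stale token count.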
```ts
const cacheKey = `${provider}:${fingerprint}`;
const cache = getCache();

const cached = (await cache.get(cacheKey)) as number | undefined;
```
cache.get(cacheKey) is awaited without a try/catch. Since this cache is an optimization, a transient Keyv/Redis read failure should not fail the run; instead, fall back to computing tokens and optionally log at debug level. Wrap the read in a try/catch and treat read errors as cache misses.
Suggested change:

```ts
let cached: number | undefined;
try {
  cached = (await cache.get(cacheKey)) as number | undefined;
} catch {
  /** Swallow cache read errors and treat as a cache miss */
  cached = undefined;
}
```
```ts
const agentInput: AgentInputs = {
  provider,
  reasoningKey,
  toolDefinitions,
  toolSchemaTokens,
  agentId: agent.id,
```
This change adds toolSchemaTokens to AgentInputs, which requires the companion @librechat/agents update mentioned in the PR description. Since this PR doesn’t bump the @librechat/agents version / lockfile, it’s likely to break type-check/build until that dependency is updated. Consider updating the dependency version here (or gating usage) so the PR is self-contained.
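Until the dependency is bumped, usage could be gated roughly like this (a sketch with a simplified stand-in for the real `AgentInputs` type; the conditional spread keeps the field out of the object when it was not computed):

```typescript
// Sketch: only spread toolSchemaTokens into the inputs when it was computed,
// so older @librechat/agents typings that lack the field are unaffected.
// AgentInputs here is a simplified stand-in for the real type.
interface AgentInputs {
  provider: string;
  agentId: string;
  toolSchemaTokens?: number;
}

function buildAgentInput(provider: string, agentId: string, toolSchemaTokens?: number): AgentInputs {
  return {
    provider,
    agentId,
    ...(toolSchemaTokens != null ? { toolSchemaTokens } : {}),
  };
}

const withTokens = buildAgentInput('openai', 'agent-1', 1200);
const withoutTokens = buildAgentInput('openai', 'agent-1');
```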
```ts
export async function getOrComputeToolTokens({
  tools,
  toolDefinitions,
  provider,
  clientOptions,
  tokenCounter,
}: {
  tools?: GenericTool[];
  toolDefinitions?: LCTool[];
  provider: Providers;
  clientOptions?: ClientOptions;
  tokenCounter: TokenCounter;
}): Promise<number> {
```
New caching/token-counting logic is introduced here but there are no unit tests covering fingerprinting, multiplier selection, cache hit/miss behavior, or error fallback. Since packages/api/src/agents already has Jest coverage, add targeted tests for getToolFingerprint, computeToolSchemaTokens, and getOrComputeToolTokens (including the provider/model multiplier split and cache read/write failure scenarios).
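The cache hit/miss behavior such a test would cover can be sketched with plain assertions (the in-memory `Map` and compute counter below are stand-ins for Keyv and the real token counter; actual tests would import from `toolTokens.ts` and use the repo's Jest setup):

```typescript
// Sketch of the compute-on-miss contract: the expensive computation runs
// once, and repeated lookups for the same key are served from the cache.
let computeCalls = 0;

function computeTokens(): number {
  computeCalls += 1;
  return 420; // pretend token count
}

async function getOrCompute(cache: Map<string, number>, key: string): Promise<number> {
  const cached = cache.get(key);
  if (cached != null && cached > 0) {
    return cached; // cache hit: no compute
  }
  const value = computeTokens();
  cache.set(key, value);
  return value;
}
```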
```ts
  CacheKeys.ADMIN_OAUTH_EXCHANGE,
  Time.THIRTY_SECONDS,
),
[CacheKeys.TOOL_TOKENS]: standardCache(CacheKeys.TOOL_TOKENS, Time.THIRTY_MINUTES),
```
This adds a TOOL_TOKENS store to getLogStores, but the new tool-token caching logic in packages/api/src/agents/toolTokens.ts instantiates its own standardCache(CacheKeys.TOOL_TOKENS, ...) instance and nothing in api/ currently calls getLogStores(CacheKeys.TOOL_TOKENS). In non-Redis mode this means the periodic in-memory TTL cleanup here won’t touch the cache instance actually used for tool token caching. Consider either removing this entry if unused, or centralizing TOOL_TOKENS cache construction so both paths share the same Keyv instance when USE_REDIS=false.
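Centralizing construction could look roughly like this (a sketch; the registry name and `SimpleCache` shape are hypothetical, not the project's actual `getLogStores` API):

```typescript
// Sketch: a module-level registry so every caller of a namespace shares
// one cache instance, keeping TTL cleanup and reads/writes aligned.
interface SimpleCache {
  get(key: string): number | undefined;
  set(key: string, value: number): void;
}

const stores = new Map<string, SimpleCache>();

function getStore(namespace: string): SimpleCache {
  let store = stores.get(namespace);
  if (store == null) {
    const backing = new Map<string, number>();
    store = {
      get: (key) => backing.get(key),
      set: (key, value) => {
        backing.set(key, value);
      },
    };
    stores.set(namespace, store);
  }
  return store;
}
```

Routing both `toolTokens.ts` and `getLogStores` through one accessor like this would guarantee they operate on the same Keyv instance when `USE_REDIS=false`.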
Summary
- Cache key is `{provider}:{fingerprint}` where fingerprint = sorted tool names + count, so agents sharing the same tools share the cached value
- New utility module (`packages/api/src/agents/toolTokens.ts`) with `getToolFingerprint`, `computeToolSchemaTokens`, and `getOrComputeToolTokens`
- `buildAgentContext` in `createRun` is now async with `Promise.all` for parallel cache lookups in multi-agent runs

Companion PR

Requires danny-avila/agents@8e0ff93 (`@librechat/agents` changes: `toolSchemaTokens` on `AgentInputs`, exported multiplier constants, `fromConfig()` short-circuit)

Test plan