
feat: cache tool schema token counts #12382

Draft

danny-avila wants to merge 1 commit into dev from claude/wizardly-lichterman

Conversation

@danny-avila
Owner

Summary

  • Adds time-based caching (30min TTL) for tool schema token counts using the existing Keyv/Redis infrastructure, avoiding expensive recalculation on every agent run
  • Cache is keyed by {provider}:{fingerprint} where fingerprint = sorted tool names + count, so agents sharing the same tools share the cached value
  • New reusable utility module (packages/api/src/agents/toolTokens.ts) with getToolFingerprint, computeToolSchemaTokens, and getOrComputeToolTokens
  • buildAgentContext in createRun is now async with Promise.all for parallel cache lookups in multi-agent runs

Companion PR

Requires danny-avila/agents@8e0ff93 (@librechat/agents changes: toolSchemaTokens on AgentInputs, exported multiplier constants, fromConfig() short-circuit)

Test plan

  • Verify first agent run computes + caches tool tokens, second run hits cache
  • Verify tool set change (add/remove tool) causes cache miss and recomputation
  • Verify two agents sharing the same tools share the cached entry
  • Verify Anthropic and non-Anthropic providers cache independently (different multipliers)
  • Existing tests pass with no regressions
Add time-based caching (30min TTL) for tool schema token counts using the existing Keyv/Redis infrastructure. The cache is keyed by provider and a lightweight fingerprint (sorted tool names + count), so agents sharing the same tool set share the cached value.

The new utility module (toolTokens.ts) provides reusable functions:

  • getToolFingerprint: stable fingerprint from tool names
  • computeToolSchemaTokens: mirrors AgentContext.calculateInstructionTokens
  • getOrComputeToolTokens: cache lookup with compute-on-miss

In createRun, buildAgentContext is now async with Promise.all for parallel cache lookups in multi-agent runs. Pre-computed tokens are passed via AgentInputs.toolSchemaTokens, skipping calculateInstructionTokens in @librechat/agents entirely on a cache hit.
Copilot AI review requested due to automatic review settings March 24, 2026 17:36
Contributor

Copilot AI left a comment


Pull request overview

This PR introduces caching for tool schema token counts in the agents run pipeline to avoid repeated expensive token counting across runs/agents, using the existing Keyv/Redis cache infrastructure.

Changes:

  • Add a new cache namespace (CacheKeys.TOOL_TOKENS) with a 30-minute TTL.
  • Introduce packages/api/src/agents/toolTokens.ts to fingerprint tools, compute tool-schema token counts, and read/write cached values.
  • Make buildAgentContext async and resolve multi-agent inputs in parallel via Promise.all, passing toolSchemaTokens into AgentInputs.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
packages/data-provider/src/config.ts Adds CacheKeys.TOOL_TOKENS constant for the new cache namespace.
packages/api/src/agents/toolTokens.ts New utility for tool fingerprinting, token computation, and Keyv-backed caching.
packages/api/src/agents/run.ts Computes (or fetches) toolSchemaTokens during agent context building; parallelizes context building.
api/cache/getLogStores.js Registers a TOOL_TOKENS cache store in backend cache namespaces with 30-min TTL.


Comment on lines +145 to +149

```typescript
const cacheKey = `${provider}:${fingerprint}`;
const cache = getCache();

const cached = (await cache.get(cacheKey)) as number | undefined;
if (cached != null && cached > 0) {
```

Copilot AI Mar 24, 2026


Cache key only includes {provider}:{fingerprint}, but the computed token count also depends on getToolTokenMultiplier() which can vary based on clientOptions.model (e.g., OpenRouter Claude vs OpenRouter non-Claude). This can cause incorrect cache hits across different models under the same provider. Include the effective multiplier (or an isAnthropic/model discriminator) in the cache key so cached values can’t be reused with a different multiplier.

```typescript
const cacheKey = `${provider}:${fingerprint}`;
const cache = getCache();

const cached = (await cache.get(cacheKey)) as number | undefined;
```

Copilot AI Mar 24, 2026


cache.get(cacheKey) is awaited without a try/catch. Since this cache is an optimization, a transient Keyv/Redis read failure should not fail the run; instead, fall back to computing tokens and optionally log at debug level. Wrap the read in a try/catch and treat read errors as cache misses.

Suggested change

```diff
-const cached = (await cache.get(cacheKey)) as number | undefined;
+let cached: number | undefined;
+try {
+  cached = (await cache.get(cacheKey)) as number | undefined;
+} catch {
+  /** Swallow cache read errors and treat as a cache miss */
+  cached = undefined;
+}
```
Comment on lines 398 to 403

```typescript
const agentInput: AgentInputs = {
  provider,
  reasoningKey,
  toolDefinitions,
  toolSchemaTokens,
  agentId: agent.id,
```

Copilot AI Mar 24, 2026


This change adds toolSchemaTokens to AgentInputs, which requires the companion @librechat/agents update mentioned in the PR description. Since this PR doesn’t bump the @librechat/agents version / lockfile, it’s likely to break type-check/build until that dependency is updated. Consider updating the dependency version here (or gating usage) so the PR is self-contained.

Comment on lines +127 to +139

```typescript
export async function getOrComputeToolTokens({
  tools,
  toolDefinitions,
  provider,
  clientOptions,
  tokenCounter,
}: {
  tools?: GenericTool[];
  toolDefinitions?: LCTool[];
  provider: Providers;
  clientOptions?: ClientOptions;
  tokenCounter: TokenCounter;
}): Promise<number> {
```

Copilot AI Mar 24, 2026


New caching/token-counting logic is introduced here but there are no unit tests covering fingerprinting, multiplier selection, cache hit/miss behavior, or error fallback. Since packages/api/src/agents already has Jest coverage, add targeted tests for getToolFingerprint, computeToolSchemaTokens, and getOrComputeToolTokens (including the provider/model multiplier split and cache read/write failure scenarios).

```typescript
    CacheKeys.ADMIN_OAUTH_EXCHANGE,
    Time.THIRTY_SECONDS,
  ),
  [CacheKeys.TOOL_TOKENS]: standardCache(CacheKeys.TOOL_TOKENS, Time.THIRTY_MINUTES),
```

Copilot AI Mar 24, 2026


This adds a TOOL_TOKENS store to getLogStores, but the new tool-token caching logic in packages/api/src/agents/toolTokens.ts instantiates its own standardCache(CacheKeys.TOOL_TOKENS, ...) instance and nothing in api/ currently calls getLogStores(CacheKeys.TOOL_TOKENS). In non-Redis mode this means the periodic in-memory TTL cleanup here won’t touch the cache instance actually used for tool token caching. Consider either removing this entry if unused, or centralizing TOOL_TOKENS cache construction so both paths share the same Keyv instance when USE_REDIS=false.

Suggested change

```diff
-[CacheKeys.TOOL_TOKENS]: standardCache(CacheKeys.TOOL_TOKENS, Time.THIRTY_MINUTES),
```
@danny-avila danny-avila marked this pull request as draft March 24, 2026 17:44
