GitHub - pinexai/tokenspy: cProfile for LLMs — find which function burns your AI budget. Flame graphs, tracing, evals, prompt versioning. pip install tokenspy

Docs · Tracing · Evals · Dashboard · Issues

pip install tokenspy

🔥 The Problem

You get an OpenAI invoice for $800 this month. You have no idea which function caused it.

Langfuse and Braintrust require you to sign up, configure API keys, and reroute traffic through their cloud proxy just to see what's happening. tokenspy intercepts in-process — one decorator, no proxy, no account, no monthly fee.

⚡ Fix It in One Line

import tokenspy

@tokenspy.profile
def run_pipeline(query):
    docs = fetch_and_summarize(query)    # ← costs $0.038?
    entities = extract_entities(docs)    # ← or this?
    return generate_report(entities)     # ← or this?

run_pipeline("Analyze Q3 earnings")
tokenspy.report()

╔══════════════════════════════════════════════════════════════════════╗
║  tokenspy cost report                                                ║
║  total: $0.0523  ·  18,734 tokens  ·  3 calls                        ║
╠══════════════════════════════════════════════════════════════════════╣
║  fetch_and_summarize   $0.038  ████████████░░░░  73%                 ║
║    └─ gpt-4o           $0.038  ████████████░░░░  73%                 ║
║  generate_report       $0.011  ████░░░░░░░░░░░░  21%                 ║
║  extract_entities      $0.003  █░░░░░░░░░░░░░░░   6%                 ║
╠══════════════════════════════════════════════════════════════════════╣
║  🔴 fetch_and_summarize → switch to gpt-4o-mini: 94% cheaper         ║
╚══════════════════════════════════════════════════════════════════════╝

Now you know: fetch_and_summarize is burning 73% of your budget.
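The percentages in the report are each function's share of total cost, and a figure like "94% cheaper" follows from the ratio of two models' per-token prices. The sketch below reproduces both. The per-function costs are copied from the sample report; the per-million-token prices are assumed values for illustration and may not match current pricing or whatever price table tokenspy ships with.

```python
# Per-function costs as shown in the sample report (USD).
costs = {
    "fetch_and_summarize": 0.038,
    "generate_report": 0.011,
    "extract_entities": 0.003,
}

total = sum(costs.values())
shares = {name: round(100 * usd / total) for name, usd in costs.items()}

# Assumed input prices in USD per 1M tokens (illustrative; check current pricing).
ASSUMED_PRICE = {"gpt-4o": 2.50, "gpt-4o-mini": 0.15}
savings_pct = round(
    100 * (1 - ASSUMED_PRICE["gpt-4o-mini"] / ASSUMED_PRICE["gpt-4o"])
)

for name, pct in shares.items():
    print(f"{name:<22}{pct:>3}%")
print(f"switching models saves about {savings_pct}%")
```

With these inputs the shares come out to 73%, 21%, and 6%, and the assumed prices give a saving of about 94%, matching the report.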

✨ Full Observability Stack — v0.2.0

Everything Langfuse and Braintrust do, without sending a single byte to the cloud.

Feature	v0.1	v0.2.0
Cost flame graph	✅	✅
Budget alerts	✅	✅
SQLite persistence	✅	✅
Structured tracing (Trace + Span)	❌	✅
OpenTelemetry export	❌	✅
Evaluations + datasets	❌	✅
Prompt versioning	❌	✅
Live web dashboard	❌	✅

🖥️ Live Dashboard

pip install tokenspy[server]
tokenspy serve    # → http://localhost:7234

5 tabs: Overview · Traces · Evaluations · Prompts · Settings — all your LLM data, local, real-time.

🚀 Quick Start

Tracing

tokenspy.init(persist=True)

with tokenspy.trace("pipeline", input={"query": q}) as t:
    with tokenspy.span("retrieve") as s:
        docs = fetch(q)
        s.update(output=docs)
    with tokenspy.span("generate", span_type="llm") as s:
        answer = llm_call(docs)    # ← auto-linked to span
    t.update(output=answer)
    t.score("quality", 0.9)

Evaluations

import tokenspy
from tokenspy.eval import scorers

ds = tokenspy.dataset("qa-golden")
ds.add(input={"q": "Capital of France?"}, expected_output="Paris")

exp = tokenspy.experiment(
    "gpt4o-mini-v1",
    dataset="qa-golden",
    fn=my_fn,
    scorers=[scorers.exact_match],
)
exp.run().summary()
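Conceptually, an experiment runs a function over every dataset row and aggregates a scorer's output. A rough sketch of that loop, independent of tokenspy, is shown here; `exact_match`, `run_experiment`, and the toy `answer` function are hypothetical stand-ins, not the library's own `scorers.exact_match` or `experiment`, whose matching rules may differ.

```python
def exact_match(output, expected):
    """Score 1.0 when the strings match after trimming and lowercasing."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_experiment(dataset, fn, scorer):
    """Apply fn to each row's input and return the mean score."""
    scores = [scorer(fn(row["input"]), row["expected_output"]) for row in dataset]
    return sum(scores) / len(scores)


dataset = [
    {"input": {"q": "Capital of France?"}, "expected_output": "Paris"},
    {"input": {"q": "Capital of Japan?"}, "expected_output": "Tokyo"},
]


def answer(inp):
    # Toy "model" that knows a single answer.
    return {"Capital of France?": "paris "}.get(inp["q"], "unknown")


print(run_experiment(dataset, answer, exact_match))  # 0.5
```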

Prompt versioning

p = tokenspy.prompts.push("summarizer", "Summarize in {{style}}: {{text}}")
p.compile(style="concise", text="...")
tokenspy.prompts.set_production("summarizer", version=2)
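The `{{style}}` placeholders suggest mustache-style substitution. As an illustration of what a compile step might do (an assumption about behavior, not tokenspy's code), the hypothetical `compile_prompt` helper below fills `{{name}}` slots with a regular expression and fails loudly when a variable is missing.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
_SLOT = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def compile_prompt(template, **variables):
    """Replace each {{name}} slot; raise KeyError if a value is missing."""

    def fill(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing template variable: {name}")
        return str(variables[name])

    return _SLOT.sub(fill, template)


template = "Summarize in {{style}}: {{text}}"
print(compile_prompt(template, style="concise", text="Q3 revenue rose."))
# Summarize in concise: Q3 revenue rose.
```

Raising on a missing variable, rather than leaving the slot untouched, keeps a half-filled prompt from silently reaching the model.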

Budget alerts

@tokenspy.profile(budget_usd=0.10, on_exceeded="raise")
def strict_agent(query):
    ...    # raises BudgetExceededError if cost > $0.10

🆚 tokenspy vs Langfuse vs Braintrust

Feature	Langfuse	Braintrust	tokenspy
Requires cloud proxy	✅ yes	✅ yes	❌ no
Requires signup	✅ yes	✅ yes	❌ no
Data leaves your machine	✅ yes	✅ yes	❌ never
Works offline	❌ no	❌ no	✅ yes
Zero core dependencies	❌ no	❌ no	✅ yes
Structured tracing	✅ yes	✅ yes	✅ yes
Evaluations + datasets	✅ yes	✅ yes	✅ yes
LLM-as-judge scoring	✅ yes	✅ yes	✅ yes
Prompt versioning	✅ yes	✅ yes	✅ yes
OpenTelemetry export	⚡ partial	❌ no	✅ yes
Flame graph by function	❌ no	❌ no	✅ yes
`@decorator` API	❌ no	❌ no	✅ yes
Budget alerts	⚡ partial	⚡ partial	✅ yes
Git commit cost tracking	❌ no	❌ no	✅ yes
GitHub Actions cost diff	❌ no	❌ no	✅ yes
Monthly cost	$0–$250	$0–$300	free forever

🔌 Integrations

Provider	Auto-instrumented
OpenAI	`chat.completions.create` (sync + async + streaming)
Anthropic	`messages.create` (sync + async + streaming)
Google Gemini	`generate_content`
LangChain / LangGraph	Callback handler

Exports: OpenTelemetry → Grafana, Jaeger, Datadog, Honeycomb (docs)

CI: GitHub Actions cost diff per PR (docs)

📦 Install

pip install tokenspy            # core (zero dependencies)
pip install tokenspy[otel]      # + OpenTelemetry export
pip install tokenspy[server]    # + web dashboard (fastapi + uvicorn)
pip install tokenspy[all]       # openai + anthropic + langchain

📚 Documentation

→ Full documentation at pinakimishra95.github.io/tokenspy

Guide	Description
Tracing	Trace + Span context managers, auto LLM linking, scores
Evaluations & Datasets	Datasets, scorers, llm_judge, experiment comparison
Prompt Versioning	push / pull / compile / set_production
Web Dashboard	Local dashboard, REST API
OpenTelemetry	OTEL export to Grafana, Jaeger, Datadog
GitHub Actions	Cost diff annotations per PR

🤝 Contributing

git clone https://github.com/pinakimishra95/tokenspy
cd tokenspy && pip install -e ".[dev]"
pytest tests/    # 139 tests, ~0.3s

Issues and PRs welcome — especially new provider support and pricing updates.

License

Everything Langfuse and Braintrust do. Zero cloud. Zero signup. Zero cost.

GitHub · PyPI · Docs · Issues

