AI Analyst v2

An AI product analyst built on Claude Code. You ask a business question, it runs a pipeline of 18 agents that frame the question, explore your data, find the root cause, build a narrative, and hand you a validated slide deck with speaker notes. Minutes, not weeks.

18 specialized agents | 39 auto-applied skills | 20 slash commands | DAG-based parallel execution | PDF + HTML export

Before You Start

This is a tool for analysts, not a replacement for them. It handles about 80% of what a human analyst does. The 80% that takes all the time. But it only works if you're the expert.

You are the eval. Run this on data you know like the back of your hand. Run it on the reports you were already going to run this week. When it picks the wrong column or misinterprets a metric, you'll catch it immediately because you've written that query before. You correct it, it saves the correction, and it doesn't make that mistake again. That's the whole loop. Look, know, correct, move on.

Don't hand this to someone who can't validate the output. Don't run it on data you've never seen. The analyses it produces need your judgment before they go anywhere near a stakeholder. If you skip the validation, you'll get confident-sounding numbers that might be wrong. If you do the validation, you'll move faster than you ever have.

The byproduct of building this is the work itself. You're not taking time off from your job to set up an AI tool. You're doing your actual work through it. The first analysis takes a bit longer because you're connecting data and teaching it your context. By the third one, you're faster than doing it by hand. By next week, you're doing 15 analyses instead of 5.

This doesn't work out of the box. It's a starting point, not a finished product. The model capability is there with Opus 4.6, but you need to teach it your data, your metrics, your business context. Correct it when it's wrong. Grow it into something that works for your specific use case, or tear it apart and rebuild it how you want. The agents, skills, and pipeline are all markdown files you can read and modify. Nothing is hidden.

Bring your own data. No bundled datasets. Connect your CSVs, local databases, or cloud warehouse with /connect-data and start analyzing.

What's New in V2

V2 is a ground-up rebuild of the intelligence layer. The pipeline and agents from V1 still work the same way — you won't notice a difference in how you use it. What changed is everything underneath.

Area	V1	V2
Data	Bundled NovaMart e-commerce dataset	Bring your own — CSV, DuckDB, Postgres, BigQuery, Snowflake
Onboarding	Manual setup, read the docs	`/setup` interview learns your role, data, and business context
Memory	Stateless across sessions	Knowledge system persists corrections, learnings, query patterns, business glossary
Self-learning	None	Captures feedback, logs corrections, retrieves proven SQL patterns — never repeats the same mistake
Theming	Hardcoded chart style	YAML-based theme system with brand colors, WCAG-compliant palettes
Business context	None	Organization knowledge base — glossary, metrics, products, teams. Notion ingest.
Pipeline	Single run, restart on failure	Run tracking (`/runs`), reliable resume, comms drafter for Slack/email output
Testing	Minimal	606 tests with synthetic fixtures, no external data dependencies
Dataset coupling	NovaMart table names hardcoded in agents	Fully dataset-agnostic — agents resolve from active manifest and schema

Don't Know What to Do? Just Ask.

Claude knows the entire system — every agent, skill, command, and dataset. If you're stuck, ask it:

What can I do with this data?
What should I run to refresh the deck?
How do I connect my own CSV files?
Which agents handle root cause analysis?
Re-run just the chart maker and deck creator.

Claude will tell you the exact command. You don't need to memorize anything in this README. Think of it as a reference — Claude is the guide.

Quick Start

1. Install Claude Code (requires a Claude Pro subscription)

npm install -g @anthropic-ai/claude-code

2. Clone and set up

git clone https://github.com/ai-analyst-lab/ai-analyst.git
cd ai-analyst
pip install -e ".[dev]"

3. Start Claude Code

claude

4. Connect your data and go

/connect-data

Or skip the wizard and just ask a question with your data in a directory:

/run-pipeline data_path=data/my_csvs/ question="Why is conversion dropping?"

For full setup details: docs/setup-guide.md

Five Things You Can Do

1. Ask a quick question

What's our conversion rate by device?

Claude queries the data and returns an answer with a chart. Simple questions get answered in under 2 minutes without running the full pipeline.

2. Run a full analysis

/run-pipeline data_path=data/your_dataset/ question="What's driving the decline in conversion?"

The pipeline runs 18 agents across 4 phases: Frame the question, Analyze the data, Build the story, Create the deck. You get a validated analysis, branded charts, a narrative, and a slide deck with speaker notes. Exports to PDF and HTML.

3. Explore a dataset

/explore

Interactive data browsing without committing to a full analysis. Preview tables, check distributions, spot patterns, form hypotheses. Use /data users to inspect a specific table's schema.

4. Connect your own data

/connect-data

Guided wizard that walks you through connecting CSV files, local DuckDB, Postgres, BigQuery, or Snowflake. Auto-profiles your data, creates schema docs, and remembers your dataset context across sessions.

5. Make a single chart

Make a funnel chart of the checkout flow, highlighting the biggest drop-off step.

Claude generates a chart following Storytelling with Data methodology: warm off-white background, decluttered axes, action title, direct labels instead of legends.

How It Works: The Pipeline

When you run /run-pipeline, Claude orchestrates 18 agents across 4 phases:

1. FRAME:    Question Framing > Hypothesis Generation
     |
     v
2. ANALYZE:  Data Explorer > Source Tie-Out > Descriptive Analytics > Root Cause Investigator > Validation > Opportunity Sizer
     |
     v
3. STORY:    Story Architect > Coherence Reviewer > Chart Maker > Design Critic
     |
     v
4. DECK:     Storytelling > Deck Creator > Slide Review > Close the Loop

Phase 1 — Frame: Structures your business question into analytical questions with testable hypotheses. Checkpoint: review the framing before analysis begins.

Phase 2 — Analyze: Explores the data, verifies loading integrity, runs segmentation/funnel/drivers analysis, drills down to root cause, validates findings, and sizes the opportunity. Checkpoint: automated quality gate.

Phase 3 — Story: Designs a storyboard (Context-Tension-Resolution arc), generates charts with collision detection, and reviews visual quality against a 16-point checklist.

Phase 4 — Deck: Writes a stakeholder narrative, builds a branded Marp slide deck with HTML components, reviews slide design, and ensures every recommendation has a follow-up plan. Exports to PDF and HTML.

You don't have to run the whole thing. Five execution plans let you run just the part you need:

Plan	Use When	What Runs
`full_presentation`	Complete analysis to slide deck	All 18 agents
`deep_dive`	Analysis without presentation	Phases 1-2 only
`quick_chart`	Just need one chart	Chart Maker + Design Critic
`refresh_deck`	Re-do the presentation layer	Phases 3-4 (reuses analysis)
`validate_only`	Check existing work	Validation + Source Tie-Out

/run-pipeline data_path=data/your_dataset/ question="..." plan=deep_dive

If the pipeline gets interrupted, resume where you left off:

/resume-pipeline

Preview what would run without executing:

/run-pipeline data_path=data/your_dataset/ question="..." dry-run=true

How It Works: The DAG Engine

The pipeline doesn't run agents one at a time. It resolves dependencies automatically and runs independent agents in parallel:

Tier 0 (parallel)          Question Framing -----> Hypothesis
                           Data Explorer --------> Source Tie-Out
                                   |
Tier 2 (parallel)          Descriptive Analytics / Overtime Trend / Cohort Analysis
                                   |
Tier 3 (sequential)        Root Cause --> Validation --> Opportunity Sizer
                                   |
Tier 4 (sequential)        Story Architect --> Coherence Review
                                   |
Tier 5 (parallel fan-out)  Chart Maker (per beat) --> Design Critic
                                   |
Tier 6 (sequential)        Storytelling --> Deck Creator --> Slide Review --> Close the Loop

Parallel execution: Agents in the same tier run concurrently (up to 3 at once). Tier 0 starts Question Framing and Data Explorer simultaneously.
Automatic dependency resolution: The engine reads agents/registry.yaml and computes execution tiers using topological sort.
Circuit breaker: If 3 agents fail in the same tier, the pipeline halts with a diagnostic report.
Timeouts: Each agent gets 5 minutes, with one retry on timeout. If a critical agent (source tie-out, validation) still fails, the pipeline halts; non-critical agents (design critic) degrade gracefully.
Checkpoints: Quality gates between phases. Two are automated (analysis verification, final deck lint). Two are user-facing (frame review, storyboard review). Say "just do it" to skip the user-facing ones.
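
As a rough illustration of the dependency resolution described above, here is a minimal sketch of computing execution tiers with a topological sort (Kahn's algorithm). The `DEPENDENCIES` map, `compute_tiers()`, and `batches()` are hypothetical names made up for this example. The real dependencies live in agents/registry.yaml, whose schema is not shown in this README, so treat this as a sketch under those assumptions rather than the engine's actual code.

```python
# Hypothetical dependency map: agent -> agents it depends on.
# Illustrative only; the real graph is defined in agents/registry.yaml.
DEPENDENCIES = {
    "question-framing": [],
    "data-explorer": [],
    "hypothesis": ["question-framing"],
    "source-tieout": ["data-explorer"],
    "descriptive-analytics": ["hypothesis", "source-tieout"],
    "root-cause-investigator": ["descriptive-analytics"],
}

MAX_PARALLEL = 3  # "up to 3 at once", per the bullet above


def compute_tiers(deps):
    """Group agents into tiers using Kahn's algorithm (topological sort)."""
    remaining = {agent: set(parents) for agent, parents in deps.items()}
    tiers = []
    while remaining:
        # An agent is ready once all of its dependencies have been scheduled.
        ready = sorted(a for a, parents in remaining.items() if not parents)
        if not ready:
            # Nothing can run: a cycle, or a dependency that is not registered.
            raise ValueError(f"unresolvable dependencies among: {sorted(remaining)}")
        tiers.append(ready)
        for agent in ready:
            del remaining[agent]
        for parents in remaining.values():
            parents.difference_update(ready)
    return tiers


def batches(tier, size=MAX_PARALLEL):
    """Split one tier into batches that respect the concurrency cap."""
    return [tier[i:i + size] for i in range(0, len(tier), size)]


if __name__ == "__main__":
    for number, tier in enumerate(compute_tiers(DEPENDENCIES)):
        print(f"Tier {number}: {batches(tier)}")
```

Agents land in the earliest tier where all of their dependencies are already scheduled, which is what lets independent agents share a tier and run side by side.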

All Commands

Command	What It Does	Example
`/run-pipeline`	Full analysis to slide deck	`/run-pipeline data_path=data/your_dataset/ question="Why is conversion dropping?"`
`/resume-pipeline`	Resume interrupted pipeline	`/resume-pipeline`
`/explore`	Interactive data exploration	`/explore events`
`/data`	Show active dataset schema	`/data users`
`/datasets`	List all connected datasets	`/datasets`
`/switch-dataset`	Change the active dataset	`/switch-dataset my_dataset`
`/connect-data`	Add a new data source	`/connect-data`
`/setup`	Interactive onboarding interview	`/setup`
`/metrics`	Browse the metric dictionary	`/metrics conversion_rate`
`/history`	View past analyses	`/history`
`/patterns`	View recurring patterns	`/patterns --global`
`/export`	Export results in various formats	`/export slides` or `/export email` or `/export slack`
`/forecast`	Generate a time-series forecast	`/forecast`
`/runs`	List, inspect, compare pipeline runs	`/runs`
`/business`	Browse organization knowledge	`/business glossary`
`/log-correction`	Log a data or methodology correction	`/log-correction`
`/architect`	Multi-persona planning methodology	`/architect`
`/notion-ingest`	Import business context from Notion	`/notion-ingest`
`/compare-datasets`	Compare metrics across datasets	`/compare-datasets`
`/setup-dev-context`	Add codebase context for dev teams	`/setup-dev-context`

Or just ask in plain English. "Show me conversion by device" works as well as any command.

Charts and Visualization

Every chart follows the Storytelling with Data methodology:

Your Data --> chart_helpers.py --> Base Chart (150 DPI)
                                        |
                                   Collision Check (3 fix strategies)
                                        |
                                   Marp Deck (HTML components)
                                        |
                                   marp_linter.py (8 check categories)
                                        |
                                   marp_export.py --> PDF + HTML

What happens automatically:

swd_style() applies warm off-white background (#F7F6F2), removes chart clutter (gridlines, borders, redundant legends), sets consistent typography
Every chart gets an action title (takeaway statement, not a label) and a subtitle (data source, time range)
Direct labels replace legends wherever possible
Collision detection checks for overlapping text with 3 auto-fix strategies: offset the label, reduce font size, or drop the least important label. Charts with unresolved collisions halt the pipeline.
The deck uses branded HTML components: KPI cards, finding cards, recommendation rows, so-what callouts, before/after panels, timelines, and more
A lint gate validates every deck before export: checks frontmatter completeness, HTML component usage (minimum 3 types), valid slide classes, slide count, and pacing
YAML-based theming with brand color overrides and WCAG-compliant palettes (see docs/theming.md)
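
To make the collision-handling idea concrete, here is a small sketch of how overlapping labels could be detected and then fixed with the three strategies listed above, tried in order. Everything here is an assumption for illustration: the `Label` class, `overlaps()`, and `resolve_pair()` are hypothetical and are not the API of `check_label_collisions()` in chart_helpers.py. Labels are modelled as plain bounding boxes so the example runs without matplotlib.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Label:
    text: str
    x: float        # left edge of the bounding box
    y: float        # bottom edge of the bounding box
    width: float
    height: float
    priority: int   # higher means more important


def overlaps(a, b):
    """Axis-aligned boxes collide when they intersect on both axes."""
    return (a.x < b.x + b.width and b.x < a.x + a.width
            and a.y < b.y + b.height and b.y < a.y + a.height)


def resolve_pair(a, b, max_nudges=3, shrink=0.8):
    """Try the three strategies in order; return the labels that survive."""
    low, high = (a, b) if a.priority <= b.priority else (b, a)
    # Strategy 1: offset the less important label upward, one height at a time.
    moved = low
    for _ in range(max_nudges):
        moved = replace(moved, y=moved.y + moved.height)
        if not overlaps(moved, high):
            return [high, moved]
    # Strategy 2: reduce the font size (modelled here as shrinking the box).
    smaller = replace(low, width=low.width * shrink, height=low.height * shrink)
    if not overlaps(smaller, high):
        return [high, smaller]
    # Strategy 3: drop the least important label.
    return [high]


if __name__ == "__main__":
    a = Label("Mobile", 0, 0, 10, 2, priority=1)
    b = Label("Desktop", 5, 1, 10, 2, priority=2)
    print(resolve_pair(a, b))
```

Trying the least destructive fix first keeps as much information on the chart as possible; dropping a label is the last resort.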

Your Data

This repo ships clean — no bundled datasets. Connect your own data and the system builds context around it.

Connect your own

Run /connect-data for a guided setup wizard, or /setup for a full onboarding interview. Supported sources:

CSV files — drop them in a directory, point Claude at it
DuckDB — local or MotherDuck
Postgres — any Postgres-compatible database
BigQuery — Google BigQuery with service account
Snowflake — Snowflake with user/password or key pair

The system auto-profiles your data, creates schema documentation, notes data quirks, and remembers context across sessions in .knowledge/datasets/.

Example datasets

Curated public datasets with README guides are available in data/examples/.

Fallback chain

If your primary connection fails, the system falls back automatically:

Primary connection (e.g., MotherDuck via MCP)
Local DuckDB (from manifest.local_data.duckdb)
CSV files via pandas (from manifest.local_data.path)

You're always told which source is active.
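
The fallback order above can be pictured as trying a list of connectors until one works. The sketch below assumes each source is represented by a zero-argument callable that either returns a handle or raises; `first_working_source()` and the connector names are hypothetical and do not mirror `data_helpers.py` or `connection_manager.py`.

```python
def first_working_source(sources):
    """Return (name, handle) for the first source whose connector succeeds.

    `sources` is an ordered list of (name, connect) pairs, where `connect`
    is a zero-argument callable that returns a handle or raises.
    """
    failures = []
    for name, connect in sources:
        try:
            handle = connect()
        except Exception as exc:  # sketch only: real code would catch narrower errors
            failures.append(f"{name}: {exc}")
            continue
        print(f"Active source: {name}")  # always report which source is in use
        return name, handle
    raise ConnectionError("all sources failed: " + "; ".join(failures))


if __name__ == "__main__":
    def primary():
        raise OSError("warehouse unreachable")

    def local_duckdb():
        return "duckdb-handle"  # stand-in for a real connection object

    def csv_files():
        return "pandas-handle"  # stand-in for loaded DataFrames

    first_working_source([
        ("primary", primary),
        ("local_duckdb", local_duckdb),
        ("csv", csv_files),
    ])
```

Keeping the failure messages around means that when every source fails, the error explains why each step of the chain was skipped.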

What Just Happened? (Output Guide)

After running a pipeline, here's what you'll find:

outputs/
  question_brief_YYYY-MM-DD.md            # Your question, structured
  hypothesis_doc_YYYY-MM-DD.md            # Testable hypotheses
  data_inventory_YYYY-MM-DD.md            # What data exists
  analysis_report_YYYY-MM-DD.md           # Full analysis with findings
  validation_<dataset>_YYYY-MM-DD.md      # Independent validation of findings
  narrative_<dataset>_YYYY-MM-DD.md       # Stakeholder-ready story
  deck_<dataset>_YYYY-MM-DD.marp.md       # Slide deck (Marp source)
  deck_<dataset>_YYYY-MM-DD.pdf           # PDF export
  deck_<dataset>_YYYY-MM-DD.html          # HTML export (self-contained)
  close_the_loop_YYYY-MM-DD.md            # Follow-up plan for recommendations
  charts/                                 # All generated charts
working/                                  # Intermediate files (safe to delete)
  pipeline_state.json                     # Pipeline progress (for /resume-pipeline)
  pipeline_metrics.json                   # Execution timing and parallel efficiency
  storyboard_<dataset>.md                 # Story beats + visual mapping
  design_review_<dataset>.md              # Chart quality review (16-point checklist)
  investigation_<dataset>.md              # Root cause drill-down log
  sizing_*.md                             # Opportunity sizing with sensitivity analysis

outputs/ contains your deliverables. working/ contains intermediate artifacts that support resumability and debugging.

Customization

Want to...	Do this
Change how Claude thinks	Edit `CLAUDE.md` (the AI's persona, rules, workflow)
Add a new skill	Create `.claude/skills/my-skill/skill.md`, reference it in `CLAUDE.md`
Add a new agent	Create `agents/my-agent.md` using `agents/CONTRACT_TEMPLATE.md` as a starting point
Change the slide theme	Create a YAML theme in `themes/brands/` (see docs/theming.md)
Add deck components	Edit `templates/marp_components.md` (snippet library)
Modify the pipeline	Edit `.claude/skills/run-pipeline/skill.md` (rules, checkpoints, execution)
Add to the agent DAG	Edit `agents/registry.yaml` (dependencies, execution order)

All 18 Agents

Agents are markdown prompt templates in the agents/ directory. Each defines a multi-step workflow with {{VARIABLES}} that get filled in at runtime. To invoke one, ask Claude to run it or use /run-pipeline to orchestrate all of them.
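
For a sense of what "filled in at runtime" could look like, here is a minimal sketch of `{{VARIABLE}}` substitution. The `render_template()` function and the variable names are hypothetical; in practice Claude fills the templates itself, so this only illustrates the placeholder convention.

```python
import re

# Matches {{NAME}} placeholders made of capitals, digits, and underscores.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Z0-9_]+)\s*\}\}")


def render_template(template, variables):
    """Replace every {{NAME}} placeholder; fail loudly if a value is missing."""
    needed = {match.group(1) for match in PLACEHOLDER.finditer(template)}
    missing = sorted(needed - set(variables))
    if missing:
        raise KeyError(f"unfilled template variables: {missing}")
    return PLACEHOLDER.sub(lambda match: str(variables[match.group(1)]), template)


if __name__ == "__main__":
    # Hypothetical template text and variable names, for illustration only.
    template = "Profile the {{DATASET}} dataset to answer: {{QUESTION}}"
    print(render_template(template, {
        "DATASET": "orders",
        "QUESTION": "Why is conversion dropping?",
    }))
```

Raising on a missing variable is a deliberate choice in this sketch: a prompt that still contains a literal `{{DATASET}}` is easy to miss and hard to debug later.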

Framing

Agent	What It Does	Pipeline Step
question-framing	Turns a business problem into structured analytical questions with hypotheses and data requirements	1
hypothesis	Generates testable hypotheses across cause categories: product changes, technical issues, external factors, mix shift	3

Data Discovery

Agent	What It Does	Pipeline Step
data-explorer	Profiles a dataset: schema, distributions, quality, gaps, supported analyses	4
source-tieout	Verifies data loaded correctly by comparing pandas vs DuckDB on row counts, nulls, and sums. Halts on mismatch.	4.5
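
The tie-out idea is simple enough to sketch: summarize the same table through two independent load paths, then compare the summaries. The example below assumes each path has already produced a flat dictionary of checks (row count, null counts, column sums). The `tie_out()` function and the key names are hypothetical and are not the interface of `tieout_helpers.py`.

```python
import math


def tie_out(summary_a, summary_b, rel_tol=1e-9):
    """Compare two load summaries and return a list of readable mismatches."""
    mismatches = []
    for key in sorted(set(summary_a) | set(summary_b)):
        a, b = summary_a.get(key), summary_b.get(key)
        if a is None or b is None:
            mismatches.append(f"{key}: present in only one path")
        elif not math.isclose(a, b, rel_tol=rel_tol):
            # A small relative tolerance absorbs floating-point noise in sums.
            mismatches.append(f"{key}: {a} != {b}")
    return mismatches


if __name__ == "__main__":
    # Illustrative numbers only, standing in for the pandas and DuckDB paths.
    pandas_path = {"row_count": 1000, "nulls.amount": 12, "sum.amount": 45210.5}
    duckdb_path = {"row_count": 1000, "nulls.amount": 12, "sum.amount": 45210.5}
    problems = tie_out(pandas_path, duckdb_path)
    if problems:
        raise SystemExit("Tie-out failed, halting: " + "; ".join(problems))
    print("Tie-out passed")
```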

Analysis

Agent	What It Does	Pipeline Step
descriptive-analytics	Segmentation, funnel analysis, and drivers analysis to identify what happened and why	5
overtime-trend	Time-series analysis: trends, anomalies, seasonality, annotated timeline charts	5
cohort-analysis	Retention curves, cohort comparison, vintage analysis, cohort LTV	5
root-cause-investigator	Iteratively drills down through dimensions to find the specific, actionable root cause	6
validation	4-layer verification: structural, logical, business rules, and Simpson's Paradox checks	7
opportunity-sizer	Quantifies business impact with sensitivity analysis showing which assumptions matter most	8
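
Of the four validation layers, the Simpson's Paradox check is the least intuitive, so here is a small sketch of what such a check looks for: every segment favours one group, but the pooled totals favour the other. The `simpsons_paradox()` function and its input shape are assumptions for illustration, not the interface of `simpsons_paradox.py`.

```python
def rate(successes, trials):
    """Success rate for one group; trials must be greater than zero."""
    return successes / trials


def simpsons_paradox(segments):
    """Detect a reversal between per-segment and pooled comparisons.

    `segments` maps a segment name to ((successes_a, trials_a),
    (successes_b, trials_b)) for two groups A and B, and must be non-empty.
    """
    pairs = list(segments.values())
    a_wins = [rate(*a) > rate(*b) for a, b in pairs]
    b_wins = [rate(*a) < rate(*b) for a, b in pairs]
    pooled_a = rate(sum(a[0] for a, _ in pairs), sum(a[1] for a, _ in pairs))
    pooled_b = rate(sum(b[0] for _, b in pairs), sum(b[1] for _, b in pairs))
    return (all(a_wins) and pooled_a < pooled_b) or (all(b_wins) and pooled_a > pooled_b)


if __name__ == "__main__":
    # Illustrative counts: A wins inside each segment but loses when pooled,
    # because A's trials are concentrated in the harder segment.
    segments = {
        "small": ((81, 87), (234, 270)),
        "large": ((192, 263), (55, 80)),
    }
    print(simpsons_paradox(segments))
```

When a check like this fires, the aggregate number is usually being driven by a shift in segment mix rather than by a real change in performance.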

Storytelling

Agent	What It Does	Pipeline Step
story-architect	Designs a storyboard with Context-Tension-Resolution arc, maps beats to visual formats and HTML components	9
narrative-coherence-reviewer	Reviews the storyboard for story gaps, beat flow, and progressive depth before any charting	10
chart-maker	Generates SWD-styled charts with collision detection and action titles	12
visual-design-critic	Reviews charts against a 16-point SWD checklist plus 5 gotcha checks and 6 advanced technique checks. Also reviews slide-level deck design.	13/17

Presentation

Agent	What It Does	Pipeline Step
storytelling	Converts findings into a stakeholder-ready narrative with executive summary, findings, insight, and recommendations	15
deck-creator	Builds a branded Marp slide deck with HTML components, speaker notes, and correct theme styling	16
comms-drafter	Generates stakeholder communications: Slack summary, email brief, exec summary	19

Standalone

Agent	What It Does	Pipeline Step
experiment-designer	Designs A/B tests with power estimation, guardrail selection, and decision rules	(on demand)

All 39 Skills

Skills are instruction files in .claude/skills/. Most of them activate automatically when a trigger condition matches, so you don't invoke them manually: when you ask for a chart, the Visualization Patterns skill activates, and when you start an analysis, the Data Quality Check skill runs. The on-demand skills are the exception; they activate when you use their slash command.

Always Active

These skills shape every interaction:

Skill	What It Does
analysis-design-spec	Ensures every analysis starts with a plan: question, decision, data needed, success criteria
close-the-loop	Every recommendation gets a decision owner, success metric, follow-up date, and fallback plan
data-quality-check	Validates data completeness and consistency before analysis begins
data-profiling	Deep-profiles schema, distributions, temporal patterns, and anomalies
feedback-capture	Captures user corrections and methodology guidance to the learnings system
first-run-welcome	Adaptive onboarding for new users based on available data
guardrails	Pairs every success metric with a guardrail metric; checks positive findings for trade-offs
knowledge-bootstrap	Loads active dataset context, schema, quirks, and user profile at session start
metric-spec	Standardized template for defining metrics with no ambiguity
question-framing	Structures vague business questions using the Question Ladder framework
question-router	Classifies questions L1-L5 and routes to the right response path
semantic-validation	4-layer validation stack plus confidence scoring
stakeholder-communication	Adapts findings to the audience: same insight, different framing
tracking-gaps	Identifies when required data doesn't exist and produces instrumentation requests
triangulation	Cross-references findings against multiple sources before presenting
visualization-patterns	Ensures every chart follows SWD design standards
archaeology	Retrieves proven SQL patterns from query archaeology before writing new queries

On-Demand (Slash Commands)

These activate when you use a command:

Skill	Command	What It Does
run-pipeline	`/run-pipeline`	End-to-end analysis with DAG execution, checkpoints, and export
resume-pipeline	`/resume-pipeline`	Resume interrupted work from last completed agent
explore	`/explore`	Quick interactive data exploration
export	`/export`	Export as slides, email, Slack message, or data
connect-data	`/connect-data`	Guided wizard to add a new dataset
switch-dataset	`/switch-dataset`	Change the active dataset
datasets	`/datasets`	List all connected datasets with status
data-inspect	`/data`	Show active schema, optionally drill into a table
metrics	`/metrics`	Browse and manage metric dictionary entries
history	`/history`	View past analyses from the archive
patterns	`/patterns`	View recurring patterns across analyses
forecast	`/forecast`	Generate time-series forecasts
compare-datasets	`/compare-datasets`	Compare metrics across two datasets
setup	`/setup`	Interactive onboarding interview for profile, data, and business context
setup-dev-context	`/setup-dev-context`	Add codebase context for dev teams
runs	`/runs`	List, inspect, compare, and clean up pipeline runs
business	`/business`	Browse organization knowledge (glossary, metrics, products, teams)
log-correction	`/log-correction`	Deliberate correction logging for methodology fixes
architect	`/architect`	Multi-persona planning methodology for new projects
notion-ingest	`/notion-ingest`	Crawl Notion workspace to extract business context

Presentation & Knowledge

Skill	What It Does
presentation-themes	Theme standards for slide decks: layouts, typography, color palettes
archive-analysis	Saves completed analyses to the knowledge system for future recall

All Helper Modules

Python modules in helpers/ that agents call during execution:

Charts and Visualization

Module	What It Does
`chart_helpers.py`	Core SWD charting: `swd_style()`, `highlight_bar()`, `highlight_line()`, `action_title()`, `annotate_point()`, `save_chart()`, `stacked_bar()`, `retention_heatmap()`, `sensitivity_table()`, `funnel_waterfall()`, `big_number_layout()`, `check_label_collisions()`
`chart_palette.py`	WCAG-compliant color palettes with brand override support
`chart_style_guide.md`	Full SWD reference: color palette, declutter checklist, chart decision tree, anti-patterns
`analytics_chart_style.mplstyle`	Matplotlib style file: off-white background, no top/right spines, sans-serif, 150 DPI
`marp_linter.py`	Validates Marp decks: frontmatter, HTML components, slide classes, pacing, title collisions
`marp_export.py`	Exports Marp decks to PDF and HTML via Marp CLI with theme resolution
`theme_loader.py`	YAML-based theme system with brand color loading and inheritance

Data and SQL

Module	What It Does
`data_helpers.py`	Data source abstraction: `detect_active_source()`, `check_connection()`, `read_table()`, `list_tables()`
`sql_helpers.py`	SQL sanity checks: join cardinality, percentage sums, date bounds, duplicates, temporal coverage
`sql_dialect.py`	SQL dialect router for Postgres, BigQuery, Snowflake, DuckDB
`connection_manager.py`	Unified interface for multi-warehouse connections
`tieout_helpers.py`	Source tie-out: dual-path comparison (pandas vs DuckDB) with tolerances
`schema_profiler.py`	Automated schema discovery and documentation

Analytics and Statistics

Module	What It Does
`analytics_helpers.py`	Analytical utilities for segmentation, decomposition, and driver analysis
`stats_helpers.py`	Statistical tests: proportion, mean, Mann-Whitney, chi-squared, bootstrap CI, effect size
`forecast_helpers.py`	Time-series forecasting with trend and seasonality detection
`deep_profiler.py`	Advanced data quality: distributions, correlations, completeness, anomalies
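
As one example of what a statistical helper in this family does, here is a minimal percentile-bootstrap confidence interval using only the standard library. The `bootstrap_ci()` name, signature, and defaults are assumptions for this sketch and may not match `stats_helpers.py`.

```python
import random
import statistics


def bootstrap_ci(values, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat` over `values`."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    estimates = sorted(
        # Resample with replacement, same size as the original sample.
        stat(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lo_index = int((alpha / 2) * n_resamples)
    hi_index = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_index], estimates[hi_index]


if __name__ == "__main__":
    # Illustrative data only.
    daily_orders = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
    low, high = bootstrap_ci(daily_orders)
    print(f"95% CI for the mean: [{low:.2f}, {high:.2f}]")
```

The percentile method is the simplest bootstrap variant; it makes no distributional assumptions, which suits metrics like revenue per user that are rarely normal.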

Validation

Module	What It Does
`structural_validator.py`	Layer 1: schema, primary keys, completeness checks
`logical_validator.py`	Layer 2: aggregation consistency, trend logic
`business_rules.py`	Layer 3: plausibility checks against domain rules
`business_validation.py`	Business rule validation against organization knowledge
`simpsons_paradox.py`	Layer 4: Simpson's Paradox scanner
`confidence_scoring.py`	Synthesizes all 4 layers into an A-F confidence grade

Knowledge & Context

Module	What It Does
`context_loader.py`	Loads active dataset context, schema, quirks at session start
`archaeology_helpers.py`	Query archaeology: retrieve and match proven SQL patterns
`business_context.py`	Organization knowledge: glossary, metrics, products, teams
`entity_resolver.py`	Disambiguates entity references across datasets
`metric_validator.py`	Validates metric definitions against schema
`schema_migration.py`	Handles schema version migrations for knowledge files
`miss_rate_logger.py`	Tracks knowledge system miss rates for improvement

System

Module	What It Does
`error_helpers.py`	Friendly error messages with suggestions
`file_helpers.py`	Atomic file writes, content hashing, safe YAML I/O
`health_check.py`	System health diagnostics for data connectivity and dependencies
`lineage_tracker.py`	Tracks data lineage from source through transformations to findings
`pipeline_state.py`	Pipeline state management for run tracking and resume
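
Atomic writes matter for files like pipeline_state.json, because a crash in the middle of a write should never leave a half-written file behind. The sketch below shows the usual write-to-temp-then-rename approach along with a content hash. The `atomic_write()` and `content_hash()` names are assumptions and may differ from what `file_helpers.py` exposes.

```python
import hashlib
import os
import tempfile


def content_hash(data: bytes) -> str:
    """SHA-256 hex digest, useful for detecting whether content changed."""
    return hashlib.sha256(data).hexdigest()


def atomic_write(path, data: bytes):
    """Write to a temp file in the same directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the bytes reach the disk
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        target = os.path.join(folder, "pipeline_state.json")
        payload = b'{"last_completed": "validation"}'  # illustrative content
        atomic_write(target, payload)
        print(content_hash(payload))
```

Creating the temp file in the target's own directory is what keeps the final rename on a single filesystem, which is the condition for it to be atomic.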

Requirements

Python 3.10+
Node.js 18+ (for Claude Code)
Claude Code with a Claude Pro subscription ($20/month)
Internet connection (for Claude API and optional MotherDuck)

Getting Help

Setup guide: docs/setup-guide.md
Theming: docs/theming.md
Questions or bugs: Open a GitHub Issue

License

MIT -- use it however you want.
