Display Name
Iris
Category
Tooling
Sub-Category
General
Primary Link
https://github.com/iris-eval/mcp-server
Author Name
iris-eval
Author Link
License
MIT
Description
MCP-native agent evaluation server that scores output quality, catches safety failures, and enforces cost budgets. Ships 12 deterministic eval rules across 4 categories (completeness, relevance, safety, cost), including PII detection, prompt injection scanning, and hallucination markers. Zero-config — add it to your MCP config and any compatible agent discovers it automatically. Includes a real-time web dashboard with trace visualization, hierarchical span trees with per-tool-call latency and token usage, and SQLite-backed storage. Works with Claude Desktop, Claude Code, Cursor, and Windsurf. Glama AAA rated. The eval layer that infrastructure monitoring misses — your observability sees 200 OK, but Iris sees that the agent leaked a Social Security number in its response.
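To make the "deterministic eval rules" concrete, here is a minimal sketch of what a regex-based PII rule could look like. This is an illustration only — the rule name, `EvalResult` shape, and `checkPii` function are hypothetical and not Iris's actual API:

```typescript
// Hypothetical sketch of a deterministic PII eval rule (not Iris's real API).
// Flags US Social Security numbers in agent output — the failure mode
// described in the submission.

interface EvalResult {
  rule: string;
  passed: boolean;
  findings: string[];
}

// SSN pattern: three digits, two digits, four digits, dash-separated.
const SSN_PATTERN = /\b\d{3}-\d{2}-\d{4}\b/g;

function checkPii(output: string): EvalResult {
  const findings = output.match(SSN_PATTERN) ?? [];
  return {
    rule: "safety/pii-ssn",
    passed: findings.length === 0,
    findings,
  };
}

// A "200 OK" response can still fail the eval:
const result = checkPii("Your SSN is 123-45-6789.");
console.log(result.passed); // false
```

Because the rule is a pure function of the output text, it is deterministic: the same response always produces the same score, which is what makes this class of checks cheap to run on every trace.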
Validate Claims
Install and test with:
- Add to Claude Code:
  claude mcp add --transport stdio iris-eval -- npx @iris-eval/mcp-server
- Or add to any MCP config (Claude Desktop, Cursor, etc.):
  { "mcpServers": { "iris-eval": { "command": "npx", "args": ["@iris-eval/mcp-server"] } } }
- Launch the dashboard to verify:
  npx @iris-eval/mcp-server --dashboard
  Open http://localhost:6920
Specific Task(s)
Add Iris to your Claude Code MCP config, then run any agent task. Iris automatically logs the trace and evaluates the output. Open the dashboard at localhost:6920 to see trace trees, eval scores, cost breakdowns, and safety flags.
Specific Prompt(s)
After adding Iris as an MCP server, try any normal prompt. Iris works passively — it evaluates agent outputs without requiring special prompts. Then check the dashboard for results.
Recommendation Checklist
- No prior submission for this resource
- Repository is at least one week old
- All links are working/verified
- No other open issues for this resource
- I am a human (not a bot) submitting this