Regression testing for AI agents. Golden baselines, CI/CD, LangGraph, CrewAI, OpenAI, Claude.
io.github.hidai25/evalview-mcp
https://github.com/hidai25/eval-view
STDIO
No auth required
Hosted endpoint — paste into any MCP client.
Configuration this server reads at startup.
OpenAI API key for LLM-as-judge output quality scoring. Optional — deterministic tool/sequence evaluation works without it.
Where to find authoritative docs and source for evalview-mcp.
Open MCP Agent Studio and connect this server to Claude, GPT, Gemini, DeepSeek and more — no install required.
Open Agent Studio