Generate synthetic QA datasets and evaluate RAG system performance with privacy-first evaluation. Works with any LLM provider (OpenAI, Anthropic, local) and runs locally or in the cloud.
io.github.HZYAI/ragscore
Local install
STDIO
No auth required
How models use it and what it is built for.
Local install: runs as a subprocess and communicates over STDIO.
Configuration this server reads at startup.
OPENAI_API_KEY: OpenAI API key (only needed with the OpenAI provider)
ANTHROPIC_API_KEY: Anthropic API key (only needed with the Anthropic provider)
Common questions about connecting and running RAGScore.
What LLM providers does ragscore support?
RAGScore supports OpenAI and Anthropic through API keys, plus local LLM providers. To use a cloud service, set the matching environment variable: OPENAI_API_KEY for OpenAI or ANTHROPIC_API_KEY for Anthropic.
Is my data private when using ragscore?
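Yes, when you run it locally. RAGScore is privacy-first and can run entirely on your own infrastructure. You control whether data goes to a cloud provider through your choice of LLM backend: with a local backend nothing needs to leave your machine, while an OpenAI or Anthropic backend sends requests to that service.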
How do I install and run ragscore?
Run `uvx ragscore@0.6.10`, which fetches and launches the package without a separate install step, and register that command as an MCP server in your MCP client. If you use a cloud provider, set the required API key (OPENAI_API_KEY or ANTHROPIC_API_KEY) as an environment variable first.
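As a rough illustration of the registration step, the sketch below builds a client configuration entry that launches the server over STDIO. It assumes your MCP client reads the `mcpServers` JSON layout that several clients use. The server label `ragscore` and the helper `build_mcp_config` are hypothetical names chosen for this example, so check your client's documentation for the exact file location and schema.

```python
import json


def build_mcp_config(version="0.6.10", env=None):
    """Return a config dict that launches ragscore via uvx (assumed layout)."""
    server = {
        "command": "uvx",                  # uvx runs the package as a subprocess
        "args": [f"ragscore@{version}"],   # pinned version from the listing
    }
    if env:
        # Only include keys for the cloud provider you actually use.
        server["env"] = dict(env)
    # "ragscore" is an arbitrary label for this entry, not a required name.
    return {"mcpServers": {"ragscore": server}}


if __name__ == "__main__":
    config = build_mcp_config(env={"OPENAI_API_KEY": "<your-key>"})
    print(json.dumps(config, indent=2))
```

With a local LLM provider, call `build_mcp_config()` without `env` so no API key is written into the configuration.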
What can I evaluate with ragscore?
RAGScore generates synthetic QA datasets and uses them to evaluate RAG system performance, including retrieval quality and answer accuracy. It is designed for testing and benchmarking retrieval-augmented generation pipelines.
Can I use ragscore without an API key?
Yes, if you configure a local LLM provider. The OPENAI_API_KEY and ANTHROPIC_API_KEY environment variables are only required if you want to use those cloud services.
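The rule above (a key is needed only for the matching cloud provider) can be checked before starting the server. This is a minimal sketch, not RAGScore's own logic: the provider labels `openai`, `anthropic`, and `local` and the helper `missing_key` are assumptions made for illustration. Only the two environment variable names come from the listing.

```python
import os

# Environment variable each cloud provider needs; local providers need none.
CLOUD_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def missing_key(provider, env=None):
    """Return the name of a required but unset variable, or None if all is fine."""
    env = os.environ if env is None else env
    var = CLOUD_KEYS.get(provider)  # None for any non-cloud (local) provider
    if var is None or env.get(var):
        return None
    return var


if __name__ == "__main__":
    problem = missing_key("openai")
    print(f"Set {problem} before starting" if problem else "Ready to start")
```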