MCP Server · STDIO · Official · v2.2.2

mcp-context-server MCP Server

An MCP server that provides persistent multimodal context storage for LLM agents.

io.github.alex-feel/mcp-context-server

Hosted URL

https://github.com/alex-feel/mcp-context-server

Transport

STDIO

Auth

No auth required

Connect to mcp-context-server

Source repository (the server itself is launched locally over STDIO):

https://github.com/alex-feel/mcp-context-server

Environment variables

Configuration this server reads at startup.
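
As a rough illustration of how startup configuration like this is typically consumed, the sketch below reads three of the variables documented in the list that follows, applying the defaults and ranges stated there. The helper names (`read_choice`, `read_int`, `load_settings`) are hypothetical and are not taken from the server's code; how the server actually parses and validates its environment is an assumption here.

```python
import os


def read_choice(env, name, default, allowed):
    """Return env[name] (or the default) after checking it is an allowed value."""
    value = env.get(name, default)
    if value not in allowed:
        raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")
    return value


def read_int(env, name, default, low, high):
    """Return env[name] as an int (or the default), enforcing an inclusive range."""
    value = int(env.get(name, default))
    if not low <= value <= high:
        raise ValueError(f"{name} must be in {low}-{high}, got {value}")
    return value


def load_settings(env=os.environ):
    """Collect a few documented settings into a plain dict (hypothetical helper)."""
    return {
        # Documented values: sqlite (default) or postgresql.
        "storage_backend": read_choice(
            env, "STORAGE_BACKEND", "sqlite", {"sqlite", "postgresql"}
        ),
        # Documented default 3, range 1-20.
        "embedding_max_concurrent": read_int(env, "EMBEDDING_MAX_CONCURRENT", 3, 1, 20),
        # Documented default 300, range 50-1000.
        "search_truncation_length": read_int(env, "SEARCH_TRUNCATION_LENGTH", 300, 50, 1000),
    }
```

Passing a plain dict instead of `os.environ` (for example `load_settings({"STORAGE_BACKEND": "postgresql"})`) makes it easy to check a candidate configuration before starting the server.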

  • LOG_LEVEL

    Log level

  • STORAGE_BACKEND

    Storage backend type: sqlite (default) or postgresql

  • MAX_IMAGE_SIZE_MB

    Maximum individual image size in megabytes

  • MAX_TOTAL_SIZE_MB

    Maximum total request size in megabytes

  • DB_PATH

    Custom database file location path

  • POOL_MAX_READERS

    Maximum number of concurrent read connections in the pool

  • POOL_MAX_WRITERS

    Maximum number of concurrent write connections in the pool

  • POOL_CONNECTION_TIMEOUT_S

    Connection timeout in seconds

  • POOL_IDLE_TIMEOUT_S

    Idle connection timeout in seconds

  • POOL_HEALTH_CHECK_INTERVAL_S

    Connection health check interval in seconds

  • RETRY_MAX_RETRIES

    Maximum number of retry attempts for failed operations

  • RETRY_BASE_DELAY_S

    Base delay in seconds between retry attempts

  • RETRY_MAX_DELAY_S

    Maximum delay in seconds between retry attempts

  • RETRY_JITTER

    Enable random jitter in retry delays

  • RETRY_BACKOFF_FACTOR

    Exponential backoff multiplication factor for retries

  • SQLITE_FOREIGN_KEYS

    Enable SQLite foreign key constraints

  • SQLITE_JOURNAL_MODE

    SQLite journal mode (e.g., WAL, DELETE)

  • SQLITE_SYNCHRONOUS

    SQLite synchronous mode (e.g., NORMAL, FULL, OFF)

  • SQLITE_TEMP_STORE

    SQLite temporary storage location (e.g., MEMORY, FILE)

  • SQLITE_MMAP_SIZE

    SQLite memory-mapped I/O size in bytes

  • SQLITE_CACHE_SIZE

    SQLite cache size (negative value for KB, positive for pages)

  • SQLITE_PAGE_SIZE

    SQLite page size in bytes

  • SQLITE_WAL_AUTOCHECKPOINT

    SQLite WAL autocheckpoint threshold in pages

  • SQLITE_BUSY_TIMEOUT_MS

    SQLite busy timeout in milliseconds

  • SQLITE_WAL_CHECKPOINT

    SQLite WAL checkpoint mode (e.g., PASSIVE, FULL, RESTART)

  • SHUTDOWN_TIMEOUT_S

    Server shutdown timeout in seconds

  • SHUTDOWN_TIMEOUT_TEST_S

    Test mode shutdown timeout in seconds

  • QUEUE_TIMEOUT_S

    Queue operation timeout in seconds

  • QUEUE_TIMEOUT_TEST_S

    Test mode queue timeout in seconds

  • CIRCUIT_BREAKER_FAILURE_THRESHOLD

    Circuit breaker failure threshold before opening

  • CIRCUIT_BREAKER_RECOVERY_TIMEOUT_S

    Circuit breaker recovery timeout in seconds

  • CIRCUIT_BREAKER_HALF_OPEN_MAX_CALLS

    Maximum calls allowed in circuit breaker half-open state

  • POSTGRESQL_CONNECTION_STRING (secret)

    Complete PostgreSQL connection string (overrides individual settings if provided)

  • POSTGRESQL_HOST

    PostgreSQL server host address

  • POSTGRESQL_PORT

    PostgreSQL server port number

  • POSTGRESQL_USER

    PostgreSQL database username

  • POSTGRESQL_PASSWORD (secret)

    PostgreSQL database password

  • POSTGRESQL_DATABASE

    PostgreSQL database name

  • POSTGRESQL_POOL_MIN

    PostgreSQL connection pool minimum size

  • POSTGRESQL_POOL_MAX

    PostgreSQL connection pool maximum size

  • POSTGRESQL_POOL_TIMEOUT_S

    PostgreSQL connection pool timeout in seconds

  • POSTGRESQL_COMMAND_TIMEOUT_S

    PostgreSQL command execution timeout in seconds

  • POSTGRESQL_MIGRATION_TIMEOUT_S

    Timeout in seconds for PostgreSQL migration operations (default: 300)

  • POSTGRESQL_MAX_INACTIVE_LIFETIME_S

    Close idle PostgreSQL connections after this many seconds (0 to disable, default: 300)

  • POSTGRESQL_MAX_QUERIES

    Recycle PostgreSQL connections after this many queries (0 to disable, default: 10000)

  • POSTGRESQL_TCP_KEEPALIVES_IDLE_S

    Seconds of idle time before sending first TCP keepalive probe (0 to disable, default: 15)

  • POSTGRESQL_TCP_KEEPALIVES_INTERVAL_S

    Seconds between subsequent TCP keepalive probes (0 to disable, default: 5)

  • POSTGRESQL_TCP_KEEPALIVES_COUNT

    Number of failed TCP keepalive probes before connection is considered dead (0 to disable, default: 3)

  • POSTGRESQL_STATEMENT_CACHE_SIZE

    asyncpg prepared statement cache size. Set to 0 for external pooler compatibility (PgBouncer transaction mode, Pgpool-II, etc.). Default: 100

  • POSTGRESQL_MAX_CACHED_STATEMENT_LIFETIME_S

    Maximum lifetime of cached prepared statements in seconds (default: 300). Has no effect when POSTGRESQL_STATEMENT_CACHE_SIZE is 0

  • POSTGRESQL_MAX_CACHEABLE_STATEMENT_SIZE

    Maximum size of a statement to cache, in bytes (default: 15360). Has no effect when POSTGRESQL_STATEMENT_CACHE_SIZE is 0

  • POSTGRESQL_SSL_MODE

    PostgreSQL SSL mode (disable, allow, prefer, require, verify-ca, verify-full)

  • POSTGRESQL_SCHEMA

    PostgreSQL schema name for table and index operations (default: public)

  • ENABLE_SEMANTIC_SEARCH

    Enable semantic search functionality

  • ENABLE_EMBEDDING_GENERATION

    Enable embedding generation for stored context (default: true). When enabled, the server fails if the required dependencies are not met. Set to false to disable embeddings.

  • OLLAMA_HOST

    Ollama API host URL for embedding generation

  • OLLAMA_AUTO_PULL

    Automatically pull missing Ollama models on startup (default: true)

  • OLLAMA_PULL_TIMEOUT_S

    Timeout in seconds for pulling Ollama models (default: 900, range: 30-3600)

  • EMBEDDING_OLLAMA_TRUNCATE

    Ollama embedding truncation mode: false (default) returns error when context exceeded, true enables silent truncation

  • EMBEDDING_OLLAMA_NUM_CTX

    Ollama embedding context window size in tokens (default: 4096, range: 512-2097152)

  • EMBEDDING_MODEL

    Embedding model name for semantic search

  • EMBEDDING_DIM

    Embedding vector dimensions

  • EMBEDDING_TIMEOUT_S

    Timeout in seconds for embedding generation API calls

  • EMBEDDING_RETRY_MAX_ATTEMPTS

    Maximum number of retry attempts for embedding generation

  • EMBEDDING_RETRY_BASE_DELAY_S

    Base delay in seconds between retry attempts (with exponential backoff)

  • EMBEDDING_MAX_CONCURRENT

    Maximum concurrent embedding generation operations (default: 3, range: 1-20)

  • ENABLE_SUMMARY_GENERATION

    Enable summary generation for stored context (default: true). When enabled, the server fails if the required dependencies are not met. Set to false to disable summaries.

  • SUMMARY_PROVIDER

    Summary provider: ollama (default), openai, or anthropic

  • SUMMARY_MODEL

    Summary generation model name (default: qwen3:0.6b)

  • SUMMARY_MAX_TOKENS

    Maximum output tokens for summary generation (default: 4000, range: 50-16384). Increase if summaries are truncated by reasoning models

  • SUMMARY_TIMEOUT_S

    Timeout in seconds for summary generation API calls

  • SUMMARY_RETRY_MAX_ATTEMPTS

    Maximum number of retry attempts for summary generation

  • SUMMARY_RETRY_BASE_DELAY_S

    Base delay in seconds between retry attempts (with exponential backoff)

  • SUMMARY_MAX_CONCURRENT

    Maximum concurrent summary generation operations (default: 3, range: 1-20)

  • SUMMARY_PROMPT

    Custom summarization prompt. Overrides the built-in default. Used as system message for the LLM.

  • SUMMARY_MIN_CONTENT_LENGTH

    Minimum text content length in characters to trigger summary generation (default: 500, range: 0-10000). Set to 0 to always generate.

  • SUMMARY_OLLAMA_NUM_CTX

    Ollama summary context window size in tokens (default: 32768, range: 512-2097152)

  • SUMMARY_OLLAMA_TRUNCATE

    Ollama summary truncation mode: false (default) returns error when context exceeded, true enables silent truncation

  • SUMMARY_OPENAI_REASONING_EFFORT

    Reasoning effort level for OpenAI reasoning models (default: low). Valid values vary by generation: gpt-5: low, medium, high; gpt-5.1+: none, low, medium, high, xhigh. Default low is universally valid across all generations

  • SUMMARY_ANTHROPIC_EFFORT

    Effort level for Anthropic Claude models (default: none). Valid values: max, high, medium, low. Controls inference effort (adaptive thinking)

  • ANTHROPIC_API_KEY (secret)

    Anthropic API key for summary generation

  • ENABLE_FTS

    Enable full-text search functionality

  • FTS_LANGUAGE

    Language for FTS stemming (e.g., english, german, french)

  • FTS_RERANK_WINDOW_SIZE

    Characters of context around each FTS match for reranking passage extraction (default: 750)

  • FTS_RERANK_GAP_MERGE

    Merge FTS match regions within this character distance (default: 100)

  • ENABLE_HYBRID_SEARCH

    Enable hybrid search combining FTS and semantic search with RRF fusion

  • HYBRID_RRF_K

    RRF smoothing constant for hybrid search (default: 60)

  • HYBRID_RRF_OVERFETCH

    Multiplier for over-fetching results before RRF fusion (default: 2)

  • HYBRID_FTS_OR_THRESHOLD

    Minimum significant query terms to switch hybrid FTS from AND to OR logic (default: 4)

  • SEARCH_DEFAULT_SORT_BY

    Default sort order for search results: relevance (only 'relevance' supported in current version)

  • SEARCH_TRUNCATION_LENGTH

    Maximum character length for truncated text_content in search results (default: 300, range: 50-1000)

  • ENABLE_CHUNKING

    Enable text chunking for embedding generation (default: true)

  • CHUNK_SIZE

    Target chunk size in characters (default: 1500)

  • CHUNK_OVERLAP

    Overlap between chunks in characters (default: 150)

  • CHUNK_AGGREGATION

    Chunk score aggregation method: max (only 'max' supported in current version)

  • CHUNK_DEDUP_OVERFETCH

    Multiplier for over-fetching chunks before deduplication (default: 5)

  • ENABLE_RERANKING

    Enable cross-encoder reranking of search results (default: true)

  • RERANKING_PROVIDER

    Reranking provider (default: flashrank)

  • RERANKING_MODEL

    Reranking model name (default: ms-marco-MiniLM-L-12-v2)

  • RERANKING_MAX_LENGTH

    Maximum input length for reranking in tokens (default: 512)

  • RERANKING_OVERFETCH

    Multiplier for over-fetching results before reranking (default: 4)

  • RERANKING_CACHE_DIR

    Directory for caching reranking models

  • RERANKING_CHARS_PER_TOKEN

    Estimated characters per token for passage size validation (default: 4.0, range: 2.0-8.0)

  • RERANKING_INTRA_OP_THREADS

    ONNX Runtime intra-operation parallelism threads for reranking (default: 0 = auto-detect)

  • RERANKING_CPU_MEM_ARENA

    Enable ONNX Runtime CPU memory arena for reranking (default: false)

  • RERANKING_BATCH_SIZE

    Maximum passages per ONNX Runtime inference batch during reranking (default: 32)

  • EMBEDDING_PROVIDER

    Embedding provider: ollama (default), openai, azure, huggingface, or voyage

  • OPENAI_API_KEY (secret)

    OpenAI API key for OpenAI embedding provider

  • OPENAI_API_BASE

    Custom base URL for OpenAI-compatible APIs

  • OPENAI_ORGANIZATION

    OpenAI organization ID

  • AZURE_OPENAI_API_KEY (secret)

    Azure OpenAI API key

  • AZURE_OPENAI_ENDPOINT

    Azure OpenAI endpoint URL

  • AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME

    Azure OpenAI embedding deployment name

  • AZURE_OPENAI_API_VERSION

    Azure OpenAI API version (default: 2024-02-01)

  • HUGGINGFACEHUB_API_TOKEN (secret)

    HuggingFace Hub API token for HuggingFace embedding provider

  • VOYAGE_API_KEY (secret)

    Voyage AI API key for Voyage embedding provider

  • VOYAGE_TRUNCATION

    Voyage AI truncation mode: false (default) returns error when context exceeded, true enables silent truncation

  • VOYAGE_BATCH_SIZE

    Voyage AI batch size for embedding requests

  • LANGSMITH_TRACING

    Enable LangSmith tracing

  • LANGSMITH_API_KEY (secret)

    LangSmith API key

  • LANGSMITH_PROJECT

    LangSmith project name

  • LANGSMITH_ENDPOINT

    LangSmith API endpoint URL

  • METADATA_INDEXED_FIELDS

    Comma-separated list of metadata fields to index (field:type format)

  • METADATA_INDEX_SYNC_MODE

    Index sync mode: strict (fail), auto (sync), warn (log), additive (default, add missing only)

  • MCP_TRANSPORT

    Transport mode: stdio for local, http for Docker/remote

  • FASTMCP_HOST

    HTTP bind address (use 0.0.0.0 for Docker)

  • FASTMCP_PORT

    HTTP port number

  • FASTMCP_STATELESS_HTTP

    Enable stateless HTTP mode for horizontal scaling. Enabled by default as the server has no stateful MCP features. Set to false only if you need server-side MCP session tracking.

  • DISABLED_TOOLS

    Comma-separated list of tools to disable (e.g., delete_context,update_context)

  • MCP_AUTH_TOKEN (secret)

    Bearer token for HTTP authentication (required when using SimpleTokenVerifier)

  • MCP_AUTH_CLIENT_ID

    Client ID to assign to authenticated requests

  • MCP_AUTH_PROVIDER

    Authentication provider: none (default), simple_token

  • MCP_SERVER_INSTRUCTIONS

    Custom server instructions text. Overrides built-in default. Set to empty string to disable.
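
HYBRID_RRF_K in the list above is described as the smoothing constant for RRF (reciprocal rank fusion). As a minimal sketch of the standard RRF formula, each document's fused score is the sum of 1 / (k + rank) over every ranked list it appears in. The function name `rrf_fuse` and the document IDs are hypothetical, and the server's actual fusion code may differ in details such as tie-breaking or over-fetching.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with reciprocal rank fusion.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant (60 matches the documented HYBRID_RRF_K default).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for the documents it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from full-text and semantic search.
fts_hits = ["a", "b", "c"]
semantic_hits = ["b", "d", "a"]
fused = rrf_fuse([fts_hits, semantic_hits], k=60)
print(fused)
```

A larger k flattens the difference between adjacent ranks, so documents that appear in both lists tend to outrank documents that rank highly in only one.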

Resources

Where to find authoritative docs and source for mcp-context-server.
