# mcp-server-qdrant: Hybrid Search Fork
Forked from qdrant/mcp-server-qdrant — the official MCP server for Qdrant.
An MCP server for Qdrant vector search engine that acts as a semantic memory layer for LLM applications.
This fork adds two features on top of the upstream:

- Hybrid Search — combines dense (semantic) and sparse (BM25 keyword) vectors using Reciprocal Rank Fusion for significantly better recall
- Project Tagging — automatic `project` metadata on every stored memory, with a payload index for efficient filtering
Everything else remains fully compatible with the upstream.
## What's Different in This Fork

### Hybrid Search (Dense + BM25 Sparse with RRF)
The upstream server uses dense vectors only (semantic similarity). This works well for paraphrased queries but can miss results when the user searches for exact terms, names, or identifiers.
This fork adds BM25 sparse vectors alongside the dense ones. At query time, both vector spaces are searched independently and results are fused using Reciprocal Rank Fusion (RRF) — a proven technique that combines rankings without requiring score calibration.
How it works:

```
Store:  document → [dense embedding] + [BM25 sparse embedding] → Qdrant
Search: query → prefetch(dense, top-k) + prefetch(BM25, top-k) → RRF fusion → final results
```
- Dense vectors capture semantic meaning (synonyms, paraphrases, context)
- BM25 sparse vectors excel at exact keyword matching (names, IDs, error codes)
- RRF fusion gives you the best of both worlds
Enable it with a single environment variable:

```shell
HYBRID_SEARCH=true
```
> [!NOTE]
> Hybrid search uses the `Qdrant/bm25` model from FastEmbed for sparse embeddings. The model is downloaded automatically on first use (~50 MB). The IDF modifier is applied to upweight rare terms in the corpus.
> [!IMPORTANT]
> Enabling hybrid search on an existing collection requires re-creating it, as the sparse vector configuration must be set at collection creation time. Back up your data before switching.
### Project Tagging

The `qdrant-store` tool now accepts a `project` parameter (default: `"global"`). This value is automatically injected into the metadata of every stored record and indexed as a keyword field for efficient filtering.
This is useful when multiple projects share the same Qdrant collection — you can tag memories with the project name and filter by it later.
```
qdrant-store(information="...", project="my-project")
→ metadata: {"project": "my-project", ...}
```

A payload index on `metadata.project` is created automatically when the collection is first set up.
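With that index in place, retrieval can be restricted to one project using the standard Qdrant filter shape. The snippet below shows the filter as plain JSON; it is a sketch, and the exact filter this fork builds internally may differ.

```python
import json

# Restrict search to memories tagged with a single project.
# "metadata.project" is the keyword-indexed payload field this fork maintains.
project_filter = {
    "must": [
        {"key": "metadata.project", "match": {"value": "my-project"}}
    ]
}
print(json.dumps(project_filter, indent=2))
```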
## Tools

### `qdrant-store`
Store information in the Qdrant database.
| Parameter | Type | Required | Description |
|---|---|---|---|
| `information` | string | yes | Text to store |
| `project` | string | no | Project name to tag this memory with. Default: `"global"`. Use the project name (e.g. `"my-app"`) for project-specific knowledge, or `"global"` for cross-project knowledge. |
| `metadata` | JSON | no | Extra metadata stored alongside the information |
| `collection_name` | string | depends | Collection name. Required if no default is configured. |
### `qdrant-find`
Retrieve relevant information from the Qdrant database.
| Parameter | Type | Required | Description |
|---|---|---|---|
| `query` | string | yes | What to search for |
| `collection_name` | string | depends | Collection name. Required if no default is configured. |
## Environment Variables

| Name | Description | Default |
|---|---|---|
| `QDRANT_URL` | URL of the Qdrant server | None |
| `QDRANT_API_KEY` | API key for the Qdrant server | None |
| `QDRANT_LOCAL_PATH` | Path to local Qdrant database (alternative to `QDRANT_URL`) | None |
| `COLLECTION_NAME` | Default collection name | None |
| `EMBEDDING_PROVIDER` | Embedding provider (currently only `fastembed`) | `fastembed` |
| `EMBEDDING_MODEL` | Embedding model name | `sentence-transformers/all-MiniLM-L6-v2` |
| `HYBRID_SEARCH` | Enable hybrid search (dense + BM25 sparse with RRF) | `false` |
| `QDRANT_SEARCH_LIMIT` | Maximum number of results per search | `10` |
| `QDRANT_READ_ONLY` | Disable write operations (store tool) | `false` |
| `QDRANT_ALLOW_ARBITRARY_FILTER` | Allow arbitrary filter objects in find queries | `false` |
| `TOOL_STORE_DESCRIPTION` | Custom description for the store tool | See `settings.py` |
| `TOOL_FIND_DESCRIPTION` | Custom description for the find tool | See `settings.py` |
> [!IMPORTANT]
> You cannot provide both `QDRANT_URL` and `QDRANT_LOCAL_PATH` at the same time.
## FastMCP Environment Variables
Since mcp-server-qdrant is based on FastMCP, it also supports all FastMCP environment variables:
| Name | Description | Default |
|---|---|---|
| `FASTMCP_DEBUG` | Enable debug mode | `false` |
| `FASTMCP_LOG_LEVEL` | Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL) | `INFO` |
| `FASTMCP_HOST` | Host address to bind to | `127.0.0.1` |
| `FASTMCP_PORT` | Port to run the server on | `8000` |
## Installation

### Using uvx

No installation needed with `uvx`:
```shell
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
HYBRID_SEARCH=true \
uvx mcp-server-qdrant
```
### Transport Protocols

```shell
# SSE transport (for remote clients)
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
HYBRID_SEARCH=true \
uvx mcp-server-qdrant --transport sse
```
Supported transports:

- `stdio` (default) — for local MCP clients
- `sse` — Server-Sent Events, for remote clients
- `streamable-http` — streamable HTTP, a newer alternative to SSE
### Using Docker

```shell
docker build -t mcp-server-qdrant .
docker run -p 8000:8000 \
  -e FASTMCP_HOST="0.0.0.0" \
  -e QDRANT_URL="http://your-qdrant-server:6333" \
  -e QDRANT_API_KEY="your-api-key" \
  -e COLLECTION_NAME="your-collection" \
  -e HYBRID_SEARCH=true \
  mcp-server-qdrant
```
### Installing via Smithery

```shell
npx @smithery/cli install mcp-server-qdrant --client claude
```
## Usage with MCP Clients

### Claude Desktop

Add to `claude_desktop_config.json`:
```json
{
  "qdrant": {
    "command": "uvx",
    "args": ["mcp-server-qdrant"],
    "env": {
      "QDRANT_URL": "https://your-qdrant-instance:6333",
      "QDRANT_API_KEY": "your_api_key",
      "COLLECTION_NAME": "your-collection",
      "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2",
      "HYBRID_SEARCH": "true"
    }
  }
}
```
### Claude Code
```shell
claude mcp add qdrant-memory \
  -e QDRANT_URL="http://localhost:6333" \
  -e COLLECTION_NAME="my-memory" \
  -e HYBRID_SEARCH="true" \
  -- uvx mcp-server-qdrant
```
Verify:

```shell
claude mcp list
```
### Cursor / Windsurf
Run the server with SSE transport and custom tool descriptions for code search:
```shell
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="code-snippets" \
HYBRID_SEARCH=true \
TOOL_STORE_DESCRIPTION="Store reusable code snippets for later retrieval. \
The 'information' parameter should contain a natural language description of what the code does, \
while the actual code should be included in the 'metadata' parameter as a 'code' property." \
TOOL_FIND_DESCRIPTION="Search for relevant code snippets based on natural language descriptions." \
uvx mcp-server-qdrant --transport sse
```
Then point Cursor/Windsurf to `http://localhost:8000/sse`.
### VS Code

Add to VS Code settings (Ctrl+Shift+P → Preferences: Open User Settings (JSON)):
```json
{
  "mcp": {
    "inputs": [
      {"type": "promptString", "id": "qdrantUrl", "description": "Qdrant URL"},
      {"type": "promptString", "id": "qdrantApiKey", "description": "Qdrant API Key", "password": true},
      {"type": "promptString", "id": "collectionName", "description": "Collection Name"}
    ],
    "servers": {
      "qdrant": {
        "command": "uvx",
        "args": ["mcp-server-qdrant"],
        "env": {
          "QDRANT_URL": "${input:qdrantUrl}",
          "QDRANT_API_KEY": "${input:qdrantApiKey}",
          "COLLECTION_NAME": "${input:collectionName}",
          "HYBRID_SEARCH": "true"
        }
      }
    }
  }
}
```
## Development

Run in development mode with the MCP inspector:

```shell
COLLECTION_NAME=mcp-dev HYBRID_SEARCH=true \
fastmcp dev src/mcp_server_qdrant/server.py
```

Open `http://localhost:5173` to access the inspector.
## How Hybrid Search Works Under the Hood

When `HYBRID_SEARCH=true`:

Storing:

- The document is embedded with the dense model (e.g. `all-MiniLM-L6-v2`) → semantic vector
- The document is also embedded with `Qdrant/bm25` → sparse vector (term frequencies with IDF)
- Both vectors are stored in the same Qdrant point
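To make the sparse side concrete, here is a toy sketch of the indices/values structure a sparse embedding produces. The real `Qdrant/bm25` FastEmbed model uses proper tokenization and BM25 term weighting; this naive token counting only illustrates the shape of the output.

```python
from collections import Counter
from zlib import crc32

def toy_sparse_embedding(text: str) -> dict[int, float]:
    """Toy sparse vector: map each token to a stable integer index via CRC32
    and weight it by raw term frequency. Illustrates the structure only --
    not the actual Qdrant/bm25 tokenizer or weighting."""
    counts = Counter(text.lower().split())
    return {crc32(tok.encode()) % 2**31: float(tf) for tok, tf in counts.items()}

vec = toy_sparse_embedding("error code E42 error")
print(sorted(vec.values()))  # → [1.0, 1.0, 2.0]: one entry per unique token
```

Most dimensions are implicitly zero, which is why Qdrant stores only the non-zero indices and values.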
Searching:

- The query is embedded with both models
- Two independent prefetch queries run in parallel:
  - Dense vector search (cosine similarity)
  - BM25 sparse vector search (dot product with IDF weighting)
- Results are fused using Reciprocal Rank Fusion: `score = 1/(k + rank_dense) + 1/(k + rank_sparse)`
- Top-k fused results are returned
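The fusion step in isolation can be sketched in a few lines of plain Python. This is illustrative only (Qdrant performs the fusion server-side); `k = 60` is the conventional RRF constant, and the value the server actually uses is not specified here.

```python
def rrf_fuse(dense_ranking, sparse_ranking, k=60, top_k=10):
    """Reciprocal Rank Fusion over two ranked lists of point IDs.
    Each list contributes 1/(k + rank) to a point's fused score."""
    scores = {}
    for ranking in (dense_ranking, sparse_ranking):
        for rank, point_id in enumerate(ranking, start=1):
            scores[point_id] = scores.get(point_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

dense = ["a", "b", "c", "d"]   # ranked by semantic similarity
sparse = ["c", "a", "e"]       # ranked by BM25 score
print(rrf_fuse(dense, sparse))  # → ['a', 'c', 'b', 'e', 'd']
```

Points that appear near the top of both rankings ("a" and "c" here) win, without either score scale having to be calibrated against the other.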
This approach is battle-tested in information retrieval and consistently outperforms either method alone, especially for queries that mix natural language with specific terms.
## Acknowledgments
This is a fork of qdrant/mcp-server-qdrant. All credit for the original implementation goes to the Qdrant team.
## License
Apache License 2.0 — see LICENSE for details.