mcp-server-qdrant/README.md
Mr. Kutin 8bcc45ee14
Update README: document hybrid search and project tagging features
Rewrite README to highlight the two fork-specific features:
- BM25 hybrid search (dense + sparse vectors with RRF)
- Automatic project tagging with metadata.project index

Also update the environment variables table with all current options.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 19:58:39 +03:00


mcp-server-qdrant: Hybrid Search Fork

Forked from qdrant/mcp-server-qdrant — the official MCP server for Qdrant.

An MCP server for the Qdrant vector search engine that acts as a semantic memory layer for LLM applications.

This fork adds two features on top of the upstream:

  1. Hybrid Search: combines dense (semantic) and sparse (BM25 keyword) vectors using Reciprocal Rank Fusion, which improves recall for queries that contain exact terms, names, or identifiers
  2. Project Tagging: automatic project metadata on every stored memory, with a payload index for efficient filtering

Everything else remains fully compatible with the upstream.


What's Different in This Fork

Hybrid Search (Dense + BM25 Sparse with RRF)

The upstream server uses dense vectors only (semantic similarity). This works well for paraphrased queries but can miss results when the user searches for exact terms, names, or identifiers.

This fork adds BM25 sparse vectors alongside the dense ones. At query time, both vector spaces are searched independently and the results are fused using Reciprocal Rank Fusion (RRF), which combines the two rankings by position alone and therefore needs no calibration between dense and sparse scores.

How it works:

Store: document → [dense embedding] + [BM25 sparse embedding] → Qdrant
Search: query → prefetch(dense, top-k) + prefetch(BM25, top-k) → RRF fusion → final results
  • Dense vectors capture semantic meaning (synonyms, paraphrases, context)
  • BM25 sparse vectors excel at exact keyword matching (names, IDs, error codes)
  • RRF merges the two ranked lists, so a document that ranks well in either one can surface in the final results

Enable it with a single environment variable:

HYBRID_SEARCH=true

Note

Hybrid search uses the Qdrant/bm25 model from FastEmbed for sparse embeddings. The model is downloaded automatically on first use (~50 MB). The IDF modifier is applied to upweight rare terms in the corpus.
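
To illustrate what "upweighting rare terms" means, here is a minimal sketch of IDF-weighted sparse scoring. It uses the common BM25 form of IDF and raw term counts; the function names and the tiny corpus are made up for illustration, and the exact computation inside Qdrant and FastEmbed (term-frequency saturation, length normalization, tokenization) may differ.

```python
import math
from collections import Counter


def idf(term: str, docs: list[list[str]]) -> float:
    """BM25-style inverse document frequency: rarer terms get larger weights."""
    n_docs = len(docs)
    n_containing = sum(1 for doc in docs if term in doc)
    return math.log(1 + (n_docs - n_containing + 0.5) / (n_containing + 0.5))


def sparse_score(query: list[str], doc: list[str], docs: list[list[str]]) -> float:
    """Dot product over shared terms, each scaled by its IDF.

    Simplification: raw term counts stand in for BM25's saturated
    term-frequency weights.
    """
    term_counts = Counter(doc)  # missing terms count as 0
    return sum(idf(term, docs) * term_counts[term] for term in set(query))


# Hypothetical three-document corpus, already tokenized.
corpus = [
    "connection timeout error E1042 in worker".split(),
    "generic connection error in client".split(),
    "connection pool error after restart".split(),
]

print(idf("error", corpus), idf("E1042", corpus))
print([sparse_score(["E1042", "error"], doc, corpus) for doc in corpus])
```

In this corpus "error" appears in every document and receives a small weight, while "E1042" appears once and dominates the score, which is why exact identifiers are matched well by the sparse side.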

Important

Enabling hybrid search on an existing collection requires re-creating it, as the sparse vector configuration must be set at collection creation time. Back up your data before switching.

Project Tagging

The qdrant-store tool now accepts a project parameter (default: "global"). This value is automatically injected into the metadata of every stored record and indexed as a keyword field for efficient filtering.

This is useful when multiple projects share the same Qdrant collection — you can tag memories with the project name and filter by it later.

qdrant-store(information="...", project="my-project")
→ metadata: {"project": "my-project", ...}

A payload index on metadata.project is created automatically when the collection is first set up.
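
The tagging behaviour described above can be sketched as a small merge step. This is an illustrative helper, not the fork's actual code: the name tag_with_project and the rule that the explicit project argument overrides any existing "project" key are assumptions.

```python
from typing import Any, Optional

DEFAULT_PROJECT = "global"  # default value named in this README


def tag_with_project(
    metadata: Optional[dict[str, Any]] = None,
    project: str = DEFAULT_PROJECT,
) -> dict[str, Any]:
    """Return a copy of `metadata` with the project tag added (hypothetical helper)."""
    tagged = dict(metadata or {})
    # Assumption: the explicit `project` argument wins over any
    # "project" key the caller already placed in the metadata.
    tagged["project"] = project
    return tagged


print(tag_with_project({"source": "chat"}, project="my-project"))
print(tag_with_project())
```

Copying the dictionary keeps the caller's metadata untouched, so the same object can be reused across several store calls with different project names.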


Tools

qdrant-store

Store information in the Qdrant database.

| Parameter | Type | Required | Description |
|---|---|---|---|
| information | string | yes | Text to store |
| project | string | no | Project name to tag this memory with. Default: "global". Use the project name (e.g. "my-app") for project-specific knowledge, or "global" for cross-project knowledge. |
| metadata | JSON | no | Extra metadata stored alongside the information |
| collection_name | string | depends | Collection name. Required if no default is configured. |

qdrant-find

Retrieve relevant information from the Qdrant database.

| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | yes | What to search for |
| collection_name | string | depends | Collection name. Required if no default is configured. |
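
MCP clients normally issue these calls for you, but it can help to see what a tool invocation looks like on the wire. The sketch below builds JSON-RPC 2.0 `tools/call` messages for both tools using only the standard library. The helper name tool_call_request and the sample argument values are made up, and collection_name is omitted on the assumption that a default collection is configured.

```python
import json
from typing import Any


def tool_call_request(request_id: int, tool: str, arguments: dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Sample values are illustrative only.
store = tool_call_request(
    1,
    "qdrant-store",
    {"information": "Deploys run from the release branch.", "project": "my-app"},
)
find = tool_call_request(2, "qdrant-find", {"query": "How do we deploy?"})

print(store)
print(find)
```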

Environment Variables

| Name | Description | Default |
|---|---|---|
| QDRANT_URL | URL of the Qdrant server | None |
| QDRANT_API_KEY | API key for the Qdrant server | None |
| QDRANT_LOCAL_PATH | Path to local Qdrant database (alternative to QDRANT_URL) | None |
| COLLECTION_NAME | Default collection name | None |
| EMBEDDING_PROVIDER | Embedding provider (currently only fastembed) | fastembed |
| EMBEDDING_MODEL | Embedding model name | sentence-transformers/all-MiniLM-L6-v2 |
| HYBRID_SEARCH | Enable hybrid search (dense + BM25 sparse with RRF) | false |
| QDRANT_SEARCH_LIMIT | Maximum number of results per search | 10 |
| QDRANT_READ_ONLY | Disable write operations (store tool) | false |
| QDRANT_ALLOW_ARBITRARY_FILTER | Allow arbitrary filter objects in find queries | false |
| TOOL_STORE_DESCRIPTION | Custom description for the store tool | See settings.py |
| TOOL_FIND_DESCRIPTION | Custom description for the find tool | See settings.py |

Important

You cannot provide both QDRANT_URL and QDRANT_LOCAL_PATH at the same time.
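
If you launch the server from your own wrapper script, a pre-flight check along the following lines can catch this conflict early. It is a sketch only: the function name qdrant_location is hypothetical, and the server's own validation in settings.py may behave differently.

```python
import os
from typing import Mapping, Optional


def qdrant_location(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the configured Qdrant location, rejecting conflicting settings."""
    url = env.get("QDRANT_URL")
    local_path = env.get("QDRANT_LOCAL_PATH")
    if url and local_path:
        raise ValueError("Set either QDRANT_URL or QDRANT_LOCAL_PATH, not both.")
    # Either value, or None when neither variable is set.
    return url or local_path


print(qdrant_location({"QDRANT_URL": "http://localhost:6333"}))
```

Passing the environment as a mapping keeps the check easy to exercise without modifying the real process environment.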

FastMCP Environment Variables

Since mcp-server-qdrant is based on FastMCP, it also supports all FastMCP environment variables:

| Name | Description | Default |
|---|---|---|
| FASTMCP_DEBUG | Enable debug mode | false |
| FASTMCP_LOG_LEVEL | Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL) | INFO |
| FASTMCP_HOST | Host address to bind to | 127.0.0.1 |
| FASTMCP_PORT | Port to run the server on | 8000 |

Installation

Using uvx

No installation needed with uvx:

QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
HYBRID_SEARCH=true \
uvx mcp-server-qdrant

Transport Protocols

# SSE transport (for remote clients)
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
HYBRID_SEARCH=true \
uvx mcp-server-qdrant --transport sse

Supported transports:

  • stdio (default) — for local MCP clients
  • sse — Server-Sent Events, for remote clients
  • streamable-http — streamable HTTP, newer alternative to SSE

Using Docker

docker build -t mcp-server-qdrant .

docker run -p 8000:8000 \
  -e FASTMCP_HOST="0.0.0.0" \
  -e QDRANT_URL="http://your-qdrant-server:6333" \
  -e QDRANT_API_KEY="your-api-key" \
  -e COLLECTION_NAME="your-collection" \
  -e HYBRID_SEARCH=true \
  mcp-server-qdrant

Installing via Smithery

npx @smithery/cli install mcp-server-qdrant --client claude

Usage with MCP Clients

Claude Desktop

Add the server under the "mcpServers" key in claude_desktop_config.json:

{
  "mcpServers": {
    "qdrant": {
      "command": "uvx",
      "args": ["mcp-server-qdrant"],
      "env": {
        "QDRANT_URL": "https://your-qdrant-instance:6333",
        "QDRANT_API_KEY": "your_api_key",
        "COLLECTION_NAME": "your-collection",
        "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2",
        "HYBRID_SEARCH": "true"
      }
    }
  }
}

Claude Code

claude mcp add qdrant-memory \
  -e QDRANT_URL="http://localhost:6333" \
  -e COLLECTION_NAME="my-memory" \
  -e HYBRID_SEARCH="true" \
  -- uvx mcp-server-qdrant

Verify:

claude mcp list

Cursor / Windsurf

Run the server with SSE transport and custom tool descriptions for code search:

QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="code-snippets" \
HYBRID_SEARCH=true \
TOOL_STORE_DESCRIPTION="Store reusable code snippets for later retrieval. \
The 'information' parameter should contain a natural language description of what the code does, \
while the actual code should be included in the 'metadata' parameter as a 'code' property." \
TOOL_FIND_DESCRIPTION="Search for relevant code snippets based on natural language descriptions." \
uvx mcp-server-qdrant --transport sse

Then point Cursor/Windsurf to http://localhost:8000/sse.

VS Code

Add the server manually to VS Code settings (press Ctrl+Shift+P, then run Preferences: Open User Settings (JSON)):

{
  "mcp": {
    "inputs": [
      {"type": "promptString", "id": "qdrantUrl", "description": "Qdrant URL"},
      {"type": "promptString", "id": "qdrantApiKey", "description": "Qdrant API Key", "password": true},
      {"type": "promptString", "id": "collectionName", "description": "Collection Name"}
    ],
    "servers": {
      "qdrant": {
        "command": "uvx",
        "args": ["mcp-server-qdrant"],
        "env": {
          "QDRANT_URL": "${input:qdrantUrl}",
          "QDRANT_API_KEY": "${input:qdrantApiKey}",
          "COLLECTION_NAME": "${input:collectionName}",
          "HYBRID_SEARCH": "true"
        }
      }
    }
  }
}

Development

Run in development mode with the MCP inspector:

COLLECTION_NAME=mcp-dev HYBRID_SEARCH=true \
fastmcp dev src/mcp_server_qdrant/server.py

Open http://localhost:5173 to access the inspector.

How Hybrid Search Works Under the Hood

When HYBRID_SEARCH=true:

Storing:

  1. The document is embedded with the dense model (e.g. all-MiniLM-L6-v2) → semantic vector
  2. The document is also embedded with Qdrant/bm25 → sparse vector (term frequencies with IDF)
  3. Both vectors are stored in the same Qdrant point

Searching:

  1. The query is embedded with both models
  2. Two independent prefetch queries run in parallel:
    • Dense vector search (cosine similarity)
    • BM25 sparse vector search (dot product with IDF weighting)
  3. Results are fused using Reciprocal Rank Fusion: score = 1/(k + rank_dense) + 1/(k + rank_sparse)
  4. Top-k fused results are returned

Combining dense and sparse retrieval with rank fusion is a well-established technique in information retrieval. It often outperforms either method alone, especially for queries that mix natural language with specific terms.
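
The fusion step can be sketched in a few lines of plain Python. This illustrates the formula above and is not the code Qdrant runs: the function name rrf_fuse, the 1-based ranks, and the constant k = 60 (the value used in the original RRF paper) are assumptions, and the constant Qdrant applies may differ.

```python
from collections import defaultdict


def rrf_fuse(
    rankings: list[list[str]], k: int = 60, limit: int = 10
) -> list[tuple[str, float]]:
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document absent from a list contributes nothing for that list.
            scores[doc_id] += 1.0 / (k + rank)
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return fused[:limit]


# Hypothetical result lists, best match first.
dense_hits = ["a", "b", "c"]
sparse_hits = ["c", "a", "d"]

for doc_id, score in rrf_fuse([dense_hits, sparse_hits]):
    print(doc_id, round(score, 5))
```

Documents "a" and "c" appear in both lists and therefore outrank "b" and "d", which appear in only one. Because only ranks enter the sum, the cosine and BM25 scores never need to be put on a common scale.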

Acknowledgments

This is a fork of qdrant/mcp-server-qdrant. All credit for the original implementation goes to the Qdrant team.

License

Apache License 2.0 — see LICENSE for details.