mrkutin/mcp-server-qdrant


Mr. Kutin 8bcc45ee14


Update README: document hybrid search and project tagging features

Rewrite README to highlight the two fork-specific features:
- BM25 hybrid search (dense + sparse vectors with RRF)
- Automatic project tagging with metadata.project index

Also update the environment variables table with all current options.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-03 19:58:39 +03:00


mcp-server-qdrant: Hybrid Search Fork

Forked from qdrant/mcp-server-qdrant — the official MCP server for Qdrant.

An MCP server for the Qdrant vector search engine that acts as a semantic memory layer for LLM applications.

This fork adds two features on top of the upstream:

- Hybrid Search: combines dense (semantic) and sparse (BM25 keyword) vectors using Reciprocal Rank Fusion, which improves recall for queries that contain exact terms, names, or identifiers
- Project Tagging: automatic project metadata on every stored memory, with a payload index for efficient filtering

Everything else remains fully compatible with the upstream.

What's Different in This Fork

Hybrid Search (Dense + BM25 Sparse with RRF)

The upstream server uses dense vectors only (semantic similarity). This works well for paraphrased queries but can miss results when the user searches for exact terms, names, or identifiers.

This fork adds BM25 sparse vectors alongside the dense ones. At query time, both vector spaces are searched independently and the results are fused using Reciprocal Rank Fusion (RRF), a rank-based technique that combines the two result lists without requiring their scores to be calibrated against each other.

How it works:

Store: document → [dense embedding] + [BM25 sparse embedding] → Qdrant
Search: query → prefetch(dense, top-k) + prefetch(BM25, top-k) → RRF fusion → final results

- Dense vectors capture semantic meaning (synonyms, paraphrases, context)
- BM25 sparse vectors excel at exact keyword matching (names, IDs, error codes)
- RRF merges the two rankings, so a document ranked highly by either method can surface in the final results

Enable it with a single environment variable:

HYBRID_SEARCH=true

Note

Hybrid search uses the Qdrant/bm25 model from FastEmbed for sparse embeddings. The model is downloaded automatically on first use (~50 MB). The IDF modifier is applied to upweight rare terms in the corpus.

Important

Enabling hybrid search on an existing collection requires re-creating it, as the sparse vector configuration must be set at collection creation time. Back up your data before switching.
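The reason for re-creating the collection can be illustrated with the shape of a collection definition. The following is a minimal sketch of a Qdrant REST request body (for `PUT /collections/{name}`) that declares one dense and one sparse vector. The vector names `dense` and `bm25` are hypothetical placeholders, not necessarily the names this fork uses; the size 384 matches the default `all-MiniLM-L6-v2` model.

```python
import json


def build_collection_config(dense_size=384, dense_name="dense", sparse_name="bm25"):
    """Return a request body declaring a dense and a sparse named vector.

    The vector names are illustrative assumptions, not taken from the fork.
    """
    return {
        "vectors": {dense_name: {"size": dense_size, "distance": "Cosine"}},
        # The "idf" modifier asks Qdrant to apply IDF weighting to this sparse vector.
        "sparse_vectors": {sparse_name: {"modifier": "idf"}},
    }


def has_sparse_vector(params, sparse_name="bm25"):
    """Check whether collection params already declare the given sparse vector."""
    return sparse_name in (params.get("sparse_vectors") or {})


config = build_collection_config()
print(json.dumps(config, indent=2))

# A dense-only collection has no sparse vector declared.
dense_only = {"vectors": {"dense": {"size": 384, "distance": "Cosine"}}}
print(has_sparse_vector(dense_only))
```

A check like `has_sparse_vector` could be run against an existing collection's parameters before switching `HYBRID_SEARCH` on, to decide whether a migration is needed.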

Project Tagging

The qdrant-store tool now accepts a project parameter (default: "global"). This value is automatically injected into the metadata of every stored record and indexed as a keyword field for efficient filtering.

This is useful when multiple projects share the same Qdrant collection — you can tag memories with the project name and filter by it later.

qdrant-store(information="...", project="my-project")
→ metadata: {"project": "my-project", ...}

A payload index on metadata.project is created automatically when the collection is first set up.
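As a rough illustration of the tagging behaviour described above, the sketch below merges the project name into the metadata and builds a matching Qdrant filter and payload-index definition. The helper names are hypothetical, and letting the `project` parameter override a `project` key already present in the metadata is an assumption; the fork's actual precedence rule is not stated here.

```python
def tag_with_project(metadata=None, project="global"):
    """Return a copy of metadata with the project tag injected.

    Assumption: the explicit project parameter wins over any existing key.
    """
    tagged = dict(metadata or {})
    tagged["project"] = project
    return tagged


def project_filter(project):
    """Build a Qdrant filter that matches records tagged with the project."""
    return {"must": [{"key": "metadata.project", "match": {"value": project}}]}


# Body for creating a keyword payload index on the tag field.
PROJECT_INDEX = {"field_name": "metadata.project", "field_schema": "keyword"}

print(tag_with_project({"source": "notes"}, project="my-project"))
print(project_filter("my-project"))
```

The keyword index lets Qdrant resolve the `metadata.project` condition without scanning every payload in a shared collection.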

Tools

`qdrant-store`

Store information in the Qdrant database.

Parameter	Type	Required	Description
`information`	string	yes	Text to store
`project`	string	no	Project name to tag this memory with. Default: `"global"`. Use the project name (e.g. `"my-app"`) for project-specific knowledge, or `"global"` for cross-project knowledge.
`metadata`	JSON	no	Extra metadata stored alongside the information
`collection_name`	string	depends	Collection name. Required if no default is configured.

`qdrant-find`

Retrieve relevant information from the Qdrant database.

Parameter	Type	Required	Description
`query`	string	yes	What to search for
`collection_name`	string	depends	Collection name. Required if no default is configured.

Environment Variables

Name	Description	Default
`QDRANT_URL`	URL of the Qdrant server	None
`QDRANT_API_KEY`	API key for the Qdrant server	None
`QDRANT_LOCAL_PATH`	Path to local Qdrant database (alternative to `QDRANT_URL`)	None
`COLLECTION_NAME`	Default collection name	None
`EMBEDDING_PROVIDER`	Embedding provider (currently only `fastembed`)	`fastembed`
`EMBEDDING_MODEL`	Embedding model name	`sentence-transformers/all-MiniLM-L6-v2`
`HYBRID_SEARCH`	Enable hybrid search (dense + BM25 sparse with RRF)	`false`
`QDRANT_SEARCH_LIMIT`	Maximum number of results per search	`10`
`QDRANT_READ_ONLY`	Disable write operations (store tool)	`false`
`QDRANT_ALLOW_ARBITRARY_FILTER`	Allow arbitrary filter objects in find queries	`false`
`TOOL_STORE_DESCRIPTION`	Custom description for the store tool	See `settings.py`
`TOOL_FIND_DESCRIPTION`	Custom description for the find tool	See `settings.py`

Important

You cannot provide both QDRANT_URL and QDRANT_LOCAL_PATH at the same time.
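The sketch below shows one way the variables above and the mutual-exclusion rule could be read and checked. It is illustrative only: the server has its own settings module, and both the `load_settings` helper and the accepted boolean spellings are assumptions.

```python
import os


def _as_bool(value, default=False):
    """Interpret common truthy spellings; the accepted set is an assumption."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None):
    """Read a subset of the documented variables from a mapping."""
    env = os.environ if env is None else env
    url = env.get("QDRANT_URL")
    local_path = env.get("QDRANT_LOCAL_PATH")
    if url and local_path:
        raise ValueError("Set either QDRANT_URL or QDRANT_LOCAL_PATH, not both")
    return {
        "url": url,
        "local_path": local_path,
        "collection_name": env.get("COLLECTION_NAME"),
        "hybrid_search": _as_bool(env.get("HYBRID_SEARCH")),
        "search_limit": int(env.get("QDRANT_SEARCH_LIMIT", "10")),
        "read_only": _as_bool(env.get("QDRANT_READ_ONLY")),
    }


settings = load_settings({"QDRANT_URL": "http://localhost:6333", "HYBRID_SEARCH": "true"})
print(settings)
```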

FastMCP Environment Variables

Since mcp-server-qdrant is based on FastMCP, it also supports all FastMCP environment variables:

Name	Description	Default
`FASTMCP_DEBUG`	Enable debug mode	`false`
`FASTMCP_LOG_LEVEL`	Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)	`INFO`
`FASTMCP_HOST`	Host address to bind to	`127.0.0.1`
`FASTMCP_PORT`	Port to run the server on	`8000`

Installation

Using uvx

No installation needed with uvx:

QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
HYBRID_SEARCH=true \
uvx mcp-server-qdrant

Transport Protocols

# SSE transport (for remote clients)
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
HYBRID_SEARCH=true \
uvx mcp-server-qdrant --transport sse

Supported transports:

- stdio (default): for local MCP clients
- sse: Server-Sent Events, for remote clients
- streamable-http: streamable HTTP, a newer alternative to SSE
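For illustration, a command-line flag restricted to these three transports might be parsed as in the sketch below. The `parse_args` helper is hypothetical and not copied from the server's entry point.

```python
import argparse

SUPPORTED_TRANSPORTS = ["stdio", "sse", "streamable-http"]


def parse_args(argv=None):
    """Parse a --transport flag limited to the supported transports."""
    parser = argparse.ArgumentParser(prog="mcp-server-qdrant")
    parser.add_argument(
        "--transport",
        choices=SUPPORTED_TRANSPORTS,
        default="stdio",  # stdio is the documented default
    )
    return parser.parse_args(argv)


print(parse_args([]).transport)
print(parse_args(["--transport", "sse"]).transport)
```

With `choices` set, argparse rejects any other transport name with a usage error instead of starting the server in an undefined mode.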

Using Docker

docker build -t mcp-server-qdrant .

docker run -p 8000:8000 \
  -e FASTMCP_HOST="0.0.0.0" \
  -e QDRANT_URL="http://your-qdrant-server:6333" \
  -e QDRANT_API_KEY="your-api-key" \
  -e COLLECTION_NAME="your-collection" \
  -e HYBRID_SEARCH=true \
  mcp-server-qdrant

Installing via Smithery

npx @smithery/cli install mcp-server-qdrant --client claude

Usage with MCP Clients

Claude Desktop

Add the following to the "mcpServers" section of claude_desktop_config.json:

{
  "qdrant": {
    "command": "uvx",
    "args": ["mcp-server-qdrant"],
    "env": {
      "QDRANT_URL": "https://your-qdrant-instance:6333",
      "QDRANT_API_KEY": "your_api_key",
      "COLLECTION_NAME": "your-collection",
      "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2",
      "HYBRID_SEARCH": "true"
    }
  }
}
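If the configuration file already lists other servers, the entry can be merged in programmatically rather than by hand. The sketch below assumes the server entries live under the top-level `mcpServers` key of the configuration; the `add_qdrant_server` helper is hypothetical, and it works on an in-memory dictionary, so reading and writing the file is left to the caller.

```python
import json


def add_qdrant_server(config, env, name="qdrant"):
    """Return a copy of the config with the Qdrant server entry added."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers[name] = {
        "command": "uvx",
        "args": ["mcp-server-qdrant"],
        "env": dict(env),
    }
    updated["mcpServers"] = servers
    return updated


existing = {"mcpServers": {"other": {"command": "other-server"}}}
merged = add_qdrant_server(
    existing,
    {
        "QDRANT_URL": "https://your-qdrant-instance:6333",
        "COLLECTION_NAME": "your-collection",
        "HYBRID_SEARCH": "true",
    },
)
print(json.dumps(merged, indent=2))
```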

Claude Code

claude mcp add qdrant-memory \
  -e QDRANT_URL="http://localhost:6333" \
  -e COLLECTION_NAME="my-memory" \
  -e HYBRID_SEARCH="true" \
  -- uvx mcp-server-qdrant

Verify:

claude mcp list

Cursor / Windsurf

Run the server with SSE transport and custom tool descriptions for code search:

QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="code-snippets" \
HYBRID_SEARCH=true \
TOOL_STORE_DESCRIPTION="Store reusable code snippets for later retrieval. \
The 'information' parameter should contain a natural language description of what the code does, \
while the actual code should be included in the 'metadata' parameter as a 'code' property." \
TOOL_FIND_DESCRIPTION="Search for relevant code snippets based on natural language descriptions." \
uvx mcp-server-qdrant --transport sse

Then point Cursor/Windsurf to http://localhost:8000/sse.

VS Code

Add the server manually to VS Code settings (Ctrl+Shift+P → Preferences: Open User Settings (JSON)):

{
  "mcp": {
    "inputs": [
      {"type": "promptString", "id": "qdrantUrl", "description": "Qdrant URL"},
      {"type": "promptString", "id": "qdrantApiKey", "description": "Qdrant API Key", "password": true},
      {"type": "promptString", "id": "collectionName", "description": "Collection Name"}
    ],
    "servers": {
      "qdrant": {
        "command": "uvx",
        "args": ["mcp-server-qdrant"],
        "env": {
          "QDRANT_URL": "${input:qdrantUrl}",
          "QDRANT_API_KEY": "${input:qdrantApiKey}",
          "COLLECTION_NAME": "${input:collectionName}",
          "HYBRID_SEARCH": "true"
        }
      }
    }
  }
}

Development

Run in development mode with the MCP inspector:

COLLECTION_NAME=mcp-dev HYBRID_SEARCH=true \
fastmcp dev src/mcp_server_qdrant/server.py

Open http://localhost:5173 to access the inspector.

How Hybrid Search Works Under the Hood

When HYBRID_SEARCH=true:

Storing:

1. The document is embedded with the dense model (e.g. all-MiniLM-L6-v2) → semantic vector
2. The document is also embedded with Qdrant/bm25 → sparse vector (term frequencies with IDF)
3. Both vectors are stored in the same Qdrant point
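To make the storing steps more concrete, here is a toy sketch of a point that carries both vectors. It is not FastEmbed's implementation: tokenization is a plain regex, token indices come from CRC32, and the BM25 constants are common defaults chosen as assumptions. The vector names (`dense`, `bm25`) and the `document` payload key are also hypothetical. Only the term-frequency part is computed here, since the IDF weighting is left to Qdrant's IDF modifier.

```python
import re
import zlib
from collections import Counter

K1, B, AVG_LEN = 1.2, 0.75, 256.0  # assumed BM25 constants, not read from the fork


def toy_bm25_sparse(text):
    """Turn text into a sparse vector of BM25 term-frequency weights."""
    tokens = re.findall(r"[a-z0-9_]+", text.lower())
    counts = Counter(tokens)
    doc_len = len(tokens)
    weights = {}
    for token, tf in counts.items():
        index = zlib.crc32(token.encode("utf-8"))  # stand-in for a real token id
        value = tf * (K1 + 1) / (tf + K1 * (1 - B + B * doc_len / AVG_LEN))
        weights[index] = weights.get(index, 0.0) + value  # merge hash collisions
    indices = sorted(weights)
    return {"indices": indices, "values": [weights[i] for i in indices]}


def build_point(point_id, text, dense_vector, metadata=None):
    """Assemble one point holding a dense and a sparse named vector."""
    return {
        "id": point_id,
        "vector": {"dense": list(dense_vector), "bm25": toy_bm25_sparse(text)},
        "payload": {"document": text, "metadata": dict(metadata or {})},
    }


point = build_point(
    "5c56c793-69f3-4fbf-87e6-c4bf54c28c26",
    "error E42 error",
    [0.1, 0.2, 0.3],  # placeholder; a real dense embedding has 384 dimensions
    {"project": "global"},
)
print(point)
```

Because both vectors live in the same point, a single upsert keeps the dense and sparse representations of a memory in sync.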

Searching:

1. The query is embedded with both models
2. Two independent prefetch queries run in parallel:
   - Dense vector search (cosine similarity)
   - BM25 sparse vector search (dot product with IDF weighting)
3. Results are fused using Reciprocal Rank Fusion: score = 1/(k + rank_dense) + 1/(k + rank_sparse), where k is a constant that damps the influence of the top ranks and a document missing from one list contributes no term for that list
4. The top-k fused results are returned
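The search steps above map onto Qdrant's Query API roughly as in the following sketch, which builds the JSON body for `POST /collections/{name}/points/query`. The vector names `dense` and `bm25` and the `build_hybrid_query` helper are hypothetical; the fork may name its vectors differently and builds its requests through the Qdrant client rather than raw JSON.

```python
import json


def build_hybrid_query(dense_vector, sparse_indices, sparse_values, limit=10,
                       dense_name="dense", sparse_name="bm25"):
    """Build a query body with two prefetch branches fused by RRF."""
    return {
        "prefetch": [
            # Branch 1: nearest neighbours in the dense vector space.
            {"query": list(dense_vector), "using": dense_name, "limit": limit},
            # Branch 2: keyword matches in the sparse vector space.
            {
                "query": {"indices": list(sparse_indices), "values": list(sparse_values)},
                "using": sparse_name,
                "limit": limit,
            },
        ],
        # The outer query fuses the two prefetched result lists.
        "query": {"fusion": "rrf"},
        "limit": limit,
        "with_payload": True,
    }


body = build_hybrid_query([0.1, 0.2, 0.3], [17, 42], [1.0, 0.5], limit=5)
print(json.dumps(body, indent=2))
```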

Hybrid retrieval of this kind is well established in information retrieval and often outperforms either method alone, especially for queries that mix natural language with specific terms.
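To make the fusion formula concrete, the sketch below implements Reciprocal Rank Fusion in plain Python. In the server the fusion is performed by Qdrant itself, so this is only an illustration. The value k=60 is a commonly used default and an assumption here; Qdrant's internal constant and rank numbering may differ.

```python
def rrf_fuse(rankings, k=60, top_k=10):
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); absent documents add nothing.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score, breaking ties by id for a stable order.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], str(item[0])))
    return ordered[:top_k]


dense_ranking = ["a", "b", "c"]   # result ids from the dense search
sparse_ranking = ["c", "a", "d"]  # result ids from the BM25 search
fused = rrf_fuse([dense_ranking, sparse_ranking])
print([doc_id for doc_id, _ in fused])  # ['a', 'c', 'b', 'd']
```

Documents `a` and `c` appear in both lists and therefore outrank `b` and `d`, which each appear in only one. Only ranks enter the score, which is why the dense and sparse scores never need to be put on a common scale.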

Acknowledgments

This is a fork of qdrant/mcp-server-qdrant. All credit for the original implementation goes to the Qdrant team.

License

Apache License 2.0 — see LICENSE for details.
