# mcp-server-qdrant: Hybrid Search Fork

> Forked from [qdrant/mcp-server-qdrant](https://github.com/qdrant/mcp-server-qdrant) — the official MCP server for Qdrant.

An [MCP](https://modelcontextprotocol.io/introduction) server for the [Qdrant](https://qdrant.tech/) vector search engine that acts as a **semantic memory layer** for LLM applications.
This fork adds two features on top of the upstream:

1. **Hybrid Search** — combines dense (semantic) and sparse (BM25 keyword) vectors using Reciprocal Rank Fusion for significantly better recall
2. **Project Tagging** — automatic `project` metadata on every stored memory, with a payload index for efficient filtering

Everything else remains fully compatible with the upstream.
---

## What's Different in This Fork
### Hybrid Search (Dense + BM25 Sparse with RRF)

The upstream server uses **dense vectors only** (semantic similarity). This works well for paraphrased queries but can miss results when the user searches for exact terms, names, or identifiers.

This fork adds **BM25 sparse vectors** alongside the dense ones. At query time, both vector spaces are searched independently and the results are fused using **Reciprocal Rank Fusion (RRF)** — a proven technique that combines rankings without requiring score calibration.

**How it works:**
```
Store:  document → [dense embedding] + [BM25 sparse embedding] → Qdrant
Search: query → prefetch(dense, top-k) + prefetch(BM25, top-k) → RRF fusion → final results
```
- Dense vectors capture **semantic meaning** (synonyms, paraphrases, context)
- BM25 sparse vectors excel at **exact keyword matching** (names, IDs, error codes)
- RRF fusion gives you the best of both worlds
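The fusion step itself is easy to sketch. Below is a minimal, illustrative implementation of RRF — not this server's actual code (Qdrant performs the fusion server-side), just the formula it applies:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank(d)).

    `k` dampens the influence of top ranks; 60 is the value used in the
    original RRF paper and is a common default.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

dense_hits = ["doc_a", "doc_b", "doc_c"]    # ranking from the dense (semantic) search
sparse_hits = ["doc_c", "doc_a", "doc_d"]   # ranking from the BM25 (keyword) search
print(rrf_fuse([dense_hits, sparse_hits]))
```

Documents that appear near the top of both lists (`doc_a`, `doc_c`) outrank documents found by only one search, without ever comparing the two engines' raw scores.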
**Enable it** with a single environment variable:

```bash
HYBRID_SEARCH=true
```

> [!NOTE]
> Hybrid search uses the `Qdrant/bm25` model from [FastEmbed](https://qdrant.github.io/fastembed/) for sparse embeddings. The model is downloaded automatically on first use (~50 MB). The IDF modifier is applied to upweight rare terms in the corpus.
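To see why the IDF modifier matters, here is a BM25-style IDF weight in miniature (illustrative only — FastEmbed's exact formulation may differ):

```python
import math

def idf(docs_containing_term, total_docs):
    """BM25-style inverse document frequency: rare terms get a much larger weight."""
    return math.log(1 + (total_docs - docs_containing_term + 0.5) / (docs_containing_term + 0.5))

# In a 1000-document corpus, a rare identifier dominates a near-ubiquitous word:
print(idf(3, 1000))    # large weight for a term in only 3 documents
print(idf(800, 1000))  # small weight for a term in 800 documents
```

This is what lets a query containing a rare error code or function name surface exact matches that a purely semantic search might rank low.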
> [!IMPORTANT]
> Enabling hybrid search on an existing collection requires re-creating it, as the sparse vector configuration must be set at collection creation time. Back up your data before switching.

### Project Tagging
The `qdrant-store` tool now accepts a `project` parameter (default: `"global"`). This value is automatically injected into the metadata of every stored record and indexed as a keyword field for efficient filtering.

This is useful when multiple projects share the same Qdrant collection — you can tag memories with the project name and filter by it later.

```
qdrant-store(information="...", project="my-project")
→ metadata: {"project": "my-project", ...}
```
A payload index on `metadata.project` is created automatically when the collection is first set up.
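Conceptually, the injection is just a metadata merge performed before each record is stored. A sketch of that behavior (the function name is hypothetical, not this fork's actual code):

```python
def tag_with_project(metadata=None, project="global"):
    """Merge the project tag into the payload metadata stored with each memory."""
    merged = dict(metadata or {})  # never mutate the caller's dict
    merged["project"] = project
    return merged

# User-supplied metadata is preserved; the project tag is added alongside it:
print(tag_with_project({"code": "def f(): ..."}, project="my-project"))
```

A `qdrant-find` call can then filter on `metadata.project` to scope retrieval to one project, or to `"global"` for shared knowledge.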
---

## Tools
### `qdrant-store`

Store information in the Qdrant database.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `information` | string | yes | Text to store |
| `project` | string | no | Project name to tag this memory with. Default: `"global"`. Use the project name (e.g. `"my-app"`) for project-specific knowledge, or `"global"` for cross-project knowledge. |
| `metadata` | JSON | no | Extra metadata stored alongside the information |
| `collection_name` | string | depends | Collection name. Required if no default is configured. |
### `qdrant-find`

Retrieve relevant information from the Qdrant database.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `query` | string | yes | What to search for |
| `collection_name` | string | depends | Collection name. Required if no default is configured. |

## Environment Variables
The server is configured via environment variables:

| Name | Description | Default |
|------|-------------|---------|
| `QDRANT_URL` | URL of the Qdrant server | None |
| `QDRANT_API_KEY` | API key for the Qdrant server | None |
| `QDRANT_LOCAL_PATH` | Path to local Qdrant database (alternative to `QDRANT_URL`) | None |
| `COLLECTION_NAME` | Default collection name | None |
| `EMBEDDING_PROVIDER` | Embedding provider (currently only `fastembed`) | `fastembed` |
| `EMBEDDING_MODEL` | Embedding model name | `sentence-transformers/all-MiniLM-L6-v2` |
| **`HYBRID_SEARCH`** | **Enable hybrid search (dense + BM25 sparse with RRF)** | **`false`** |
| `QDRANT_SEARCH_LIMIT` | Maximum number of results per search | `10` |
| `QDRANT_READ_ONLY` | Disable write operations (store tool) | `false` |
| `QDRANT_ALLOW_ARBITRARY_FILTER` | Allow arbitrary filter objects in find queries | `false` |
| `TOOL_STORE_DESCRIPTION` | Custom description for the store tool | See [`settings.py`](src/mcp_server_qdrant/settings.py) |
| `TOOL_FIND_DESCRIPTION` | Custom description for the find tool | See [`settings.py`](src/mcp_server_qdrant/settings.py) |

> [!IMPORTANT]
> You cannot provide both `QDRANT_URL` and `QDRANT_LOCAL_PATH` at the same time.
### FastMCP Environment Variables

Since `mcp-server-qdrant` is based on FastMCP, it also supports all FastMCP environment variables:

| Name | Description | Default |
|------|-------------|---------|
| `FASTMCP_DEBUG` | Enable debug mode | `false` |
| `FASTMCP_LOG_LEVEL` | Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL) | `INFO` |
| `FASTMCP_HOST` | Host address to bind to | `127.0.0.1` |
| `FASTMCP_PORT` | Port to run the server on | `8000` |
## Installation

### Using uvx
No installation needed with [`uvx`](https://docs.astral.sh/uv/guides/tools/#running-tools):

```shell
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
HYBRID_SEARCH=true \
uvx mcp-server-qdrant
```
#### Transport Protocols

The transport protocol is selected with the `--transport` flag:

```shell
# SSE transport (for remote clients)
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
HYBRID_SEARCH=true \
uvx mcp-server-qdrant --transport sse
```
Supported transports:

- `stdio` (default) — for local MCP clients
- `sse` — Server-Sent Events, for remote clients
- `streamable-http` — streamable HTTP, a newer alternative to SSE

When using SSE, the server listens on port 8000 by default; set `FASTMCP_PORT` to change it.
### Using Docker

A Dockerfile is available for building and running the MCP server:

```bash
# Build the container
docker build -t mcp-server-qdrant .

# Run the container
docker run -p 8000:8000 \
  -e FASTMCP_HOST="0.0.0.0" \
  -e QDRANT_URL="http://your-qdrant-server:6333" \
  -e QDRANT_API_KEY="your-api-key" \
  -e COLLECTION_NAME="your-collection" \
  -e HYBRID_SEARCH=true \
  mcp-server-qdrant
```
> [!TIP]
> We set `FASTMCP_HOST="0.0.0.0"` so the server listens on all network interfaces, which is necessary when running inside a Docker container.
### Installing via Smithery

To install automatically for Claude Desktop via [Smithery](https://smithery.ai/protocol/mcp-server-qdrant):

```bash
npx @smithery/cli install mcp-server-qdrant --client claude
```
## Usage with MCP Clients

### Claude Desktop

Add to the `mcpServers` section of your `claude_desktop_config.json`:

```json
{
  "qdrant": {
    "command": "uvx",
    "args": ["mcp-server-qdrant"],
    "env": {
      "QDRANT_URL": "https://your-qdrant-instance:6333",
      "QDRANT_API_KEY": "your_api_key",
      "COLLECTION_NAME": "your-collection",
      "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2",
      "HYBRID_SEARCH": "true"
    }
  }
}
```
### Claude Code

```shell
claude mcp add qdrant-memory \
  -e QDRANT_URL="http://localhost:6333" \
  -e COLLECTION_NAME="my-memory" \
  -e HYBRID_SEARCH="true" \
  -- uvx mcp-server-qdrant
```

Verify:

```shell
claude mcp list
```
### Cursor / Windsurf

Run the server with SSE transport and custom tool descriptions for code search:

```bash
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="code-snippets" \
HYBRID_SEARCH=true \
TOOL_STORE_DESCRIPTION="Store reusable code snippets for later retrieval. \
The 'information' parameter should contain a natural language description of what the code does, \
while the actual code should be included in the 'metadata' parameter as a 'code' property." \
TOOL_FIND_DESCRIPTION="Search for relevant code snippets based on natural language descriptions." \
uvx mcp-server-qdrant --transport sse
```

Then point Cursor/Windsurf to `http://localhost:8000/sse`, as described in the [Cursor documentation](https://docs.cursor.com/context/model-context-protocol#adding-an-mcp-server-to-cursor).
### VS Code

For one-click installation, click one of the install buttons below:

[](https://insiders.vscode.dev/redirect/mcp/install?name=qdrant&config=%7B%22command%22%3A%22uvx%22%2C%22args%22%3A%5B%22mcp-server-qdrant%22%5D%2C%22env%22%3A%7B%22QDRANT_URL%22%3A%22%24%7Binput%3AqdrantUrl%7D%22%2C%22QDRANT_API_KEY%22%3A%22%24%7Binput%3AqdrantApiKey%7D%22%2C%22COLLECTION_NAME%22%3A%22%24%7Binput%3AcollectionName%7D%22%7D%7D&inputs=%5B%7B%22type%22%3A%22promptString%22%2C%22id%22%3A%22qdrantUrl%22%2C%22description%22%3A%22Qdrant+URL%22%7D%2C%7B%22type%22%3A%22promptString%22%2C%22id%22%3A%22qdrantApiKey%22%2C%22description%22%3A%22Qdrant+API+Key%22%2C%22password%22%3Atrue%7D%2C%7B%22type%22%3A%22promptString%22%2C%22id%22%3A%22collectionName%22%2C%22description%22%3A%22Collection+Name%22%7D%5D) [](https://insiders.vscode.dev/redirect/mcp/install?name=qdrant&config=%7B%22command%22%3A%22uvx%22%2C%22args%22%3A%5B%22mcp-server-qdrant%22%5D%2C%22env%22%3A%7B%22QDRANT_URL%22%3A%22%24%7Binput%3AqdrantUrl%7D%22%2C%22QDRANT_API_KEY%22%3A%22%24%7Binput%3AqdrantApiKey%7D%22%2C%22COLLECTION_NAME%22%3A%22%24%7Binput%3AcollectionName%7D%22%7D%7D&inputs=%5B%7B%22type%22%3A%22promptString%22%2C%22id%22%3A%22qdrantUrl%22%2C%22description%22%3A%22Qdrant+URL%22%7D%2C%7B%22type%22%3A%22promptString%22%2C%22id%22%3A%22qdrantApiKey%22%2C%22description%22%3A%22Qdrant+API+Key%22%2C%22password%22%3Atrue%7D%2C%7B%22type%22%3A%22promptString%22%2C%22id%22%3A%22collectionName%22%2C%22description%22%3A%22Collection+Name%22%7D%5D&quality=insiders)

Or add manually to VS Code settings (`Ctrl+Shift+P` → `Preferences: Open User Settings (JSON)`):

```json
{
  "mcp": {
    "inputs": [
      {"type": "promptString", "id": "qdrantUrl", "description": "Qdrant URL"},
      {"type": "promptString", "id": "qdrantApiKey", "description": "Qdrant API Key", "password": true},
      {"type": "promptString", "id": "collectionName", "description": "Collection Name"}
    ],
    "servers": {
      "qdrant": {
        "command": "uvx",
        "args": ["mcp-server-qdrant"],
        "env": {
          "QDRANT_URL": "${input:qdrantUrl}",
          "QDRANT_API_KEY": "${input:qdrantApiKey}",
          "COLLECTION_NAME": "${input:collectionName}",
          "HYBRID_SEARCH": "true"
        }
      }
    }
  }
}
```

## Development

Run in development mode with the MCP inspector:

```shell
COLLECTION_NAME=mcp-dev fastmcp dev src/mcp_server_qdrant/server.py
```
## Contributing
|
|
||||||
|
|
||||||
If you have suggestions for how mcp-server-qdrant could be improved, or want to report a bug, open an issue!
|
|
||||||
We'd love all and any contributions.
|
|
||||||
|
|
||||||
### Testing `mcp-server-qdrant` locally
|
|
||||||
|
|
||||||
The [MCP inspector](https://github.com/modelcontextprotocol/inspector) is a developer tool for testing and debugging MCP
|
|
||||||
servers. It runs both a client UI (default port 5173) and an MCP proxy server (default port 3000). Open the client UI in
|
|
||||||
your browser to use the inspector.
|
|
||||||
|
|
||||||
```shell
|
```shell
|
||||||
QDRANT_URL=":memory:" COLLECTION_NAME="test" \
|
COLLECTION_NAME=mcp-dev HYBRID_SEARCH=true \
|
||||||
fastmcp dev src/mcp_server_qdrant/server.py
|
fastmcp dev src/mcp_server_qdrant/server.py
|
||||||
```
|
```
|
||||||
|
|
||||||
Once started, open your browser to http://localhost:5173 to access the inspector interface.
|
Open http://localhost:5173 to access the inspector.
|
||||||
|
|
||||||
+
+## How Hybrid Search Works Under the Hood
+
+When `HYBRID_SEARCH=true`:
+
+**Storing:**
+
+1. The document is embedded with the dense model (e.g. `all-MiniLM-L6-v2`) → semantic vector
+2. The document is also embedded with `Qdrant/bm25` → sparse vector (term frequencies with IDF)
+3. Both vectors are stored in the same Qdrant point
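To make the sparse side of step 2 concrete, here is a toy sketch of what a BM25-style sparse vector looks like. The whitespace tokenization and hash-based indices are simplifications invented for illustration; the real `Qdrant/bm25` model handles tokenization, stemming, and IDF weighting.

```python
from collections import Counter


def sparse_embed(text: str, dim: int = 2**20) -> dict:
    """Toy BM25-style sparse vector: hashed token index -> term frequency.

    Illustrative only: the real Qdrant/bm25 model also applies stemming
    and IDF weighting, so its values are not raw term frequencies.
    """
    counts = Counter(text.lower().split())
    # One (index, value) pair per distinct token, sorted by index.
    pairs = sorted((hash(tok) % dim, float(tf)) for tok, tf in counts.items())
    return {
        "indices": [i for i, _ in pairs],
        "values": [v for _, v in pairs],
    }


vec = sparse_embed("hybrid search beats dense search")
# "search" occurs twice, so one entry carries value 2.0; the rest are 1.0
```

Most positions in the vector are zero, which is why only the non-zero `indices`/`values` pairs are stored on the Qdrant point.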
+
+**Searching:**
+1. The query is embedded with both models
+2. Two independent prefetch queries run in parallel:
+   - Dense vector search (cosine similarity)
+   - BM25 sparse vector search (dot product with IDF weighting)
+3. Results are fused using **Reciprocal Rank Fusion**: `score = 1/(k + rank_dense) + 1/(k + rank_sparse)`
+4. Top-k fused results are returned
+
+This approach is battle-tested in information retrieval and consistently outperforms either method alone, especially for queries that mix natural language with specific terms.
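The fusion formula above takes only a few lines of plain Python. This is an illustrative sketch with hypothetical document ids and the commonly used constant `k = 60`; in this server the fusion actually runs inside Qdrant via `models.FusionQuery`.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: every hit contributes 1/(k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)


dense_hits = ["a", "b", "c"]   # ranked by cosine similarity
sparse_hits = ["b", "d", "a"]  # ranked by BM25 score
fused = rrf_fuse([dense_hits, sparse_hits])
# → ["b", "a", "d", "c"]: "b" (2nd dense + 1st sparse) edges out "a" (1st + 3rd)
```

Because only ranks matter, RRF needs no score normalization between the two searches, which is exactly why it is a good fit for mixing cosine scores with BM25 scores.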
+
+## Acknowledgments
+
+This is a fork of [qdrant/mcp-server-qdrant](https://github.com/qdrant/mcp-server-qdrant). All credit for the original implementation goes to the Qdrant team.
+
 ## License
 
-This MCP server is licensed under the Apache License 2.0. This means you are free to use, modify, and distribute the
-software, subject to the terms and conditions of the Apache License 2.0. For more details, please see the LICENSE file
-in the project repository.
+Apache License 2.0 — see [LICENSE](LICENSE) for details.
@@ -1,4 +1,13 @@
 from abc import ABC, abstractmethod
+from dataclasses import dataclass
+
+
+@dataclass
+class SparseVector:
+    """A sparse vector representation with indices and values."""
+
+    indices: list[int]
+    values: list[float]
 
 
 class EmbeddingProvider(ABC):
@@ -23,3 +32,15 @@ class EmbeddingProvider(ABC):
     def get_vector_size(self) -> int:
         """Get the size of the vector for the Qdrant collection."""
         pass
+
+    def supports_sparse(self) -> bool:
+        """Whether this provider supports sparse (BM25) embeddings."""
+        return False
+
+    async def embed_documents_sparse(self, documents: list[str]) -> list[SparseVector]:
+        """Embed documents into sparse vectors. Override if supports_sparse() is True."""
+        raise NotImplementedError
+
+    async def embed_query_sparse(self, query: str) -> SparseVector:
+        """Embed a query into a sparse vector. Override if supports_sparse() is True."""
+        raise NotImplementedError
@@ -3,15 +3,18 @@ from mcp_server_qdrant.embeddings.types import EmbeddingProviderType
 from mcp_server_qdrant.settings import EmbeddingProviderSettings
 
 
-def create_embedding_provider(settings: EmbeddingProviderSettings) -> EmbeddingProvider:
+def create_embedding_provider(
+    settings: EmbeddingProviderSettings, enable_sparse: bool = False
+) -> EmbeddingProvider:
     """
     Create an embedding provider based on the specified type.
     :param settings: The settings for the embedding provider.
+    :param enable_sparse: Whether to enable sparse (BM25) embeddings.
     :return: An instance of the specified embedding provider.
     """
     if settings.provider_type == EmbeddingProviderType.FASTEMBED:
         from mcp_server_qdrant.embeddings.fastembed import FastEmbedProvider
 
-        return FastEmbedProvider(settings.model_name)
+        return FastEmbedProvider(settings.model_name, enable_sparse=enable_sparse)
     else:
         raise ValueError(f"Unsupported embedding provider: {settings.provider_type}")
@@ -1,24 +1,31 @@
 import asyncio
 
-from fastembed import TextEmbedding
+from fastembed import SparseTextEmbedding, TextEmbedding
 from fastembed.common.model_description import DenseModelDescription
 
-from mcp_server_qdrant.embeddings.base import EmbeddingProvider
+from mcp_server_qdrant.embeddings.base import EmbeddingProvider, SparseVector
 
 
 class FastEmbedProvider(EmbeddingProvider):
     """
     FastEmbed implementation of the embedding provider.
     :param model_name: The name of the FastEmbed model to use.
+    :param enable_sparse: Whether to enable BM25 sparse embeddings for hybrid search.
     """
 
-    def __init__(self, model_name: str):
+    def __init__(self, model_name: str, enable_sparse: bool = False):
         self.model_name = model_name
         self.embedding_model = TextEmbedding(model_name)
+        self._enable_sparse = enable_sparse
+        self._sparse_model = None
+        if enable_sparse:
+            self._sparse_model = SparseTextEmbedding("Qdrant/bm25")
+
+    def supports_sparse(self) -> bool:
+        return self._enable_sparse and self._sparse_model is not None
 
     async def embed_documents(self, documents: list[str]) -> list[list[float]]:
         """Embed a list of documents into vectors."""
-        # Run in a thread pool since FastEmbed is synchronous
         loop = asyncio.get_event_loop()
         embeddings = await loop.run_in_executor(
             None, lambda: list(self.embedding_model.passage_embed(documents))
@@ -27,13 +34,37 @@ class FastEmbedProvider(EmbeddingProvider):
 
     async def embed_query(self, query: str) -> list[float]:
         """Embed a query into a vector."""
-        # Run in a thread pool since FastEmbed is synchronous
         loop = asyncio.get_event_loop()
         embeddings = await loop.run_in_executor(
             None, lambda: list(self.embedding_model.query_embed([query]))
         )
         return embeddings[0].tolist()
 
+    async def embed_documents_sparse(self, documents: list[str]) -> list[SparseVector]:
+        """Embed documents into BM25 sparse vectors."""
+        loop = asyncio.get_event_loop()
+        results = await loop.run_in_executor(
+            None, lambda: list(self._sparse_model.passage_embed(documents))
+        )
+        return [
+            SparseVector(
+                indices=r.indices.tolist(),
+                values=r.values.tolist(),
+            )
+            for r in results
+        ]
+
+    async def embed_query_sparse(self, query: str) -> SparseVector:
+        """Embed a query into a BM25 sparse vector."""
+        loop = asyncio.get_event_loop()
+        results = await loop.run_in_executor(
+            None, lambda: list(self._sparse_model.query_embed([query]))
+        )
+        return SparseVector(
+            indices=results[0].indices.tolist(),
+            values=results[0].values.tolist(),
+        )
+
     def get_vector_name(self) -> str:
         """
         Return the name of the vector for the Qdrant collection.
@@ -57,7 +57,8 @@ class QdrantMCPServer(FastMCP):
         if embedding_provider_settings:
             self.embedding_provider_settings = embedding_provider_settings
             self.embedding_provider = create_embedding_provider(
-                embedding_provider_settings
+                embedding_provider_settings,
+                enable_sparse=qdrant_settings.hybrid_search,
             )
         else:
             self.embedding_provider_settings = None
@@ -72,6 +73,7 @@ class QdrantMCPServer(FastMCP):
             self.embedding_provider,
             qdrant_settings.local_path,
             make_indexes(qdrant_settings.filterable_fields_dict()),
+            hybrid_search=qdrant_settings.hybrid_search,
         )
 
         super().__init__(name=name, instructions=instructions, **settings)
@@ -96,6 +98,13 @@ class QdrantMCPServer(FastMCP):
             collection_name: Annotated[
                 str, Field(description="The collection to store the information in")
             ],
+            project: Annotated[
+                str,
+                Field(
+                    description="Project name, e.g. devops, stereo-hysteria, voice-assistant. "
+                    "Use 'global' for cross-project knowledge (servers, network, user preferences)."
+                ),
+            ] = "global",
             # The `metadata` parameter is defined as non-optional, but it can be None.
             # If we set it to be optional, some of the MCP clients, like Cursor, cannot
             # handle the optional parameter correctly.
@@ -110,12 +119,17 @@ class QdrantMCPServer(FastMCP):
             Store some information in Qdrant.
             :param ctx: The context for the request.
             :param information: The information to store.
+            :param project: The project name to tag this memory with.
             :param metadata: JSON metadata to store with the information, optional.
             :param collection_name: The name of the collection to store the information in, optional. If not provided,
                                     the default collection is used.
             :return: A message indicating that the information was stored.
             """
-            await ctx.debug(f"Storing information {information} in Qdrant")
+            await ctx.debug(f"Storing information {information} in Qdrant (project={project})")
+
+            if metadata is None:
+                metadata = {}
+            metadata["project"] = project
+
             entry = Entry(content=information, metadata=metadata)
@@ -23,6 +23,9 @@ class Entry(BaseModel):
     metadata: Metadata | None = None
 
 
+SPARSE_VECTOR_NAME = "bm25"
+
+
 class QdrantConnector:
     """
     Encapsulates the connection to a Qdrant server and all the methods to interact with it.
@@ -32,6 +35,7 @@ class QdrantConnector:
     the collection name to be provided.
     :param embedding_provider: The embedding provider to use.
    :param qdrant_local_path: The path to the storage directory for the Qdrant client, if local mode is used.
+    :param hybrid_search: Whether to enable hybrid search (dense + BM25 sparse vectors with RRF).
     """
 
     def __init__(
@@ -42,15 +46,19 @@ class QdrantConnector:
         embedding_provider: EmbeddingProvider,
         qdrant_local_path: str | None = None,
         field_indexes: dict[str, models.PayloadSchemaType] | None = None,
+        hybrid_search: bool = False,
     ):
         self._qdrant_url = qdrant_url.rstrip("/") if qdrant_url else None
         self._qdrant_api_key = qdrant_api_key
         self._default_collection_name = collection_name
         self._embedding_provider = embedding_provider
+        self._hybrid_search = hybrid_search and embedding_provider.supports_sparse()
         self._client = AsyncQdrantClient(
             location=qdrant_url, api_key=qdrant_api_key, path=qdrant_local_path
         )
         self._field_indexes = field_indexes
+        if self._hybrid_search:
+            logger.info("Hybrid search enabled (dense + BM25 sparse vectors with RRF)")
 
     async def get_collection_names(self) -> list[str]:
         """
@@ -72,19 +80,30 @@ class QdrantConnector:
         await self._ensure_collection_exists(collection_name)
 
         # Embed the document
-        # ToDo: instead of embedding text explicitly, use `models.Document`,
-        # it should unlock usage of server-side inference.
         embeddings = await self._embedding_provider.embed_documents([entry.content])
 
-        # Add to Qdrant
+        # Build vector dict
         vector_name = self._embedding_provider.get_vector_name()
+        vector_data: dict = {vector_name: embeddings[0]}
+
+        # Add sparse vector if hybrid search is enabled
+        if self._hybrid_search:
+            sparse_embeddings = await self._embedding_provider.embed_documents_sparse(
+                [entry.content]
+            )
+            sparse = sparse_embeddings[0]
+            vector_data[SPARSE_VECTOR_NAME] = models.SparseVector(
+                indices=sparse.indices, values=sparse.values
+            )
+
+        # Add to Qdrant
         payload = {"document": entry.content, METADATA_PATH: entry.metadata}
         await self._client.upsert(
             collection_name=collection_name,
             points=[
                 models.PointStruct(
                     id=uuid.uuid4().hex,
-                    vector={vector_name: embeddings[0]},
+                    vector=vector_data,
                     payload=payload,
                 )
             ],
@@ -113,21 +132,43 @@ class QdrantConnector:
         if not collection_exists:
             return []
 
-        # Embed the query
-        # ToDo: instead of embedding text explicitly, use `models.Document`,
-        # it should unlock usage of server-side inference.
         query_vector = await self._embedding_provider.embed_query(query)
         vector_name = self._embedding_provider.get_vector_name()
 
-        # Search in Qdrant
-        search_results = await self._client.query_points(
-            collection_name=collection_name,
-            query=query_vector,
-            using=vector_name,
-            limit=limit,
-            query_filter=query_filter,
-        )
+        # Hybrid search: prefetch dense + sparse, fuse with RRF
+        if self._hybrid_search:
+            sparse_vector = await self._embedding_provider.embed_query_sparse(query)
+            search_results = await self._client.query_points(
+                collection_name=collection_name,
+                prefetch=[
+                    models.Prefetch(
+                        query=query_vector,
+                        using=vector_name,
+                        limit=limit,
+                        filter=query_filter,
+                    ),
+                    models.Prefetch(
+                        query=models.SparseVector(
+                            indices=sparse_vector.indices,
+                            values=sparse_vector.values,
+                        ),
+                        using=SPARSE_VECTOR_NAME,
+                        limit=limit,
+                        filter=query_filter,
+                    ),
+                ],
+                query=models.FusionQuery(fusion=models.Fusion.RRF),
+                limit=limit,
+            )
+        else:
+            # Dense-only search (original behavior)
+            search_results = await self._client.query_points(
+                collection_name=collection_name,
+                query=query_vector,
+                using=vector_name,
+                limit=limit,
+                query_filter=query_filter,
+            )
 
         return [
             Entry(
@@ -149,6 +190,16 @@ class QdrantConnector:
 
         # Use the vector name as defined in the embedding provider
         vector_name = self._embedding_provider.get_vector_name()
+
+        # Sparse vectors config for hybrid search (BM25)
+        sparse_config = None
+        if self._hybrid_search:
+            sparse_config = {
+                SPARSE_VECTOR_NAME: models.SparseVectorParams(
+                    modifier=models.Modifier.IDF,
+                )
+            }
+
         await self._client.create_collection(
             collection_name=collection_name,
             vectors_config={
@@ -157,10 +208,17 @@ class QdrantConnector:
                     distance=models.Distance.COSINE,
                 )
             },
+            sparse_vectors_config=sparse_config,
+        )
+
+        # Always index metadata.project for efficient filtering
+        await self._client.create_payload_index(
+            collection_name=collection_name,
+            field_name="metadata.project",
+            field_schema=models.PayloadSchemaType.KEYWORD,
         )
 
         # Create payload indexes if configured
         if self._field_indexes:
             for field_name, field_type in self._field_indexes.items():
                 await self._client.create_payload_index(
@@ -78,6 +78,7 @@ class QdrantSettings(BaseSettings):
 
     location: str | None = Field(default=None, validation_alias="QDRANT_URL")
     api_key: str | None = Field(default=None, validation_alias="QDRANT_API_KEY")
+    hybrid_search: bool = Field(default=False, validation_alias="HYBRID_SEARCH")
     collection_name: str | None = Field(
         default=None, validation_alias="COLLECTION_NAME"
     )