vector-search

Vector search via embeddings_* (large-scale HNSW) and ruvllm_hnsw_* (WASM router for ≤11 hot patterns), with RaBitQ 1-bit quantization for 32× memory reduction

Stars: 56,726
Source: ruvnet/claude-flow
Updated: 2026-05-31
Slug: ruvnet--claude-flow--vector-search

// install — copy + paste into any project

mkdir -p .claude/skills && curl -fsSL https://raw.githubusercontent.com/ruvnet/claude-flow/HEAD/plugins/ruflo-agentdb/skills/vector-search/SKILL.md -o .claude/skills/vector-search.md

Drops the SKILL.md into .claude/skills/vector-search.md. Works with Claude Code, Cursor, and any agent that loads SKILL.md files from .claude/skills/.

Vector Search

Two distinct vector-search paths live in this plugin. Pick the right one — they're not interchangeable.

| Path | Tool family | Backing | Capacity | Latency |
|------|-------------|---------|----------|---------|
| Large-scale corpus | embeddings_* | @claude-flow/memory HNSW (Rust/Native) | up to millions of vectors | 150×–12,500× faster than brute-force, depending on N and parameters |
| Hot-path router | ruvllm_hnsw_* | WASM-backed router (v2.0.1) | ~11 patterns max (ruvllm-tools.ts:58) | sub-ms; designed for high-priority routing, not corpus search |

The "12,500×" headline applies to the large-scale embeddings_search path. The WASM router is not that path.

When to use

| Need | Path |
|------|------|
| Search a corpus of N ≥ 500 documents | embeddings_search |
| Memory-constrained corpus (≥5,000 vectors) | RaBitQ quantized (see "Quantized search" below) |
| Compare two strings | embeddings_compare |
| Hierarchical / taxonomic data | embeddings_hyperbolic (Poincaré ball) |
| Route a query to one of ≤11 hot patterns | ruvllm_hnsw_route |
| Cross-namespace search | memory_search_unified |
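
The selection rules above can be written down as a small decision helper. This is a minimal sketch, not part of the plugin: the `Need` type, its field names, and `choose_tool` are hypothetical, and the fallback for corpora smaller than 500 documents is an assumption, since the table does not cover that case.

```python
from dataclasses import dataclass


@dataclass
class Need:
    """Describes a search need. Every field name here is illustrative."""
    kind: str                        # "corpus", "compare", "hierarchy", "route", "cross_namespace"
    n_items: int = 0                 # corpus size, or number of hot patterns for "route"
    memory_constrained: bool = False


def choose_tool(need: Need) -> str:
    """Return the tool name that the table above suggests for this need."""
    if need.kind == "compare":
        return "embeddings_compare"
    if need.kind == "hierarchy":
        return "embeddings_hyperbolic"
    if need.kind == "cross_namespace":
        return "memory_search_unified"
    if need.kind == "route":
        if need.n_items > 11:
            raise ValueError("the WASM router holds about 11 patterns; use a corpus index")
        return "ruvllm_hnsw_route"
    if need.kind == "corpus":
        if need.memory_constrained and need.n_items >= 5000:
            return "embeddings_rabitq_search"
        # Assumption: corpora below the N >= 500 row still go to embeddings_search.
        return "embeddings_search"
    raise ValueError(f"unknown need: {need.kind}")


print(choose_tool(Need("corpus", n_items=20_000, memory_constrained=True)))
print(choose_tool(Need("route", n_items=8)))
```

Raising on more than 11 route patterns, rather than silently falling back, keeps the two paths from being confused with each other.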

Standard search

  1. Check status: mcp__claude-flow__embeddings_status to verify the embedding engine.
  2. Initialize: mcp__claude-flow__embeddings_init if not active.
  3. Generate: mcp__claude-flow__embeddings_generate for text input.
  4. Search: mcp__claude-flow__embeddings_search with the query.
  5. Compare: mcp__claude-flow__embeddings_compare to measure similarity.
  6. Unified search: mcp__claude-flow__memory_search_unified for cross-namespace.

Quantized search (32× memory reduction)

For corpora ≥5,000 vectors and/or memory-constrained environments, use the RaBitQ 1-bit quantization workflow. Below 5,000 vectors the rebuild cost outweighs the savings — use the standard path instead.
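
The 32× figure follows from replacing each 32-bit float component with a single bit. The arithmetic can be checked with a short sketch; the helper name `index_bytes` and the 384-dimension embedding size are assumptions for illustration, not values taken from the plugin.

```python
def index_bytes(n_vectors: int, dim: int, bits_per_component: int) -> int:
    """Raw vector storage in bytes, ignoring graph links and metadata overhead."""
    total_bits = n_vectors * dim * bits_per_component
    return (total_bits + 7) // 8  # round up to whole bytes


n, dim = 10_000, 384              # dim=384 is an assumed embedding size
full = index_bytes(n, dim, 32)    # float32 components
packed = index_bytes(n, dim, 1)   # 1-bit codes
print(full, packed, full // packed)  # 15360000 480000 32
```

Any per-vector side data that an implementation keeps next to the bit codes would lower the effective ratio somewhat.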

| Step | Tool | Purpose |
|------|------|---------|
| 1 | embeddings_init | Engine warm |
| 2 | embeddings_rabitq_build | One-time build of the 1-bit index after corpus is loaded |
| 3 | embeddings_rabitq_search | Hamming-prefilter returns top-N candidate IDs (cheap) |
| 4 | embeddings_search | Optional exact rerank on the candidate set (full-precision) |
| 5 | embeddings_rabitq_status | Index health, memory footprint, build time |

Note: embeddings_rabitq_search returns candidate IDs only — the rerank in step 4 is the user's responsibility (mirrors the docstring at embeddings-tools.ts:911). Without rerank, results are approximate; with rerank, you get full-precision quality at 32× lower memory.
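
The two-stage shape (cheap Hamming prefilter, then exact rerank over the candidates) can be illustrated in plain Python. This sketch uses simple sign-bit codes as a stand-in for RaBitQ, which is a more elaborate quantizer; none of the function names below are plugin APIs, and the vectors are toy data.

```python
import math


def sign_code(vec):
    """Pack the sign of each component into one integer, one bit per dimension."""
    code = 0
    for i, x in enumerate(vec):
        if x >= 0:
            code |= 1 << i
    return code


def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


def cosine(a, b):
    """Full-precision cosine similarity (inputs must be non-zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def prefilter(query, codes, n_candidates):
    """Cheap stage: rank IDs by Hamming distance on the 1-bit codes."""
    q = sign_code(query)
    ranked = sorted(codes, key=lambda doc_id: hamming(q, codes[doc_id]))
    return ranked[:n_candidates]


def rerank(query, vectors, candidate_ids, k):
    """Exact stage: full-precision cosine over the candidate set only."""
    scored = sorted(candidate_ids, key=lambda doc_id: cosine(query, vectors[doc_id]), reverse=True)
    return scored[:k]


vectors = {
    "a": [0.9, 0.1, -0.2, 0.4],
    "b": [0.8, 0.3, -0.1, 0.5],
    "c": [-0.7, -0.2, 0.6, -0.3],
    "d": [0.1, -0.9, 0.3, 0.2],
}
codes = {doc_id: sign_code(v) for doc_id, v in vectors.items()}  # built once per corpus
query = [1.0, 0.2, -0.3, 0.5]

candidates = prefilter(query, codes, n_candidates=3)
top = rerank(query, vectors, candidates, k=2)
print(candidates, top)
```

In the toy data, "a" and "b" share the query's sign pattern, so the prefilter cannot tell them apart; only the rerank stage orders them by true similarity.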

Tuning

HNSW exposes three knobs that trade recall against latency: efSearch, M, and efConstruction. The "12,500×" headline assumes defaults; tune deliberately for your workload:

| Profile | efSearch | M | When to use |
|---------|----------|---|-------------|
| recall-first | 200 | 32 | Pattern recall during planning; quality matters more than ms |
| balanced (default) | 64 | 16 | General-purpose semantic recall |
| latency-first | 16 | 8 | Hot-path routing where p99 latency matters |

efSearch is passed via ruvllm_hnsw_create (ruvllm-tools.ts:64). M is registry-level today; raise as a follow-up if it should be MCP-tunable. efConstruction defaults to 200 in the lite index (hnsw-index.ts:537).
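
One way to keep these profiles in code is a small lookup table. The sketch below copies the efSearch and M values from the profile table; the `HnswProfile` type, the `PROFILES` mapping, and the `pick_profile` rule (hot path wins over quality) are illustrative assumptions, not plugin configuration.

```python
from typing import NamedTuple


class HnswProfile(NamedTuple):
    ef_search: int
    m: int


# Values copied from the profile table; the dictionary itself is hypothetical.
PROFILES = {
    "recall-first": HnswProfile(ef_search=200, m=32),
    "balanced": HnswProfile(ef_search=64, m=16),
    "latency-first": HnswProfile(ef_search=16, m=8),
}


def pick_profile(hot_path: bool, quality_critical: bool) -> str:
    """Map a workload description to a profile name.

    Assumption: when both flags are set, the latency requirement wins.
    """
    if hot_path:
        return "latency-first"
    if quality_critical:
        return "recall-first"
    return "balanced"


name = pick_profile(hot_path=False, quality_critical=True)
print(name, PROFILES[name])
```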

HNSW pattern router (WASM, ≤11 patterns)

For routing a small number of high-priority patterns:

  • mcp__claude-flow__ruvllm_hnsw_create — create the WASM index (cap ~11)
  • mcp__claude-flow__ruvllm_hnsw_add — add a pattern
  • mcp__claude-flow__ruvllm_hnsw_route — route an incoming query

This is not a corpus index. Treat it as a fast classifier over a curated set of patterns.
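
To make the "fast classifier over a curated set" idea concrete, here is a minimal sketch of a capped pattern router. It uses brute-force cosine similarity rather than the WASM HNSW index, which is reasonable at this size; the `PatternRouter` class, its method names, and the toy 3-dimensional embeddings are all hypothetical.

```python
import math


def _unit(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("embedding must be non-zero")
    return [x / norm for x in vec]


class PatternRouter:
    """Brute-force cosine router over a small, capped set of patterns."""

    def __init__(self, capacity=11):
        self.capacity = capacity
        self.patterns = {}  # name -> unit-length embedding

    def add(self, name, embedding):
        # Refuse new names once the cap is reached; updating an existing name is allowed.
        if name not in self.patterns and len(self.patterns) >= self.capacity:
            raise ValueError(f"router is full ({self.capacity} patterns)")
        self.patterns[name] = _unit(embedding)

    def route(self, embedding):
        """Return (name, cosine score) of the closest registered pattern."""
        if not self.patterns:
            raise ValueError("no patterns registered")
        query = _unit(embedding)
        scores = {
            name: sum(a * b for a, b in zip(query, vec))
            for name, vec in self.patterns.items()
        }
        best = max(scores, key=scores.get)
        return best, scores[best]


router = PatternRouter()
router.add("auth", [1.0, 0.0, 0.0])      # toy embeddings, not model output
router.add("billing", [0.0, 1.0, 0.0])
router.add("search", [0.0, 0.0, 1.0])

name, score = router.route([0.9, 0.1, 0.0])
print(name, round(score, 3))
```

Enforcing the cap in `add` mirrors the point above: when the pattern set outgrows the router, the right move is the corpus path, not a bigger router.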

Hyperbolic embeddings

For hierarchical data (code trees, org charts), use mcp__claude-flow__embeddings_hyperbolic which maps to Poincaré ball space. Distance is geodesic, not cosine.
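
For reference, the standard geodesic distance in the Poincaré ball is d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). The sketch below implements that textbook formula; it is not the plugin's code, and the function name and the sample points are chosen only for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball."""
    nu = sum(x * x for x in u)
    nv = sum(x * x for x in v)
    if nu >= 1.0 or nv >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * diff / ((1.0 - nu) * (1.0 - nv)))


root = [0.0, 0.0]    # e.g. the top of a hierarchy, placed at the origin
child = [0.5, 0.0]
leaf = [0.9, 0.0]    # deeper nodes sit closer to the boundary

print(poincare_distance(root, child))
print(poincare_distance(child, leaf))
```

The step from `child` to `leaf` is shorter than the step from `root` to `child` in Euclidean terms (0.4 versus 0.5) but longer in hyperbolic terms, because distances stretch toward the boundary. That extra room near the edge is what lets tree-like data embed with low distortion.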

CLI alternative

npx @claude-flow/cli@latest embeddings search --query "authentication patterns"
npx @claude-flow/cli@latest embeddings init
npx @claude-flow/cli@latest memory search --query "your query"

Performance

| Method | Speed |
|--------|-------|
| Brute-force scan | Baseline |
| HNSW (n=500, balanced) | ~150× faster |
| HNSW (n=10,000, balanced) | ~12,500× faster |
| RaBitQ + rerank (n=10,000) | ~12,500× search speed at 32× lower memory |
| ruvllm_hnsw_route (n≤11) | sub-ms per route, fixed cost |