Similarity Search Patterns

Patterns for implementing efficient similarity search in production systems.

When to Use This Skill

Building semantic search systems
Implementing RAG retrieval
Creating recommendation engines
Optimizing search latency
Scaling to millions of vectors
Combining semantic and keyword search

Core Concepts

1. Distance Metrics

| Metric | Formula | Best For |
| -------------- | ------------------ | --------------------- |
| Cosine | 1 - (A·B)/(‖A‖‖B‖) | Normalized embeddings |
| Euclidean (L2) | √Σ(a-b)² | Raw embeddings |
| Dot Product | A·B | Magnitude matters |
| Manhattan (L1) | Σ\|a-b\| | Sparse vectors |
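The four metrics can be written out directly from their formulas. The following is a minimal pure-Python sketch for illustration only; the function names are assumptions introduced here, and a production system would normally use a vectorized library rather than Python loops.

```python
import math

def dot(a, b):
    """Dot product A·B: larger means more similar, magnitude included."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean length ‖A‖."""
    return math.sqrt(dot(a, a))

def cosine_distance(a, b):
    """1 - (A·B)/(‖A‖‖B‖): ignores magnitude, compares direction only."""
    return 1.0 - dot(a, b) / (norm(a) * norm(b))

def euclidean(a, b):
    """L2 distance √Σ(a-b)²."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """L1 distance Σ|a-b|."""
    return sum(abs(x - y) for x, y in zip(a, b))

def normalize(a):
    """Scale a vector to unit length."""
    n = norm(a)
    return [x / n for x in a]

a = [3.0, 4.0]
b = [4.0, 3.0]

print(cosine_distance(a, b))
print(euclidean(a, b))
print(dot(a, b))
print(manhattan(a, b))
```

For these two vectors the cosine distance is about 0.04, the Euclidean distance is √2, the dot product is 24, and the Manhattan distance is 2. Once vectors are normalized to unit length, the squared Euclidean distance equals twice the cosine distance, so both metrics produce the same ranking on normalized embeddings.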

2. Index Types

┌─────────────────────────────────────────────────┐
│                   Index Types                   │
├─────────────┬───────────────┬───────────────────┤
│    Flat     │     HNSW      │    IVF+PQ         │
│ (Exact)     │ (Graph-based) │ (Quantized)       │
├─────────────┼───────────────┼───────────────────┤
│ O(n) search │ O(log n)      │ O(√n)             │
│ 100% recall │ ~95-99%       │ ~90-95%           │
│ Small data  │ Medium-Large  │ Very Large        │
└─────────────┴───────────────┴───────────────────┘
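To make the exact versus approximate trade-off concrete, here is a toy sketch comparing a flat scan with a simplified IVF-style search. It is an illustration under assumptions, not a real index: the centroids are hand-picked rather than learned with k-means, there is no product quantization, and all function names are hypothetical.

```python
import heapq
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def flat_search(query, vectors, k):
    """Exact search: score every vector and keep the k nearest. O(n) per query."""
    return heapq.nsmallest(k, range(len(vectors)), key=lambda i: l2(query, vectors[i]))

def build_ivf(vectors, centroids):
    """Assign each vector id to the cell of its nearest centroid."""
    cells = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(v, centroids[c]))
        cells[nearest].append(i)
    return cells

def ivf_search(query, vectors, centroids, cells, k, nprobe):
    """Approximate search: scan only the nprobe cells closest to the query."""
    probe = heapq.nsmallest(nprobe, range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [i for c in probe for i in cells[c]]
    return heapq.nsmallest(k, candidates, key=lambda i: l2(query, vectors[i]))

vectors = [(0.0, 0.0), (0.4, 0.4), (0.9, 1.0), (1.0, 1.1), (5.0, 5.0), (5.2, 4.9)]
centroids = [(0.2, 0.2), (1.0, 1.0), (5.0, 5.0)]  # hand-picked for the demo (assumption)
cells = build_ivf(vectors, centroids)
query = (0.7, 0.7)

exact = flat_search(query, vectors, k=2)                               # [2, 1]
narrow = ivf_search(query, vectors, centroids, cells, k=2, nprobe=1)   # [2, 3]
wide = ivf_search(query, vectors, centroids, cells, k=2, nprobe=2)     # [2, 1]
print(exact, narrow, wide)
```

With nprobe=1 the search scans only one cell and misses vector 1, which sits just across a cell boundary. Raising nprobe to 2 recovers the exact result at the cost of scanning more candidates, which is the recall versus speed trade-off summarized in the diagram.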

Templates and detailed worked examples

Full template library and detailed worked examples live in references/details.md. Read that file when you need the concrete templates.

Best Practices

Do's

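Use an appropriate index - HNSW is a good default for most workloads
Tune parameters - ef_search (HNSW) and nprobe (IVF) trade recall for speed
Implement hybrid search - Combine vector search with keyword search
Monitor recall - Measure search quality against exact (flat) results
Pre-filter when possible - Reduce the search space before the vector search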
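One common way to combine semantic and keyword results is reciprocal rank fusion (RRF), which merges ranked lists using only rank positions, so the two scoring scales never need to be reconciled. The sketch below is one possible approach, not the only one; the function name, the document ids, and the constant k=60 (a conventional default) are assumptions for illustration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked id lists: each list contributes 1 / (k + rank) per document."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))

semantic = ["d3", "d1", "d2"]  # hypothetical vector-search results, best first
keyword = ["d1", "d4", "d3"]   # hypothetical keyword-search results, best first

fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)  # ['d1', 'd3', 'd4', 'd2']
```

Documents that appear in both lists (d1 and d3) rise to the top, while documents found by only one retriever are kept but ranked lower.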

Don'ts

Don't skip evaluation - Measure recall and latency before optimizing
Don't over-index - Start with a flat index and scale up only when exact search becomes too slow
Don't ignore latency - P99 matters for UX
Don't forget costs - Vector storage adds up
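Evaluation can start with two numbers: recall@k against exact results, and tail latency. The helpers below are a minimal sketch with hypothetical names; `search_fn` stands in for whatever search call a system exposes, and the percentile uses the simple nearest-rank method.

```python
import math
import time

def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k that the approximate top-k also returned."""
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    return len(truth & set(approx_ids[:k])) / len(truth)

def percentile(samples, pct):
    """Nearest-rank percentile, e.g. pct=99 for P99."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def time_queries(search_fn, queries):
    """Run search_fn on each query and return per-query latency in milliseconds."""
    latencies = []
    for q in queries:
        start = time.perf_counter()
        search_fn(q)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies

# Approximate search returned ids [2, 3]; exact (flat) search returned [2, 1].
print(recall_at_k([2, 3], [2, 1], k=2))  # 0.5

# P99 over 100 synthetic latency samples of 1..100 ms.
print(percentile(list(range(1, 101)), 99))  # 99
```

Tracking P99 rather than the mean surfaces the slow queries that users actually notice, and computing recall against a flat index on a sample of queries shows what an approximate index gives up in exchange for speed.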