Skip to main content
AI/MLwshobson

embedding-strategies

Select and optimize embedding models for semantic search and RAG applications. Use when choosing embedding models, implementing chunking strategies, or optimizing embedding quality for specific domains.

Stars
36,167
Source
wshobson/agents
Updated
2026-05-29
Slug
wshobson--agents--embedding-strategies
View on GitHubRaw SKILL.md

// install — copy + paste into any project

mkdir -p .claude/skills && curl -fsSL https://raw.githubusercontent.com/wshobson/agents/HEAD/plugins/llm-application-dev/skills/embedding-strategies/SKILL.md -o .claude/skills/embedding-strategies.md

Drops the SKILL.md into .claude/skills/embedding-strategies.md. Works with Claude Code, Cursor, and any agent that loads SKILL.md files from .claude/skills/.

Embedding Strategies

Guide to selecting and optimizing embedding models for vector search applications.

When to Use This Skill

  • Choosing embedding models for RAG
  • Optimizing chunking strategies
  • Fine-tuning embeddings for domains
  • Comparing embedding model performance
  • Reducing embedding dimensions
  • Handling multilingual content

Core Concepts

1. Embedding Model Comparison (2026)

Model Dimensions Max Tokens Best For
voyage-3-large 1024 32000 Claude apps (Anthropic recommended)
voyage-3 1024 32000 Claude apps, cost-effective
voyage-code-3 1024 32000 Code search
voyage-finance-2 1024 32000 Financial documents
voyage-law-2 1024 32000 Legal documents
text-embedding-3-large 3072 8191 OpenAI apps, high accuracy
text-embedding-3-small 1536 8191 OpenAI apps, cost-effective
bge-large-en-v1.5 1024 512 Open source, local deployment
all-MiniLM-L6-v2 384 256 Fast, lightweight
multilingual-e5-large 1024 512 Multi-language

2. Embedding Pipeline

Document → Chunking → Preprocessing → Embedding Model → Vector
                ↓
        [Overlap, Size]  [Clean, Normalize]  [API/Local]

Templates and detailed worked examples

Full template library and detailed worked examples live in references/details.md. Read that file when you need the concrete templates.

Best Practices

Do's

  • Match model to use case: Code vs prose vs multilingual
  • Chunk thoughtfully: Preserve semantic boundaries
  • Normalize embeddings: For cosine similarity search
  • Batch requests: More efficient than one-by-one
  • Cache embeddings: Avoid recomputing for static content
  • Use Voyage AI for Claude apps: Recommended by Anthropic

Don'ts

  • Don't ignore token limits: Truncation loses information
  • Don't mix embedding models: Incompatible vector spaces
  • Don't skip preprocessing: Garbage in, garbage out
  • Don't over-chunk: Lose important context
  • Don't forget metadata: Essential for filtering and debugging