Prompt Engineering Patterns
Master advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability.
When to Use This Skill
- Designing complex prompts for production LLM applications
- Optimizing prompt performance and consistency
- Implementing structured reasoning patterns (chain-of-thought, tree-of-thought)
- Building few-shot learning systems with dynamic example selection
- Creating reusable prompt templates with variable interpolation
- Debugging and refining prompts that produce inconsistent outputs
- Implementing system prompts for specialized AI assistants
- Using structured outputs (JSON mode) for reliable parsing
Core Capabilities
1. Few-Shot Learning
- Example selection strategies (semantic similarity, diversity sampling)
- Balancing example count with context window constraints
- Constructing effective demonstrations with input-output pairs
- Dynamic example retrieval from knowledge bases
- Handling edge cases through strategic example selection
2. Chain-of-Thought Prompting
- Step-by-step reasoning elicitation
- Zero-shot CoT with "Let's think step by step"
- Few-shot CoT with reasoning traces
- Self-consistency techniques (sampling multiple reasoning paths)
- Verification and validation steps
3. Structured Outputs
- JSON mode for reliable parsing
- Pydantic schema enforcement
- Type-safe response handling
- Error handling for malformed outputs
4. Prompt Optimization
- Iterative refinement workflows
- A/B testing prompt variations
- Measuring prompt performance metrics (accuracy, consistency, latency)
- Reducing token usage while maintaining quality
- Handling edge cases and failure modes
5. Template Systems
- Variable interpolation and formatting
- Conditional prompt sections
- Multi-turn conversation templates
- Role-based prompt composition
- Modular prompt components
6. System Prompt Design
- Setting model behavior and constraints
- Defining output formats and structure
- Establishing role and expertise
- Safety guidelines and content policies
- Context setting and background information
Quick Start
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field
# Define structured output schema
class SQLQuery(BaseModel):
query: str = Field(description="The SQL query")
explanation: str = Field(description="Brief explanation of what the query does")
tables_used: list[str] = Field(description="List of tables referenced")
# Initialize model with structured output
llm = ChatAnthropic(model="claude-sonnet-4-6")
structured_llm = llm.with_structured_output(SQLQuery)
# Create prompt template
prompt = ChatPromptTemplate.from_messages([
("system", """You are an expert SQL developer. Generate efficient, secure SQL queries.
Always use parameterized queries to prevent SQL injection.
Explain your reasoning briefly."""),
("user", "Convert this to SQL: {query}")
])
# Create chain
chain = prompt | structured_llm
# Use
result = await chain.ainvoke({
"query": "Find all users who registered in the last 30 days"
})
print(result.query)
print(result.explanation)
Detailed patterns and worked examples
Detailed pattern documentation lives in references/details.md. Read that file when the navigation tier above is insufficient.
Best Practices
- Be Specific: Vague prompts produce inconsistent results
- Show, Don't Tell: Examples are more effective than descriptions
- Use Structured Outputs: Enforce schemas with Pydantic for reliability
- Test Extensively: Evaluate on diverse, representative inputs
- Iterate Rapidly: Small changes can have large impacts
- Monitor Performance: Track metrics in production
- Version Control: Treat prompts as code with proper versioning
- Document Intent: Explain why prompts are structured as they are
Common Pitfalls
- Over-engineering: Starting with complex prompts before trying simple ones
- Example pollution: Using examples that don't match the target task
- Context overflow: Exceeding token limits with excessive examples
- Ambiguous instructions: Leaving room for multiple interpretations
- Ignoring edge cases: Not testing on unusual or boundary inputs
- No error handling: Assuming outputs will always be well-formed
- Hardcoded values: Not parameterizing prompts for reuse
Success Metrics
Track these KPIs for your prompts:
- Accuracy: Correctness of outputs
- Consistency: Reproducibility across similar inputs
- Latency: Response time (P50, P95, P99)
- Token Usage: Average tokens per request
- Success Rate: Percentage of valid, parseable outputs
- User Satisfaction: Ratings and feedback