Skip to main content
AI/MLjeremylongshore

perplexity-migration-deep-dive

'Migrate to Perplexity Sonar from other search/LLM APIs using the strangler

Stars
2,267
Source
jeremylongshore/claude-code-plugins-plus-skills
Updated
2026-05-31
Slug
jeremylongshore--claude-code-plugins-plus-skills--perplexity-migration-deep-dive
View on GitHubRaw SKILL.md

// install — copy + paste into any project

mkdir -p .claude/skills && curl -fsSL https://raw.githubusercontent.com/jeremylongshore/claude-code-plugins-plus-skills/HEAD/plugins/saas-packs/perplexity-pack/skills/perplexity-migration-deep-dive/SKILL.md -o .claude/skills/perplexity-migration-deep-dive.md

Drops the SKILL.md into .claude/skills/perplexity-migration-deep-dive.md. Works with Claude Code, Cursor, and any agent that loads SKILL.md files from .claude/skills/.

Perplexity Migration Deep Dive

Current State

!npm list openai 2>/dev/null | grep openai || echo 'N/A' !grep -rn "google.*search\|bing.*api\|serpapi\|pplx-7b\|pplx-70b" --include="*.ts" --include="*.py" . 2>/dev/null | head -5 || echo 'No legacy search APIs found'

Overview

Migrate from traditional search APIs (Google Custom Search, Bing, SerpAPI) or legacy LLMs to Perplexity Sonar. Key advantage: Perplexity combines search + LLM summarization in a single API call, replacing a multi-step pipeline.

Migration Comparison

Feature Google CSE / Bing Perplexity Sonar
Returns Raw search results (links + snippets) Synthesized answer + citations
Answer generation Requires separate LLM call Built-in
Citation handling Manual extraction Automatic citations array
Cost structure Per-search ($5/1K queries) Per-token + per-request
Recency filter Date range parameters search_recency_filter
Domain filter Site restriction search_domain_filter

Instructions

Step 1: Assess Current Integration

set -euo pipefail
# Find existing search API usage
grep -rn "googleapis.*customsearch\|bing.*search\|serpapi\|serper\|tavily" \
  --include="*.ts" --include="*.py" --include="*.js" \
  . 2>/dev/null || echo "No search APIs found"

# Count integration points
grep -rln "search.*api\|customsearch\|bing.*web" \
  --include="*.ts" --include="*.py" --include="*.js" \
  . 2>/dev/null | wc -l

Step 2: Build Adapter Layer

// src/search/adapter.ts
export interface SearchResult {
  answer: string;
  citations: string[];
  rawResults?: Array<{ title: string; url: string; snippet: string }>;
}

export interface SearchAdapter {
  search(query: string, opts?: { recency?: string; domains?: string[] }): Promise<SearchResult>;
}

// Legacy adapter (existing Google/Bing implementation)
class GoogleSearchAdapter implements SearchAdapter {
  async search(query: string): Promise<SearchResult> {
    // Existing Google CSE code
    const results = await googleCustomSearch(query);
    return {
      answer: "", // No built-in answer generation
      citations: results.items.map((i: any) => i.link),
      rawResults: results.items.map((i: any) => ({
        title: i.title,
        url: i.link,
        snippet: i.snippet,
      })),
    };
  }
}

// New Perplexity adapter
class PerplexitySearchAdapter implements SearchAdapter {
  private client: OpenAI;

  constructor() {
    this.client = new OpenAI({
      apiKey: process.env.PERPLEXITY_API_KEY!,
      baseURL: "https://api.perplexity.ai",
    });
  }

  async search(query: string, opts?: { recency?: string; domains?: string[] }): Promise<SearchResult> {
    const response = await this.client.chat.completions.create({
      model: "sonar",
      messages: [{ role: "user", content: query }],
      ...(opts?.recency && { search_recency_filter: opts.recency }),
      ...(opts?.domains && { search_domain_filter: opts.domains }),
    } as any);

    return {
      answer: response.choices[0].message.content || "",
      citations: (response as any).citations || [],
      rawResults: (response as any).search_results || [],
    };
  }
}

Step 3: Feature Flag Traffic Split

// src/search/factory.ts
function getSearchAdapter(): SearchAdapter {
  const perplexityPercent = parseInt(process.env.PERPLEXITY_TRAFFIC_PERCENT || "0");

  if (Math.random() * 100 < perplexityPercent) {
    return new PerplexitySearchAdapter();
  }
  return new GoogleSearchAdapter();
}

// Migration schedule:
// Week 1: PERPLEXITY_TRAFFIC_PERCENT=10  (canary)
// Week 2: PERPLEXITY_TRAFFIC_PERCENT=50  (half traffic)
// Week 3: PERPLEXITY_TRAFFIC_PERCENT=100 (full migration)
// Week 4: Remove Google adapter code

Step 4: Validate Migration Quality

// Compare results between old and new adapter
async function compareSearchResults(query: string): Promise<{
  perplexity: SearchResult;
  google: SearchResult;
  citationOverlap: number;
}> {
  const [perplexity, google] = await Promise.all([
    new PerplexitySearchAdapter().search(query),
    new GoogleSearchAdapter().search(query),
  ]);

  // Check citation overlap (shared domains)
  const pplxDomains = new Set(perplexity.citations.map((u) => new URL(u).hostname));
  const googleDomains = new Set(google.citations.map((u) => new URL(u).hostname));
  const overlap = [...pplxDomains].filter((d) => googleDomains.has(d)).length;

  return {
    perplexity,
    google,
    citationOverlap: overlap / Math.max(pplxDomains.size, 1),
  };
}

Step 5: Simplify Post-Migration

// Before migration: 3-step pipeline
// 1. Google Custom Search API → raw results
// 2. Send results to LLM for summarization
// 3. Extract citations manually

// After migration: 1-step
async function search(query: string): Promise<{ answer: string; sources: string[] }> {
  const client = new OpenAI({
    apiKey: process.env.PERPLEXITY_API_KEY!,
    baseURL: "https://api.perplexity.ai",
  });

  const response = await client.chat.completions.create({
    model: "sonar",
    messages: [{ role: "user", content: query }],
  });

  return {
    answer: response.choices[0].message.content || "",
    sources: (response as any).citations || [],
  };
}

Rollback Plan

set -euo pipefail
# Instant rollback: set traffic to 0%
# kubectl set env deployment/search-app PERPLEXITY_TRAFFIC_PERCENT=0
# The adapter layer keeps both implementations live until decommissioned

Error Handling

Issue Cause Solution
Citation format differs Google returns titles, Perplexity returns URLs Normalize in adapter
No raw results Perplexity returns synthesized answer Use search_results field if available
Higher latency Perplexity does search + synthesis Expected; cache to compensate
Cost increase Perplexity uses more tokens Route simple queries to sonar, limit max_tokens

Output

  • Adapter layer abstracting search implementations
  • Feature-flagged traffic split for gradual migration
  • Quality comparison between old and new search
  • Simplified single-API architecture post-migration

Resources

Next Steps

For advanced troubleshooting, see perplexity-advanced-troubleshooting.