
langchain-multi-env-setup

"Build reliable dev / staging / prod isolation for LangChain 1.0 services\

Stars
2,267
Source
jeremylongshore/claude-code-plugins-plus-skills
Updated
2026-05-31
Slug
jeremylongshore--claude-code-plugins-plus-skills--langchain-multi-env-setup

// install — copy + paste into any project

mkdir -p .claude/skills && curl -fsSL https://raw.githubusercontent.com/jeremylongshore/claude-code-plugins-plus-skills/HEAD/plugins/saas-packs/langchain-py-pack/skills/langchain-multi-env-setup/SKILL.md -o .claude/skills/langchain-multi-env-setup.md

Drops the SKILL.md into .claude/skills/langchain-multi-env-setup.md. Works with Claude Code, Cursor, and any agent that loads SKILL.md files from .claude/skills/.

LangChain Multi-Env Setup (Python)

Overview

A team ships a LangChain 1.0 service to staging with python-dotenv loading .env.staging into os.environ. A security audit runs docker exec STAGING-POD env and sees ANTHROPIC_API_KEY=sk-ant-api03-... in plain text. Anyone with kubectl exec, any sidecar, any core dump, or any error tracker that auto-captures the process environment can read the key. This is pain P37: secrets loaded from .env in production containers leak via env.

A second failure chains onto the first. A developer runs the staging deploy from a shell where LANGCHAIN_ENV=production was set hours earlier. The loader picks the prod .env, staging answers with a prompt commit tuned only for the prod model tier, and latency doubles. Two root causes: no type-safe env gate, and no startup validation that would have caught the mismatched model id.

Both are one refactor:

# BAD: dotenv populates os.environ; any process with container access sees it
import os
from dotenv import load_dotenv
load_dotenv(".env.production")
api_key = os.environ["ANTHROPIC_API_KEY"]  # P37: leaks via `docker exec env`

# GOOD: SecretStr in a validated Settings object, pulled from Secret Manager
from typing import Literal
from pydantic import SecretStr
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    env: Literal["dev", "staging", "prod"]
    anthropic_api_key: SecretStr

settings = build_settings()  # defined in Step 2; pulls from GCP Secret Manager in prod
api_key = settings.anthropic_api_key.get_secret_value()
# repr(settings) shows anthropic_api_key=SecretStr('**********'), so it is safe to log

This skill owns the per-env config plumbing: Settings skeleton, Secret Manager integration, per-env pinning, and startup smoke test. It does not own the full secrets lifecycle (rotation, revocation, scope); that belongs to langchain-security-basics.

Pin: langchain-core 1.0.x, langchain-anthropic 1.0.x, pydantic >= 2.5, pydantic-settings >= 2.1. Pain anchors: P37 (primary), P20 (checkpointer schema — cross-ref langchain-langgraph-checkpointing).

Two numbers: smoke test < 10 seconds; env-var count ~15-30 (more than 30 means Settings is absorbing feature flags and should split).

Prerequisites

  • Python 3.10+ (3.11+ recommended for Literal and StrEnum ergonomics)
  • langchain-core >= 1.0, < 2.0
  • pydantic >= 2.5, pydantic-settings >= 2.1
  • One secret backend: GCP Secret Manager (google-cloud-secret-manager), AWS Secrets Manager (boto3), or HashiCorp Vault (hvac)
  • Completed langchain-sdk-patterns — the Settings object is injected into the chain factories from that skill

Instructions

Run these six steps in order — each adds one invariant the next step depends on:

  1. Define a Settings class with SecretStr keys, Literal env, and fail-fast validation.
  2. Add a per-env loader — file in dev, env vars in staging, Secret Manager in prod.
  3. Use the cloud Secret Manager client to pull keys into memory only.
  4. Pin model_id, prompt_commit_hash, and vector_index_name per env.
  5. Configure the checkpointer per env — memory in dev, Postgres elsewhere.
  6. Run a startup smoke test under 10 seconds before the HTTP server binds.

Step 1 — Create a Settings class with SecretStr and fail-fast validation

from typing import Literal
from pydantic import SecretStr, HttpUrl, PostgresDsn, Field, ValidationError
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=None,              # see Step 2: the loader picks the file
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="forbid",             # reject unknown keys in the env file and init kwargs
    )

    # --- env switch (drives everything else) ---
    env: Literal["dev", "staging", "prod"] = Field(..., alias="LANGCHAIN_ENV")

    # --- secrets (always SecretStr, never str) ---
    anthropic_api_key: SecretStr = Field(..., alias="ANTHROPIC_API_KEY")
    openai_api_key: SecretStr = Field(..., alias="OPENAI_API_KEY")
    langsmith_api_key: SecretStr = Field(..., alias="LANGSMITH_API_KEY")

    # --- per-env pinning (see Step 4) ---
    model_id: str = Field(..., alias="LANGCHAIN_MODEL_ID")
    prompt_commit_hash: str = Field(..., alias="LANGCHAIN_PROMPT_COMMIT")
    vector_index_name: str = Field(..., alias="LANGCHAIN_VECTOR_INDEX")

    # --- endpoints (validated URLs, so a typo is caught at startup) ---
    # The checkpointer is Postgres (Step 5), so it needs a postgres:// DSN;
    # HttpUrl would reject that scheme.
    checkpointer_url: PostgresDsn | None = Field(None, alias="LANGCHAIN_CHECKPOINTER_URL")
    otel_endpoint: HttpUrl = Field(..., alias="OTEL_EXPORTER_OTLP_ENDPOINT")

    # --- budget guards (per-env) ---
    max_cost_usd_per_day: float = Field(10.0, alias="LANGCHAIN_DAILY_BUDGET_USD")
    max_rpm: int = Field(60, alias="LANGCHAIN_MAX_RPM")

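SecretStr masks the key fields in repr(settings) as SecretStr('**********'), so a routine logger.info(settings) cannot leak the key. The only way to read plaintext is .get_secret_value(), which is easy to grep for in review. extra="forbid" rejects unknown keys in the dev env file and in the kwargs passed to Settings(...). pydantic-settings ignores unrelated variables in os.environ, so a typo there (LANGCHIN_MODEL_ID) surfaces instead as a missing required field. Either way the failure happens when Settings is constructed at startup, not on the first request. HttpUrl rejects a malformed value such as htps://otel:4318 before the exporter wastes 60s on DNS.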

See Settings Skeleton for the full class.
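The masking behaviour described above can be checked in isolation. The sketch below uses a plain pydantic BaseModel rather than the skill's Settings class; DemoSettings and the fake key are illustrative names, not part of the skill.

```python
from pydantic import BaseModel, SecretStr


class DemoSettings(BaseModel):
    # Hypothetical stand-in for the real Settings class
    api_key: SecretStr


demo = DemoSettings(api_key="sk-test-not-a-real-key")

masked_repr = repr(demo)                     # what a logger would emit
masked_str = str(demo.api_key)               # the field masks itself under str() too
plaintext = demo.api_key.get_secret_value()  # the only path to the raw value
```

A unit test that asserts the raw key never appears in repr(settings) is a cheap guard against a field being changed back to str.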

Step 2 — Per-env config loading (file OR Secret Manager, never both)

import os
from pathlib import Path

def build_settings() -> Settings:
    env = os.environ.get("LANGCHAIN_ENV", "dev")

    if env == "dev":
        # Local dev: .env.dev file, values checked into 1Password not git
        return Settings(_env_file=Path(".env.dev"))

    if env == "staging":
        # CI / staging: env vars injected by the orchestrator
        # (GitHub Actions secrets, k8s envFrom: secretRef, etc.)
        return Settings()  # reads os.environ directly

    if env == "prod":
        # Prod: pull from Secret Manager into memory ONLY
        values = pull_from_secret_manager()
        return Settings(**values)

    raise ValueError(f"unknown LANGCHAIN_ENV: {env!r}")

Three loaders, one class. Dev touches a file on disk. Staging inherits env vars from the orchestrator — envFrom: secretRef is readable via docker exec env, but the blast radius is bounded and rotation is weekly.

Prod is the P37 fix: pull_from_secret_manager() builds a dict, and build_settings() passes it as kwargs to Settings(...). The secret values land in instance attributes and never touch os.environ, so a subprocess will not inherit them.
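The inheritance claim can be demonstrated with the standard library alone. This is a minimal sketch: the variable names are hypothetical, and child_sees is a helper invented for the demonstration.

```python
import os
import subprocess
import sys


def child_sees(var_name: str) -> bool:
    """Return True if a freshly spawned Python child has var_name in its env."""
    probe = "import os, sys; sys.exit(0 if sys.argv[1] in os.environ else 1)"
    result = subprocess.run([sys.executable, "-c", probe, var_name])
    return result.returncode == 0


# Hypothetical names, used only for this demonstration
os.environ["DEMO_KEY_IN_ENVIRON"] = "visible-to-children"
in_memory_values = {"DEMO_KEY_IN_MEMORY": "stays-in-this-process"}

inherited = child_sees("DEMO_KEY_IN_ENVIRON")     # written to os.environ
not_inherited = child_sees("DEMO_KEY_IN_MEMORY")  # only ever lived in a dict
```

Anything written to os.environ is copied into every child process by default; a value held only in a Python object is not.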

Step 3 — Secret Manager pull (GCP example; AWS / Vault in reference)

import os

from google.cloud import secretmanager

def pull_from_secret_manager() -> dict[str, str]:
    client = secretmanager.SecretManagerServiceClient()
    project = os.environ["GCP_PROJECT_ID"]
    secret_names = ["ANTHROPIC_API_KEY", "OPENAI_API_KEY", "LANGSMITH_API_KEY"]
    out: dict[str, str] = {}
    for name in secret_names:
        resource = f"projects/{project}/secrets/{name}/versions/latest"
        response = client.access_secret_version(request={"name": resource})
        out[name] = response.payload.data.decode("utf-8")
    # Non-secret passthrough (model id, prompt hash, endpoints)
    for key in ["LANGCHAIN_ENV", "LANGCHAIN_MODEL_ID", "LANGCHAIN_PROMPT_COMMIT",
                "LANGCHAIN_VECTOR_INDEX", "LANGCHAIN_CHECKPOINTER_URL",
                "OTEL_EXPORTER_OTLP_ENDPOINT"]:
        if key in os.environ:
            out[key] = os.environ[key]
    return out

No os.environ[k] = v line. The dict goes straight into Settings(**values). Workload-identity IAM handles auth; no static key on disk. For AWS / Vault see Secret Manager Integration.
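One way to keep the pull logic testable without cloud credentials is to inject the fetch call. The sketch below is an assumption about structure, not the skill's implementation: pull_secrets, FAKE_STORE, and the strip() of trailing whitespace are all illustrative choices.

```python
from typing import Callable, Dict, Iterable


def pull_secrets(fetch: Callable[[str], bytes], names: Iterable[str]) -> Dict[str, str]:
    """Fetch each named secret and return decoded values, without touching os.environ."""
    out: Dict[str, str] = {}
    for name in names:
        # Assumption: strip() is wanted because payloads created from a
        # shell pipe often carry a trailing newline.
        out[name] = fetch(name).decode("utf-8").strip()
    return out


# Fake backend for tests; a real one would wrap the cloud client call
FAKE_STORE = {"ANTHROPIC_API_KEY": b"sk-test-a\n", "OPENAI_API_KEY": b"sk-test-b"}

values = pull_secrets(FAKE_STORE.__getitem__, ["ANTHROPIC_API_KEY", "OPENAI_API_KEY"])
```

In production the fetch argument would be a small closure around the GCP, AWS, or Vault client, which keeps the backend choice out of the Settings code.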

Step 4 — Per-env model and prompt pinning

Dev, staging, and prod can each run a different model id and a different prompt commit hash. Pinning happens at env-var level so app code is env-agnostic (see the Env Matrix below for values). One function reads settings.prompt_commit_hash and pulls from LangSmith (cross-ref langchain-prompt-engineering):

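from langchain_core.prompts import ChatPromptTemplate
from langsmith import Client

def get_prompt(settings: Settings) -> ChatPromptTemplate:
    # Build the client from the validated Settings rather than a module-level global
    ls = Client(api_key=settings.langsmith_api_key.get_secret_value())
    # "name:commit" pins the pull to one specific prompt commit
    return ls.pull_prompt(f"triage-prompt:{settings.prompt_commit_hash}")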

Prevents: staging loading a prod prompt commit. Pinning per env makes promotion explicit — dev → staging → prod moves one hash at a time. See Per-Env Pinning.
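The one-hash-at-a-time promotion rule could be enforced in CI with a check along these lines. This is a sketch under assumptions: check_promotion, the pins mapping, and the hashes are hypothetical, and where the current pins are stored is left open.

```python
ORDER = ["dev", "staging", "prod"]


def check_promotion(pins: dict, target_env: str, new_hash: str) -> None:
    """Raise if new_hash is not already pinned in the env just before target_env."""
    idx = ORDER.index(target_env)
    if idx == 0:
        return  # dev accepts any commit
    upstream = ORDER[idx - 1]
    if pins.get(upstream) != new_hash:
        raise ValueError(
            f"{new_hash!r} must be pinned in {upstream} before promotion to {target_env}"
        )


# Made-up hashes: dev is ahead of staging and prod
pins = {"dev": "a1b2c3", "staging": "9f8e7d", "prod": "9f8e7d"}
```

With these pins, promoting a1b2c3 to staging passes, while promoting it straight to prod raises because staging has not run it yet.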

Step 5 — Per-env checkpointer selection

Checkpointer choice is per-env too:

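from langgraph.checkpoint.memory import MemorySaver
from langgraph.checkpoint.postgres import PostgresSaver
from psycopg import Connection
from psycopg.rows import dict_row

def build_checkpointer(settings: Settings):
    if settings.env == "dev":
        return MemorySaver()          # ephemeral, resets on restart
    if settings.checkpointer_url is None:
        raise ValueError("LANGCHAIN_CHECKPOINTER_URL is required outside dev")
    # staging + prod: Postgres with env-isolated schema
    # cross-ref langchain-langgraph-checkpointing (P20) for schema migration
    # PostgresSaver.from_conn_string() is a context manager, so open a
    # long-lived connection explicitly and hand it to the saver instead.
    conn = Connection.connect(
        str(settings.checkpointer_url),
        autocommit=True,          # lets setup() commit its tables
        prepare_threshold=0,
        row_factory=dict_row,     # the saver reads rows by column name
    )
    return PostgresSaver(conn)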

Dev uses MemorySaver — no infra dependency, no state between runs. Staging and prod use PostgresSaver against separate databases (or separate schemas). Never share a checkpointer DB between envs; P20 explains — schema migrations on a version bump corrupt cross-env threads.
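A validator can turn a missing checkpointer URL outside dev into a startup error instead of a silent fallback. The sketch below uses a reduced, hypothetical CheckpointerConfig model with a plain string URL; the same validator shape would attach to the full Settings class.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ValidationError, model_validator


class CheckpointerConfig(BaseModel):
    # Hypothetical reduced model; field names mirror the Settings class
    env: Literal["dev", "staging", "prod"]
    checkpointer_url: Optional[str] = None

    @model_validator(mode="after")
    def require_url_outside_dev(self) -> "CheckpointerConfig":
        if self.env != "dev" and self.checkpointer_url is None:
            raise ValueError(f"checkpointer_url is required when env={self.env!r}")
        return self


dev_ok = CheckpointerConfig(env="dev")  # no URL needed in dev

try:
    CheckpointerConfig(env="prod")      # missing URL outside dev
    prod_rejected = False
except ValidationError:
    prod_rejected = True
```

Pydantic wraps the ValueError raised inside the validator in a ValidationError, so the failure shows up alongside any other config errors at construction time.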

Step 6 — Startup smoke test (< 10 seconds budget)

import time
from anthropic import Anthropic

def validate_integrations(settings: Settings) -> None:
    t0 = time.monotonic()

    # 1. Model reachable (1-token ping ~ $0.00001)
    client = Anthropic(
        api_key=settings.anthropic_api_key.get_secret_value(),
        timeout=5.0,     # bound the probe so a hung call cannot stall startup
        max_retries=0,   # a failed ping should fail the deploy, not retry
    )
    client.messages.create(
        model=settings.model_id,
        max_tokens=1,
        messages=[{"role": "user", "content": "hi"}],
    )

    # 2. Checkpointer reachable
    if settings.env != "dev":
        checkpointer = build_checkpointer(settings)
        checkpointer.setup()  # connects and creates/migrates the checkpoint tables

    # 3. Vector store reachable (see langchain-embeddings-search)
    # ... describe_index call here ...

    # 4. Observability endpoint reachable (OTLP HTTP health)
    # ... requests.get(f"{settings.otel_endpoint}/health", timeout=2) ...

    elapsed = time.monotonic() - t0
    if elapsed > 10.0:
        raise RuntimeError(
            f"startup smoke test took {elapsed:.1f}s (budget 10s)"
        )

Call validate_integrations(settings) before the HTTP server binds. Failure aborts the deploy — the readiness probe never goes green, the rollout halts, the bad version takes no traffic. Budget: 10 seconds. Past 10s an integration is degraded — fail loudly rather than ship a 30s cold start. See Startup Smoke Test.
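If the probes are independent, running them concurrently keeps the total close to the slowest single probe. The following is a minimal sketch using only the standard library; run_probes and the probe callables are hypothetical and stand in for the model, checkpointer, vector store, and OTLP checks.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait
from typing import Callable, Dict


def run_probes(probes: Dict[str, Callable[[], None]], budget_s: float = 10.0) -> float:
    """Run all probes in parallel; raise if any fails or the budget is exceeded."""
    t0 = time.monotonic()
    pool = ThreadPoolExecutor(max_workers=max(1, len(probes)))
    futures = {pool.submit(fn): name for name, fn in probes.items()}
    done, not_done = wait(futures, timeout=budget_s)
    # Do not block on stragglers; already-running threads finish on their own
    pool.shutdown(wait=False, cancel_futures=True)
    if not_done:
        slow = sorted(futures[f] for f in not_done)
        raise RuntimeError(f"probes exceeded {budget_s}s budget: {slow}")
    for f in done:
        f.result()  # re-raises the exception from a failed probe
    return time.monotonic() - t0
```

A thread that is already running cannot be cancelled, so each probe should still carry its own client-side timeout; the budget here only decides when startup gives up waiting.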

Output

  • Settings class on pydantic-settings with SecretStr for keys, Literal env, HttpUrl endpoints, extra="forbid"
  • Env-specific loader (file → dev; env vars → staging; Secret Manager → prod); values land in Settings only, never os.environ
  • Cloud Secret Manager integration (GCP / AWS / Vault) with IAM-bound auth; no static keys on disk
  • Per-env pinning for model_id, prompt_commit_hash, vector_index_name, checkpointer_url
  • Per-env checkpointer (MemorySaver dev, PostgresSaver on isolated DBs staging/prod)
  • Startup smoke test — model / vector / checkpointer / observability under 10-second budget

Env Matrix

| Dimension | dev | staging | prod |
|---|---|---|---|
| Secret backend | .env.dev file (git-ignored) | orchestrator env vars | cloud Secret Manager, memory only |
| os.environ holds keys | yes (local) | yes (sidecar visible) | no (P37 fix) |
| model_id | claude-haiku-4-6 | claude-sonnet-4-6 | claude-sonnet-4-6 |
| prompt_commit_hash | WIP | canary | stable (1 week old) |
| temperature | 0.7 | 0.2 | 0.2 |
| Checkpointer | MemorySaver | PostgresSaver (staging DB) | PostgresSaver (prod DB) |
| Vector index | dev-index | staging-index | prod-index |
| OTEL sample rate | 1.0 | 1.0 | 0.1 |
| RPM limit | 10 | 60 | provider tier |
| Daily budget | $1 | $10 | $500-$5000 |
| Smoke probes | model | model + checkpointer + OTEL | all four |

Error Handling

| Error | Cause | Fix |
|---|---|---|
| docker exec POD env shows ANTHROPIC_API_KEY=... in prod (P37) | dotenv / plain env injection in prod | Pull from Secret Manager into Settings(**values); never write to os.environ |
| Staging answers with prod prompts / wrong model | Loader defaulted or picked stale LANGCHAIN_ENV | Literal["dev","staging","prod"] on env; raise on unknown; no default |
| ValidationError: extra fields forbidden at startup | Typo (LANGCHIN_MODEL_ID) | Fix the typo; extra="forbid" is working as intended |
| Startup takes 30s before first request | Serialized probes or degraded integration | Enforce 10s budget; parallelize probes; fail the deploy |
| repr(settings) in a log leaks the API key | Plain str used, not SecretStr | Change field to SecretStr; repr masks to '**********' |
| Prod silently using MemorySaver | build_checkpointer defaulted when checkpointer_url was None | Require checkpointer_url in staging/prod via a model validator |
| Secret Manager auth fails in CI | SA not bound; google.auth fell back to ADC | Bind SA with roles/secretmanager.secretAccessor |
| Prompt hash rolled forward in staging without dev validation | Promotion skipped the dev gate | Enforce dev → staging → prod order in CI (see per-env pinning ref) |

Examples

Graduating a .env-in-dev service to prod

Start: a single .env committed (or leaked via docker exec env). End: Settings class, three loaders, Secret Manager in prod, smoke test under 10s. Three PRs — (1) introduce Settings without changing loader behavior, (2) add SecretStr and migrate call sites to .get_secret_value(), (3) swap prod to Secret Manager and remove the prod .env from the image. See Settings Skeleton and Secret Manager Integration.

Wrong-env prompt loaded in staging — postmortem

Staging inherited LANGCHAIN_ENV=production from a stale shell. The Literal["dev","staging","prod"] field rejects production; CI promotion sets LANGCHAIN_ENV explicitly; direnv pins it per-project. See Per-Env Pinning.
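The gate that rejects production can also run before any loader is chosen. This is a minimal standard-library sketch; resolve_env is a hypothetical helper, and treating an unset value as an error is an assumption that follows the "no default" fix in the Error Handling table.

```python
from typing import Literal, Optional, get_args

Env = Literal["dev", "staging", "prod"]
ALLOWED_ENVS = get_args(Env)  # ("dev", "staging", "prod")


def resolve_env(raw: Optional[str]) -> str:
    """Return raw if it is an allowed env name; raise otherwise (no default)."""
    if raw is None:
        raise ValueError("LANGCHAIN_ENV is not set")
    if raw not in ALLOWED_ENVS:
        raise ValueError(
            f"unknown LANGCHAIN_ENV: {raw!r}; expected one of {ALLOWED_ENVS}"
        )
    return raw
```

Calling resolve_env(os.environ.get("LANGCHAIN_ENV")) at the top of the loader would make a stale production value fail immediately instead of selecting the wrong config.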

Smoke test blocked a bad model id

A prod deploy went out with LANGCHAIN_MODEL_ID=claude-sonnet-4-7 (not yet rolled out). The 1-token ping failed with model not found, validate_integrations raised, the container crash-looped, the rollout halted, and the previous version kept taking traffic. Zero user impact; the failure surfaced in under 3s, well inside the 10s budget. See Startup Smoke Test.

Resources