AI/MLjeremylongshore

coreweave-prod-checklist

'Production readiness checklist for CoreWeave GPU workloads.

Stars: 2,267
Source: jeremylongshore/claude-code-plugins-plus-skills
Updated: 2026-05-31
Slug: jeremylongshore--claude-code-plugins-plus-skills--coreweave-prod-checklist

// install — copy + paste into any project

mkdir -p .claude/skills && curl -fsSL https://raw.githubusercontent.com/jeremylongshore/claude-code-plugins-plus-skills/HEAD/plugins/saas-packs/coreweave-pack/skills/coreweave-prod-checklist/SKILL.md -o .claude/skills/coreweave-prod-checklist.md

Drops the SKILL.md into .claude/skills/coreweave-prod-checklist.md. Works with Claude Code, Cursor, and any agent that loads SKILL.md files from .claude/skills/.

CoreWeave Production Checklist

Inference Services

GPU type and count validated for model size
Autoscaling configured (KServe or HPA)
Health and readiness probes set
Resource requests AND limits specified
Node affinity targeting correct GPU class
minReplicas >= 1 for production (no cold starts)

Storage

Model weights in PVC (not downloaded at startup)
Checkpoints saved to persistent storage
Storage class appropriate (SSD for inference, HDD for archival)

Security

Secrets for model tokens and registry access
Network policies applied
Container images from trusted registries

Monitoring

GPU utilization metrics collected
Inference latency and throughput tracked
Alert on pod restarts and OOM events
Log aggregation configured

Rollback

kubectl rollout undo deployment/my-inference
kubectl rollout status deployment/my-inference

Resources

CoreWeave CKS

Next Steps

For upgrades, see coreweave-upgrade-migration.