CoreWeave Production Checklist
Inference Services
- GPU type and count validated for model size
- Autoscaling configured (KServe or HPA)
- Health and readiness probes set
- Resource requests AND limits specified
- Node affinity targeting correct GPU class
- minReplicas >= 1 for production (no cold starts)
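As a rough illustration of how several of the items above fit together, here is a minimal sketch of a Deployment plus an HPA. Everything beyond the name my-inference is an assumption for illustration: the image, port, probe paths, resource sizes, the node label key gpu.nvidia.com/class, and the GPU class value are placeholders that should be checked against your own cluster and model server. It uses the HPA route rather than KServe.

```yaml
# Sketch only: image, port, probe paths, sizes, and GPU class are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-inference
spec:
  # replicas is omitted on purpose so the HPA below owns the replica count.
  selector:
    matchLabels:
      app: my-inference
  template:
    metadata:
      labels:
        app: my-inference
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: gpu.nvidia.com/class    # assumed label key; verify with `kubectl get nodes --show-labels`
                    operator: In
                    values: ["A100_PCIE_80GB"]   # placeholder GPU class
      containers:
        - name: server
          image: registry.example.com/my-inference:1.0.0   # hypothetical image
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "4"
              memory: 32Gi
              nvidia.com/gpu: 1
            limits:
              cpu: "4"
              memory: 32Gi
              nvidia.com/gpu: 1    # GPU requests and limits must be equal
          readinessProbe:
            httpGet:
              path: /ready         # hypothetical endpoint
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /healthz       # hypothetical endpoint
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 15
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-inference
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-inference
  minReplicas: 1     # keeps one warm replica to avoid cold starts
  maxReplicas: 4     # placeholder ceiling
  metrics:
    - type: Resource
      resource:
        name: cpu    # placeholder signal; GPU or request-based metrics are often a better fit
        target:
          type: Utilization
          averageUtilization: 70
```

The initial delays are deliberately generous because loading model weights can take a while; tune them to the observed startup time of your server.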
Storage
- Model weights in PVC (not downloaded at startup)
- Checkpoints saved to persistent storage
- Storage class appropriate (SSD for inference, HDD for archival)
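The first item above could look roughly like the following: a PersistentVolumeClaim holding the weights, mounted read-only into the inference pod. The claim name, size, mount path, and storage class name are all assumptions for illustration; use the storage class names your cluster actually offers, and note that ReadWriteMany only works with classes backed by a shared filesystem.

```yaml
# Sketch only: claim name, size, and storage class are placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-weights
spec:
  accessModes: ["ReadWriteMany"]         # assumes a shared-filesystem storage class
  storageClassName: example-ssd-class    # hypothetical; list real ones with `kubectl get storageclass`
  resources:
    requests:
      storage: 200Gi
---
# Fragment of the Deployment's pod template, not a complete manifest.
spec:
  template:
    spec:
      containers:
        - name: server
          volumeMounts:
            - name: weights
              mountPath: /models         # hypothetical path read by the model server
              readOnly: true
      volumes:
        - name: weights
          persistentVolumeClaim:
            claimName: model-weights
            readOnly: true
```

Populating the claim once (for example from a one-off Job) and mounting it read-only means new replicas start without re-downloading weights.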
Security
- Secrets for model tokens and registry access
- Network policies applied
- Container images from trusted registries
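A minimal sketch of the first two items, under stated assumptions: the Secret names (model-token, registry-credentials), the environment variable name, the api-gateway label, and port 8080 are all hypothetical. The Secrets themselves are assumed to exist already, and the NetworkPolicy only takes effect if the cluster's network plugin enforces policies.

```yaml
# Fragment of the Deployment's pod template, not a complete manifest.
spec:
  template:
    spec:
      imagePullSecrets:
        - name: registry-credentials     # hypothetical docker-registry Secret
      containers:
        - name: server
          env:
            - name: MODEL_TOKEN          # hypothetical variable read by the server
              valueFrom:
                secretKeyRef:
                  name: model-token      # hypothetical Secret
                  key: token
---
# Allow ingress to the inference pods only from pods labeled role=api-gateway.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: my-inference-ingress
spec:
  podSelector:
    matchLabels:
      app: my-inference
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: api-gateway          # hypothetical client label
      ports:
        - protocol: TCP
          port: 8080
```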
Monitoring
- GPU utilization metrics collected
- Inference latency and throughput tracked
- Alert on pod restarts and OOM events
- Log aggregation configured
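For the restart and OOM alerts, one possible shape is a PrometheusRule like the sketch below. It assumes the Prometheus Operator and kube-state-metrics are installed in the cluster, which this checklist does not state; the rule names, pod name pattern, windows, and severities are placeholders.

```yaml
# Sketch only: assumes Prometheus Operator CRDs and kube-state-metrics are present.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: my-inference-alerts
spec:
  groups:
    - name: my-inference
      rules:
        - alert: InferencePodRestarting
          # Fires when any container in a matching pod restarted in the last 15 minutes.
          expr: increase(kube_pod_container_status_restarts_total{pod=~"my-inference-.*"}[15m]) > 0
          labels:
            severity: warning
          annotations:
            summary: "Inference pod {{ $labels.pod }} restarted recently"
        - alert: InferencePodOOMKilled
          # Fires when the most recent container termination was an OOM kill.
          expr: kube_pod_container_status_last_terminated_reason{reason="OOMKilled", pod=~"my-inference-.*"} == 1
          labels:
            severity: critical
          annotations:
            summary: "Inference pod {{ $labels.pod }} was OOM killed"
```

The OOM rule stays active until the container terminates for a different reason, so in practice it is often combined with the restart counter to limit it to recent events.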
Rollback
kubectl rollout undo deployment/my-inference
kubectl rollout status deployment/my-inference
Next Steps
For upgrades, see coreweave-upgrade-migration.