Production readiness checklist
This is the checklist I use to assess whether a service is ready for production. It is sorted by importance, from the most critical items down to the “nice to have” ones.
- CI/CD (routine tasks automated)
- Basic logs with verbosity levels per subsystem
- Application metrics
- Dashboard
- System metrics
- Health checks
- Alerting and runbooks
- Documented dependencies
- Backups
- SLO
- Horizontal autoscaling
- Graceful degradation in case of a dependency failure
- Resource constraints
- Rate limiting
- Feature flag support for non-critical functions
- Performance tests
- On-call rotation
- Multitenancy support (testing in production with different request contexts): https://eng.uber.com/multitenancy-microservice-architecture/
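
Two of the items above, graceful degradation on dependency failure and feature flags for non-critical functions, are easier to picture with a small example. The sketch below is only an illustration under assumptions: `call_with_fallback`, `fetch_recommendations`, and `render_product_page` are hypothetical names, and the failing dependency is simulated. A real service would also log the failure and emit a metric.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_fallback(primary: Callable[[], T], fallback: T, enabled: bool = True) -> T:
    """Return primary() when the feature is on and the call succeeds, else fallback."""
    if not enabled:
        # Feature flag is off: skip the non-critical dependency entirely.
        return fallback
    try:
        return primary()
    except Exception:
        # Dependency failed: degrade instead of failing the whole request.
        # (Hypothetical service; in practice, log and count this error.)
        return fallback


def fetch_recommendations() -> list[str]:
    """Simulated non-critical dependency that is currently down."""
    raise ConnectionError("recommendation service unavailable")


def render_product_page(recommendations_enabled: bool = True) -> dict:
    """Build the page; recommendations are optional and may come back empty."""
    recs = call_with_fallback(
        fetch_recommendations,
        fallback=[],
        enabled=recommendations_enabled,
    )
    return {"title": "Example product", "recommendations": recs}
```

With this shape, the page still renders when the recommendation service is down, and the same code path lets an operator switch the feature off by passing `recommendations_enabled=False`.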