chaos-engineering-resilience
Chaos Engineering & Resilience Testing
<default_to_action> When testing system resilience or injecting failures:
- DEFINE steady state (normal metrics: error rate, latency, throughput)
- HYPOTHESIZE that the system stays in steady state during the failure
- INJECT real-world failures (network, instance, disk, CPU)
- OBSERVE and measure deviation from steady state
- FIX weaknesses discovered, document runbooks, repeat
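
The DEFINE and OBSERVE steps above can be sketched as a comparison between a baseline snapshot and the metrics sampled while the fault is active. This is a minimal sketch under stated assumptions, not a prescribed implementation: the `Metrics` fields, the `deviations` helper, and the 10% tolerance are hypothetical and chosen only for illustration. Real values would come from your monitoring system.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    """Hypothetical steady-state snapshot (field names are assumptions)."""
    error_rate: float      # fraction of failed requests, 0.0 to 1.0
    p99_latency_ms: float  # 99th percentile latency in milliseconds
    throughput_rps: float  # requests served per second


def deviations(baseline: Metrics, observed: Metrics,
               tolerance: float = 0.10) -> list[str]:
    """Return the names of metrics that left the steady-state band.

    The band is the baseline value plus or minus `tolerance` (relative).
    The 10% default is an illustrative assumption, not a recommendation.
    """
    broken = []
    # Error rate and latency are "lower is better": flag increases only.
    if observed.error_rate > baseline.error_rate * (1 + tolerance):
        broken.append("error_rate")
    if observed.p99_latency_ms > baseline.p99_latency_ms * (1 + tolerance):
        broken.append("p99_latency_ms")
    # Throughput is "higher is better": flag drops only.
    if observed.throughput_rps < baseline.throughput_rps * (1 - tolerance):
        broken.append("throughput_rps")
    return broken
```

An empty result supports the hypothesis that the system stayed in steady state. A non-empty result names the metrics to investigate in the FIX step.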
Quick Chaos Steps:
- Start small: Dev → Staging → 1% prod → gradual rollout
- Define clear rollback triggers (e.g., error_rate > 5%)
- Measure blast radius; never exceed the planned scope
- Document findings → runbooks → improved resilience
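
The rollout order, rollback trigger, and blast-radius limit from the steps above could be encoded roughly as follows. Only the 5% error-rate threshold comes from the text. The stage names, the `should_roll_back` and `next_stage` helpers, and the idea of counting affected hosts are hypothetical, shown as one possible sketch.

```python
from __future__ import annotations

from typing import Optional

# Rollout order from the text: Dev -> Staging -> 1% prod -> gradual rollout.
# The identifiers themselves are assumptions.
STAGES = ("dev", "staging", "prod-1pct", "prod-gradual")

# Rollback trigger from the text: error_rate > 5%.
MAX_ERROR_RATE = 0.05


def should_roll_back(error_rate: float, affected_hosts: int,
                     planned_hosts: int) -> bool:
    """Abort when the error trigger fires or the blast radius exceeds plan."""
    return error_rate > MAX_ERROR_RATE or affected_hosts > planned_hosts


def next_stage(current: str, rolled_back: bool) -> Optional[str]:
    """Advance one stage only after a clean run; otherwise stop."""
    if rolled_back:
        return None  # fix the weakness before widening the experiment
    position = STAGES.index(current)
    if position + 1 < len(STAGES):
        return STAGES[position + 1]
    return None  # already at the widest stage
```

Keeping the trigger as a pure function makes it easy to test the abort logic itself before any fault is injected.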
Critical Success Factors:
- Controlled experiments with automatic rollback
- Steady state must be measurable
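
One way to make rollback automatic is to wrap the fault in a context manager, so the revert step runs even when a probe raises mid-experiment. This is a sketch only: `fault`, `inject`, and `revert` are hypothetical names, and the two callables stand in for whatever tooling actually applies and removes the failure.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def fault(inject: Callable[[], None],
          revert: Callable[[], None]) -> Iterator[None]:
    """Apply a fault for the duration of a `with` block, then always revert.

    `inject` and `revert` are caller-supplied callables (assumptions here).
    """
    inject()
    try:
        yield
    finally:
        # Runs on normal exit and on any exception raised inside the block.
        revert()
```

A caller would run its steady-state checks inside `with fault(start_fault, stop_fault):` and raise an exception when a rollback trigger fires; the `finally` clause then removes the fault before the exception propagates.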