DevOps
Core Capabilities
Provides expert guidance covering the entire software delivery lifecycle:
- CI/CD Pipeline Design - Automated build, test, and deployment workflows
- Infrastructure as Code - Cloud resource provisioning with Terraform, CloudFormation, Bicep
- Container Orchestration - Docker and Kubernetes deployment patterns
- Deployment Strategies - Blue-green, canary, and rolling deployments
- Monitoring & Observability - Metrics, logging, alerting with Prometheus, Grafana, ELK
- Configuration Management - Ansible, Chef, Puppet automation
- Security & Compliance - DevSecOps practices and container security
Best Practices
CI/CD
- Keep pipelines fast (< 10 minutes for feedback)
- Fail fast with quick tests first
- Use pipeline as code (version controlled)
- Keep secrets out of the repository: store them in a secret manager and inject them at run time
- Enable artifact caching and parallelize independent jobs
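As an illustration of "fail fast" and "parallelize independent jobs", the sketch below groups jobs into waves that can run concurrently, using the standard-library `graphlib` module (Python 3.9+). The job names and the `PIPELINE` graph are hypothetical; a real CI system would express the same dependencies in its own pipeline file.

```python
from graphlib import TopologicalSorter


def parallel_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group jobs into waves; all jobs in one wave can run concurrently.

    `dependencies` maps each job to the set of jobs it must wait for.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if jobs depend on each other in a loop
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical pipeline: the cheap checks have no dependencies, so they run first.
PIPELINE = {
    "lint": set(),
    "unit-tests": set(),
    "build": {"lint", "unit-tests"},
    "integration-tests": {"build"},
    "deploy-staging": {"integration-tests"},
}

if __name__ == "__main__":
    for number, wave in enumerate(parallel_waves(PIPELINE), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Putting the quick checks in the first wave means a lint or unit-test failure stops the run before the slower build and integration stages start.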
Infrastructure as Code
- Use remote state with locking
- Create reusable modules and pin versions
- Always review plan before apply
- Apply a consistent tagging strategy (for example owner, environment, and cost center) to every resource
- Document resource dependencies
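One way to make "review plan before apply" harder to skip is to flag destructive changes automatically. The sketch below assumes the plan was saved with `terraform plan -out=tfplan` and exported with `terraform show -json tfplan`, and that each entry under `resource_changes` carries a `change.actions` list. The resource addresses in `SAMPLE_PLAN` are invented for illustration.

```python
import json


def destructive_changes(plan: dict) -> list[str]:
    """Return the addresses of resources the plan would delete or replace."""
    flagged = []
    for resource in plan.get("resource_changes", []):
        actions = resource.get("change", {}).get("actions", [])
        # A replacement is reported as both "delete" and "create", so checking
        # for "delete" covers plain deletions and replacements alike.
        if "delete" in actions:
            flagged.append(resource["address"])
    return flagged


# Hypothetical, heavily trimmed plan document used only for this example.
SAMPLE_PLAN = json.loads("""
{
  "resource_changes": [
    {"address": "aws_s3_bucket.logs", "change": {"actions": ["update"]}},
    {"address": "aws_db_instance.main", "change": {"actions": ["delete", "create"]}},
    {"address": "aws_instance.worker", "change": {"actions": ["create"]}}
  ]
}
""")

if __name__ == "__main__":
    for address in destructive_changes(SAMPLE_PLAN):
        print(f"needs manual review: {address}")
```

A check like this complements a human review of the plan; it does not replace one.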
Container Orchestration
- Set resource requests and limits
- Implement health checks (liveness/readiness probes)
- Use pod anti-affinity for high availability
- Enable horizontal pod autoscaling
- Implement proper logging and monitoring
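The first two items above can be checked mechanically before a manifest is applied. The following is a minimal sketch that inspects a Deployment already loaded into a Python dict (for example from YAML) and reports containers without resource requests, limits, or probes. The `DEPLOYMENT` manifest is a trimmed, hypothetical example, and `missing_safeguards` is an illustrative helper, not part of any Kubernetes tooling.

```python
def missing_safeguards(deployment: dict) -> list[str]:
    """List containers that lack resource settings or health probes."""
    problems = []
    containers = deployment["spec"]["template"]["spec"].get("containers", [])
    for container in containers:
        name = container.get("name", "<unnamed>")
        resources = container.get("resources", {})
        for field in ("requests", "limits"):
            if not resources.get(field):
                problems.append(f"{name}: missing resources.{field}")
        for probe in ("livenessProbe", "readinessProbe"):
            if probe not in container:
                problems.append(f"{name}: missing {probe}")
    return problems


# Hypothetical manifest, trimmed to the fields this check reads.
DEPLOYMENT = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 3,
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "web",
                        "image": "example.invalid/web:1.0.0",
                        "resources": {
                            "requests": {"cpu": "100m", "memory": "128Mi"},
                            "limits": {"cpu": "500m", "memory": "256Mi"},
                        },
                        "livenessProbe": {"httpGet": {"path": "/healthz", "port": 8080}},
                        "readinessProbe": {"httpGet": {"path": "/ready", "port": 8080}},
                    },
                    {
                        "name": "sidecar",
                        "image": "example.invalid/sidecar:1.0.0",
                        "resources": {"requests": {"cpu": "50m"}},
                        "livenessProbe": {"tcpSocket": {"port": 9000}},
                    },
                ]
            }
        },
    },
}

if __name__ == "__main__":
    for problem in missing_safeguards(DEPLOYMENT):
        print(problem)
```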
Deployment
- Use rolling updates to deploy without downtime
- Implement proper health checks and rollback capabilities
- Use canary/blue-green for critical applications
- Test thoroughly in staging environments
- Monitor post-deployment metrics
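Canary releases and post-deployment monitoring meet in the promotion decision. The sketch below shows one possible gate that compares the canary's error rate with the baseline's. The thresholds (`min_requests`, `max_ratio`, `tolerance`) are assumed values chosen for illustration, not recommendations, and real gates usually look at latency and saturation as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Request and error counts observed over one measurement window."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_decision(
    baseline: Window,
    canary: Window,
    min_requests: int = 500,   # assumed minimum sample size before deciding
    max_ratio: float = 1.5,    # assumed: canary may be at most 1.5x the baseline
    tolerance: float = 0.001,  # assumed floor so a zero-error baseline is not too strict
) -> str:
    """Return "wait", "rollback", or "promote" for the canary."""
    if canary.requests < min_requests:
        return "wait"  # not enough traffic yet to judge
    allowed = max(baseline.error_rate * max_ratio, tolerance)
    if canary.error_rate > allowed:
        return "rollback"
    return "promote"


if __name__ == "__main__":
    baseline = Window(requests=10_000, errors=20)
    print(canary_decision(baseline, Window(requests=1_000, errors=2)))
    print(canary_decision(baseline, Window(requests=1_000, errors=10)))
```

Returning "wait" for small samples keeps the gate from promoting or rolling back on a handful of requests.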
Security
- Run containers as non-root with read-only root filesystems
- Scan images for vulnerabilities regularly
- Implement network policies and secrets management
- Enable pod security standards and least privilege access
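The container hardening items can be linted in the same spirit. This sketch checks a pod spec (as a Python dict) for the Kubernetes `securityContext` fields `runAsNonRoot`, `readOnlyRootFilesystem`, and `allowPrivilegeEscalation`. The `POD_SPEC` sample and the `security_findings` helper are hypothetical, and treating an unset `allowPrivilegeEscalation` as "allowed" is a deliberately conservative assumption.

```python
def security_findings(pod_spec: dict) -> list[str]:
    """Report containers that miss basic hardening settings."""
    findings = []
    pod_context = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        context = container.get("securityContext", {})
        # runAsNonRoot can be set for the whole pod and overridden per container.
        run_as_non_root = context.get(
            "runAsNonRoot", pod_context.get("runAsNonRoot", False)
        )
        if not run_as_non_root:
            findings.append(f"{name}: runAsNonRoot is not enabled")
        if not context.get("readOnlyRootFilesystem", False):
            findings.append(f"{name}: root filesystem is writable")
        # Conservative assumption: unset is treated the same as allowed.
        if context.get("allowPrivilegeEscalation", True):
            findings.append(f"{name}: privilege escalation is allowed")
    return findings


# Hypothetical pod spec, trimmed to the fields this check reads.
POD_SPEC = {
    "securityContext": {"runAsNonRoot": True},
    "containers": [
        {
            "name": "api",
            "securityContext": {
                "readOnlyRootFilesystem": True,
                "allowPrivilegeEscalation": False,
            },
        },
        {"name": "worker", "securityContext": {"runAsNonRoot": False}},
    ],
}

if __name__ == "__main__":
    for finding in security_findings(POD_SPEC):
        print(finding)
```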
Monitoring
- Collect metrics using RED/USE methods
- Emit structured logs and alert only on meaningful, actionable conditions
- Create actionable dashboards and monitor SLIs/SLOs
- Set up distributed tracing for microservices
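To make "monitor SLIs/SLOs" concrete, here is a small sketch of an error-budget calculation for a request-based availability SLO. The function name and the sample numbers are illustrative assumptions; in practice the counts would come from the metrics backend.

```python
def error_budget_remaining(
    slo_target: float, total_requests: int, failed_requests: int
) -> float:
    """Return the unspent fraction of the error budget.

    1.0 means no budget has been used, 0.0 means it is exhausted,
    and a negative value means the SLO has been violated.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # Hypothetical month: 99.9% target, one million requests, 250 failures.
    remaining = error_budget_remaining(0.999, 1_000_000, 250)
    print(f"error budget remaining: {remaining:.0%}")
```

With a 99.9% target, one million requests allow roughly 1,000 failures, so 250 failures leave about three quarters of the budget. A remaining budget near zero is a common signal to slow down releases.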
Detailed References
Load reference files based on specific needs:
- CI/CD Pipeline Design: See cicd-pipeline-design.md for:
  - GitHub Actions, GitLab CI, Jenkins pipeline examples
  - Automated build, test, deploy workflow patterns
  - Pipeline optimization and caching strategies
- Infrastructure as Code: See infrastructure-as-code.md for:
  - Terraform, CloudFormation, Bicep patterns
  - AWS, GCP, Azure resource provisioning
  - Module design and state management
- Container Orchestration: See container-orchestration.md for:
  - Kubernetes manifests, Helm charts, Kustomize
  - Docker best practices and multi-stage builds
  - Service mesh and networking patterns
- Deployment Strategies: See deployment-strategies.md for:
  - Blue-green deployment implementation
  - Canary release patterns with traffic splitting
  - Rolling update strategies and rollback procedures
- Monitoring & Observability: See monitoring-and-observability.md for:
  - Prometheus, Grafana setup and configuration
  - ELK stack deployment and log aggregation
  - Alert rules, dashboards, and SLO definitions
- Security Best Practices: See security-best-practices.md for:
  - DevSecOps pipeline integration
  - Container security scanning and hardening
  - Secret management and compliance validation
- Configuration Management: See configuration-management.md for:
  - Ansible playbooks, Chef recipes, Puppet manifests
  - Server configuration automation patterns
  - Infrastructure drift detection
- Common Commands: See common-commands.md for:
  - Kubernetes kubectl command reference
  - Docker CLI operations
  - Terraform and cloud provider CLI commands
- Troubleshooting: See troubleshooting-guide.md for:
  - Common issues and resolution steps
  - Debugging techniques for containers and orchestration
  - Performance optimization strategies