waf-assessment
Well-Architected Framework Assessment Skill
Evaluate Azure architectures against Microsoft's Well-Architected Framework (WAF) five pillars to identify strengths, risks, and improvement opportunities.
When to Use
- Assess existing or proposed Azure architectures
- Validate designs meet WAF best practices
- Identify architectural risks and gaps
- Provide scored assessment with recommendations
- Review before production deployment
Five Pillars Overview
1. Reliability
Ability of the system to recover from failures and continue functioning.
- Focus: Availability, resiliency, disaster recovery, health monitoring
2. Security
Protecting applications and data from threats.
- Focus: Identity, network security, encryption, secrets management
3. Cost Optimization
Managing costs to maximize value delivered.
- Focus: Right-sizing, reserved instances, monitoring, waste elimination
4. Operational Excellence
Operations processes that keep a system running in production.
- Focus: IaC, CI/CD, monitoring, incident response, automation
5. Performance Efficiency
Ability of a system to adapt to changes in load.
- Focus: Scaling, caching, CDN, resource selection, optimization
Assessment Process
Step 1: Analyze Architecture
Review the architecture for each pillar:
Reliability Checklist:
- Availability Zones configured?
- Multi-region deployment for critical workloads?
- Health checks and monitoring configured?
- Auto-healing and circuit breakers implemented?
- Backup strategy defined with RPO/RTO?
- Disaster recovery plan documented?
Security Checklist:
- Managed identities used (no credentials in code)?
- Private endpoints for PaaS services?
- HTTPS only with TLS 1.2+?
- Network security groups with least privilege?
- Key Vault for secrets management?
- Azure AD authentication and RBAC configured?
- Data encrypted at rest and in transit?
Cost Optimization Checklist:
- Resources right-sized for actual usage?
- Auto-scaling configured?
- Reserved instances considered for predictable workloads?
- Storage tiering implemented (Hot/Cool/Archive)?
- Unused resources identified?
- Cost monitoring and alerts configured?
Operational Excellence Checklist:
- Infrastructure as Code (Bicep/Terraform)?
- CI/CD pipelines implemented?
- Application Insights for telemetry?
- Centralized logging (Log Analytics)?
- Alerts configured for critical scenarios?
- Deployment automation and rollback?
Performance Efficiency Checklist:
- CDN for static content?
- Caching strategy (Redis, CDN)?
- Asynchronous processing for long operations?
- Appropriate compute SKUs selected?
- Auto-scaling rules defined?
- Performance testing completed?
Step 2: Score Each Pillar
Use 0-100 scoring system:
Scoring Criteria:
- 80-100 (Excellent): Meets all best practices, production-ready
- 60-79 (Good): Meets most practices, minor gaps
- 40-59 (Fair): Some practices missing, moderate risk
- 20-39 (Poor): Many gaps, significant improvements needed
- 0-19 (Critical): Major gaps, not production-ready
Step 3: Provide Recommendations
For each identified gap:
- Finding: What's missing or problematic
- Risk: Impact if not addressed
- Recommendation: Specific action to take
- Priority: Critical / High / Medium / Low
- Effort: Hours or days to implement
Assessment Output Format
# Well-Architected Framework Assessment
**Architecture**: [Name]
**Assessment Date**: [Date]
**Overall Score**: [Average of 5 pillars]/100
## Executive Summary
[2-3 sentences on overall health, key strengths, top risks]
## Pillar Scores
| Pillar | Score | Status |
|--------|-------|--------|
| Reliability | 75/100 | 🟢 Good |
| Security | 65/100 | 🟡 Fair |
| Cost Optimization | 80/100 | 🟢 Good |
| Operational Excellence | 70/100 | 🟡 Fair |
| Performance Efficiency | 85/100 | 🟢 Excellent |
| **Overall** | **75/100** | **🟢 Good** |
---
## 1. Reliability (75/100) - 🟢 Good
### Strengths
Availability Zones configured for App Service and Azure SQL
Health checks implemented with automatic failover
Backup strategy defined (RPO: 1 hour, RTO: 4 hours)
### Gaps & Recommendations
#### Finding #1: No Multi-Region Deployment
**Risk**: Regional outage causes complete service unavailability
**Recommendation**: Implement active-passive multi-region with Azure Front Door
**Priority**: High
**Effort**: 3-5 days
**Implementation**: Deploy secondary region (West US), configure Azure Front Door with priority routing
#### Finding #2: Missing Circuit Breaker Pattern
**Risk**: Cascading failures when dependencies are degraded
**Recommendation**: Implement circuit breaker using Polly library
**Priority**: Medium
**Effort**: 1-2 days
---
## 2. Security (65/100) - 🟡 Fair
### Strengths
Azure AD authentication configured
HTTPS enforced with TLS 1.2
Key Vault used for connection strings
### Gaps & Recommendations
#### Finding #1: Service Principal Used Instead of Managed Identity
**Risk**: Credential rotation required, potential secret exposure
**Recommendation**: Replace service principal with system-assigned managed identity
**Priority**: Critical
**Effort**: 4 hours
**Implementation**:
1. Enable managed identity on App Service
2. Grant RBAC permissions to SQL and Key Vault
3. Remove service principal credentials
#### Finding #2: No Private Endpoints
**Risk**: PaaS services exposed to public internet
**Recommendation**: Implement private endpoints for SQL, Storage, Key Vault
**Priority**: High
**Effort**: 1 day
---
## 3-5. [Remaining Pillars Follow Same Structure]
---
## Priority Roadmap
### Critical (Fix Immediately)
1. Replace service principal with managed identity
2. Implement private endpoints for PaaS services
### High (Next 30 Days)
3. Multi-region deployment (active-passive)
4. Infrastructure as Code implementation
5. Implement comprehensive alerting
### Medium (Next 90 Days)
6. Circuit breaker pattern
7. Reserved instances for predictable workloads
8. Performance testing automation
### Low (Future Enhancements)
9. Chaos engineering tests
10. Additional caching layers
---
## Cost Impact Summary
- **Savings Opportunities**: ~$480/month (right-sizing, reserved instances)
- **Security Enhancements**: +$200/month (private endpoints)
- **Multi-Region**: +$850/month (passive region infrastructure)
- **Net Impact**: +$570/month for significantly improved resilience and security
---
## Conclusion
[Summary of assessment with key takeaways and prioritized next steps]
Tips for Effective Assessments
Be Specific: Reference exact resources and configurations Quantify Risk: Use concrete examples of potential impact Actionable Recommendations: Provide implementation steps, not just principles Prioritize Ruthlessly: Help teams focus on what matters most Show Business Impact: Connect technical gaps to business risks Include Quick Wins: Balance strategic improvements with fast fixes Cost-Aware: Show ROI for recommendations (cost vs benefit)
Avoid: Generic advice, overwhelming lists, missing priorities, theoretical recommendations
More from thomast1906/github-copilot-agent-skills
drawio-mcp-diagramming
Create and edit architecture diagrams using Draw.io MCP (`drawio/create_diagram`) with reliable Azure and AWS icon rendering guidance and troubleshooting. Supports Azure2 and AWS4 icon libraries. Requires Python 3 and internet access to refresh icon catalogs (periodic, not per-run).
13api-security-review
Reviews Azure API Management configurations for security vulnerabilities, OWASP API Security Top 10 compliance, VNet Internal mode validation, Private Link verification, and Azure Security Benchmark alignment. Use when performing security audits, pre-deployment validation, or compliance reviews.
13architecture-design
Design Azure cloud architectures from requirements and generate High-Level Design (HLD) documentation with service selection, patterns, cost estimates, and WAF alignment. Use this when asked to design or architect Azure solutions.
13azure-apim-architecture
Analyzes and explains Azure API Management architecture decisions for enterprise API marketplace implementations using VNet Internal mode, Front Door, hybrid authentication, and multi-environment strategies. Use when discussing APIM component selection, network topology, cost optimization, or comparing alternatives like workspaces vs instances, VNet Internal vs External mode, or Front Door vs Application Gateway.
11cost-optimization
Analyze Azure architectures for cost optimization opportunities, provide savings recommendations, and calculate ROI for improvements.
11apiops-deployment
Guides deployment of Azure API Management infrastructure using Infrastructure as Code (Bicep/Terraform), CI/CD pipelines (GitHub Actions/Azure DevOps), and APIOps workflows. Use when deploying APIM, creating pipelines, or implementing dev→test→prod promotion strategies.
10