azure-deployment-operations
Azure Deployment Operations
Battle-tested patterns for deploying and operating Azure services in production.
Scope: Inheritable skill. Covers SWA, Container Apps, App Service, security posture, rate limiting, production checklists, and multi-subscription management.
Azure Static Web Apps (SWA)
Environment Deployment
SWA defaults to preview environments. Always specify production explicitly:
# Anti-pattern: deploys to preview environment
swa deploy
# Correct: explicitly target production
swa deploy --env production
Rule: Every SWA deployment command in CI/CD and scripts MUST include --env production unless creating a preview intentionally.
Custom Domain Registration
Custom domains require a two-step process:
| Step | Action | Validates |
|---|---|---|
| 1. CNAME record | Point domain to SWA default hostname | DNS ownership |
| 2. Hostname registration | az staticwebapp hostname set |
Azure binding |
Common error: Adding CNAME but forgetting hostname registration. Both steps are required.
SWA Configuration
{
"navigationFallback": {
"rewrite": "/index.html",
"exclude": ["/api/*", "/assets/*"]
},
"routes": [
{ "route": "/api/*", "allowedRoles": ["authenticated"] }
],
"globalHeaders": {
"X-Content-Type-Options": "nosniff",
"X-Frame-Options": "DENY"
}
}
Azure Container Apps
Health Probe Configuration
Container Apps health probes have specific threshold ranges:
| Parameter | Min | Max | Recommended |
|---|---|---|---|
failureThreshold |
1 | 48 | 3-5 for liveness, 10-30 for startup |
periodSeconds |
1 | 240 | 10 for liveness, 5 for startup |
initialDelaySeconds |
0 | 60 | 0 for liveness (use startup probe instead) |
timeoutSeconds |
1 | 240 | 5 |
Pattern: Use a startup probe with high failure threshold (30) for slow-starting apps instead of a high initialDelaySeconds on the liveness probe.
Container Apps Deployment
# Update with new image
az containerapp update \
--name myapp \
--resource-group myrg \
--image myregistry.azurecr.io/myapp:v1.2.3
# Scale configuration
az containerapp update \
--name myapp \
--resource-group myrg \
--min-replicas 1 \
--max-replicas 10
Azure App Service
11-Step Deployment Pipeline
Typical App Service deployment takes ~7 minutes:
| Step | Duration | Action |
|---|---|---|
| 1 | 5s | Authenticate to Azure |
| 2 | 10s | Validate resource group exists |
| 3 | 30s | Build application |
| 4 | 15s | Run tests |
| 5 | 20s | Package artifacts |
| 6 | 10s | Upload to staging slot |
| 7 | 60s | Warm up staging slot |
| 8 | 5s | Run smoke tests on staging |
| 9 | 30s | Swap staging → production |
| 10 | 10s | Validate production health |
| 11 | 5s | Tag release in source control |
Rule: Always use staging slots for zero-downtime deployment. Direct-to-production deployments cause cold-start downtime.
Production Readiness Checklist
| Category | Requirement | Why |
|---|---|---|
| Compute | P1v3 or higher | Burstable tiers have CPU throttling |
| Networking | VNet integration | Isolate from public internet |
| Data | Private endpoints for storage/DB | No public connection strings |
| Identity | Managed identity (no connection strings) | Eliminates secret rotation |
| Monitoring | Application Insights enabled | Observability |
| Scaling | Auto-scale rules configured | Handle load spikes |
| Backup | Automated backup policy | Disaster recovery |
| SSL | Custom domain + managed certificate | Trust and security |
Security Posture Assessment
Pass/Fail Matrix
Use a matrix to track security controls across all resources:
| Control | App Service | SQL DB | Storage | Key Vault |
|---|---|---|---|---|
| Managed Identity | ✅ | ✅ | ✅ | ✅ |
| Private Endpoint | ✅ | ✅ | ❌ | ✅ |
| Diagnostic Logs | ✅ | ❌ | ✅ | ✅ |
| RBAC (no keys) | ✅ | ✅ | ❌ | ✅ |
| Encryption at Rest | ✅ | ✅ | ✅ | ✅ |
Rule: Any ❌ in the matrix is a tracked remediation item with a priority (P0-P3) and SLA.
Rate Limiting
Spread vs. Burst
For API calls and deployment operations:
| Strategy | Pattern | Use When |
|---|---|---|
| Spread | 10 calls/sec evenly spaced | Sustained throughput |
| Burst | 100 calls then wait | Quick batch operations |
Rule: Prefer spread over burst for production workloads. Azure APIs throttle based on request rate, and burst patterns hit throttle limits earlier than spread patterns with the same total throughput.
Retry Pattern
async function withRetry<T>(
fn: () => Promise<T>,
maxRetries = 3,
baseDelay = 1000
): Promise<T> {
for (let attempt = 0; attempt <= maxRetries; attempt++) {
try {
return await fn();
} catch (err: any) {
if (attempt === maxRetries) throw err;
if (err.statusCode === 429) {
// Use Retry-After header if available
const delay = err.headers?.['retry-after']
? parseInt(err.headers['retry-after']) * 1000
: baseDelay * Math.pow(2, attempt);
await new Promise(r => setTimeout(r, delay));
} else {
throw err; // Don't retry non-throttle errors
}
}
}
throw new Error('Unreachable');
}
Multi-Subscription Management
Folder Organization
For organizations with multiple Azure subscriptions:
azure/
├── production/
│ ├── main.bicep
│ └── parameters.prod.json
├── staging/
│ ├── main.bicep
│ └── parameters.staging.json
├── development/
│ └── parameters.dev.json
└── shared/
├── modules/ # Shared Bicep modules
└── policies/ # Azure Policy definitions
Resource Inventory
Before any infrastructure changes, document what exists:
# List all resources in subscription
az resource list --subscription "My Subscription" \
--output table \
--query "[].{Name:name, Type:type, RG:resourceGroup, Location:location}"
# Export to JSON for diff tracking
az resource list --subscription "My Subscription" -o json > inventory.json
Subscription Documentation Template
Each subscription should have a living document with:
| Section | Content |
|---|---|
| Purpose | What this subscription is for |
| Owner | Team/person responsible |
| Budget | Monthly spend limit and alerts |
| Resources | Link to inventory command output |
| Access | RBAC assignments and justification |
| Networking | VNet topology, peering, DNS zones |
Known Gotchas
Mail.Send on Corporate Tenants
Microsoft 365 corporate tenants often block Mail.Send permission for third-party apps. If your app needs to send email:
- Check tenant admin consent policies first
- Consider
SendMailvia Graph with delegated (not application) permissions - Have a fallback (SMTP, SendGrid) for blocked tenants
- Document the limitation clearly for users
Azure CLI Context
# Always verify which subscription is active
az account show --query "{Name:name, Id:id}" -o table
# Set explicitly before operations
az account set --subscription "Target Subscription"
Rule: Never assume the correct subscription is active. Always verify or set explicitly in scripts.
Infrastructure as Code
Bicep Best Practices for Deployment
| Practice | Why |
|---|---|
| Use modules for reusable components | DRY principle |
| Parameters file per environment | Environment isolation |
@secure() decorator for secrets |
Prevents logging |
existing keyword for references |
No accidental recreation |
| What-if before deploy | Catch unintended changes |
# Always preview changes before deploying
az deployment group what-if \
--resource-group myrg \
--template-file main.bicep \
--parameters @parameters.prod.json