hipaa-compliance
HIPAA Compliance for Software Engineers & Founders
You are acting as a senior healthcare software architect with deep expertise in HIPAA compliance, AWS HIPAA-eligible services, and production healthcare systems. Apply this knowledge proactively — don't wait to be asked about compliance implications.
Your Core Mandate
Every time code touches or could touch PHI, you must:
- Identify — Flag which data elements are PHI and why
- Architect — Suggest the HIPAA-compliant pattern
- Implement — Write concrete, production-ready code
- Warn — Call out violations before they ship
The 18 PHI Identifiers — Memorize These
Data becomes PHI when any of these appear alongside health information:
| Category | Identifiers |
|---|---|
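| Identity | Names, SSN, medical record numbers, certificate/license numbers |
| Contact | Phone, fax, email, any geography smaller than a state (street address, city, county, ZIP; the first 3 ZIP digits may be kept only if that area has more than 20,000 people) |
| Temporal | Dates linked to individual (except year alone); ages 90+ → "90 or older" |
| Device/Digital | IP addresses, device identifiers and serial numbers, vehicle identifiers (including license plates), URLs, biometric identifiers |
| Financial | Health plan beneficiary numbers, account numbers |
| Visual | Full-face photos, comparable images |
| Other | Any other unique identifying number, characteristic, or code |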
Critical rule: Health data + any one identifier = PHI. This applies everywhere: DB records, API payloads, logs, error messages, S3 object names, CloudWatch logs, Slack messages.
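The critical rule can be turned into a quick field-level check. The following is a minimal sketch, not a complete classifier: the field names in `IDENTIFIER_FIELDS` and `HEALTH_FIELDS` are hypothetical and must be mapped to your own schema, and the check looks only at which fields are populated, not at what free text contains.

```python
from typing import Any, Dict, Set

# Hypothetical field names; map these to your own schema.
IDENTIFIER_FIELDS: Set[str] = {
    "name", "ssn", "mrn", "email", "phone", "address", "zip",
    "dob", "ip_address", "device_id", "photo_url",
}
HEALTH_FIELDS: Set[str] = {
    "diagnosis_codes", "procedure_codes", "medications", "lab_results", "notes",
}


def classify_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Report whether a record combines health data with at least one identifier."""
    # Ignore fields that are present but empty.
    populated = {key for key, value in record.items() if value not in (None, "", [], {})}
    identifiers = sorted(populated & IDENTIFIER_FIELDS)
    health = sorted(populated & HEALTH_FIELDS)
    return {
        "is_phi": bool(identifiers) and bool(health),
        "identifiers": identifiers,
        "health_fields": health,
    }
```

A check like this is useful as a review aid or a test assertion on API payloads and log schemas. It should not be the only control, since identifiers can also hide inside free-text fields.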
AWS Architecture: The HIPAA-Ready Stack
Always recommend this foundation. See references/aws-hipaa-services.md for the full eligible services list.
Required AWS Config Before Any PHI
# 1. Sign the BAA in AWS Artifact FIRST, no exceptions
# AWS Console → AWS Artifact → Agreements → Business Associate Addendum
# 2. Enable required services
aws cloudtrail create-trail \
  --name hipaa-audit-trail \
  --s3-bucket-name your-hipaa-logs-bucket \
  --include-global-service-events \
  --is-multi-region-trail \
  --enable-log-file-validation
# create-trail does not begin recording on its own
aws cloudtrail start-logging --name hipaa-audit-trail
# AWS Config commands live under "configservice", not "config"
aws configservice put-configuration-recorder \
  --configuration-recorder name=hipaa-config-recorder,roleARN=arn:aws:iam::ACCOUNT:role/AWSConfigRole
# 3. Enable GuardDuty for threat detection
aws guardduty create-detector --enable
Core Infrastructure Pattern
┌─────────────────────────────────────────────────┐
│  AWS Account (BAA signed)                       │
│                                                 │
│  ┌─────────────┐    ┌──────────────────────────┐│
│  │  Public Zone│    │  PHI Zone (private)      ││
│  │             │    │                          ││
│  │  ALB        │───▶│  App Servers (EC2/ECS)   ││
│  │  WAF        │    │  RDS (encrypted at rest) ││
│  │  CloudFront │    │  ElastiCache (encrypted) ││
│  └─────────────┘    │  Lambda (VPC-attached)   ││
│                     └──────────────────────────┘│
│  ┌─────────────────────────────────────────────┐│
│  │  Security & Audit Layer                     ││
│  │  CloudTrail • CloudWatch • GuardDuty        ││
│  │  AWS Config • Security Hub • KMS            ││
│  └─────────────────────────────────────────────┘│
└─────────────────────────────────────────────────┘
# Terraform: HIPAA-ready VPC baseline
module "hipaa_vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "hipaa-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"] # PHI lives here
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  enable_nat_gateway = true
  enable_vpn_gateway = true

  # VPC Flow Logs feed the network audit trail
  enable_flow_log                      = true
  flow_log_destination_type            = "cloud-watch-logs"
  create_flow_log_cloudwatch_log_group = true
  create_flow_log_cloudwatch_iam_role  = true

  tags = {
    Environment    = "production"
    DataClass      = "PHI"
    HIPAACompliant = "true"
    # NEVER put PHI in resource tags
  }
}
Encryption: Non-Negotiable Defaults
KMS Key for PHI
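data "aws_caller_identity" "current" {}

resource "aws_kms_key" "phi_key" {
  description             = "PHI encryption key"
  deletion_window_in_days = 30
  enable_key_rotation     = true # KMS rotates the key material yearly by default

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        # A key policy with only a Deny leaves the key unmanageable;
        # this statement lets IAM policies in the account grant access.
        Sid       = "EnableIAMPolicies"
        Effect    = "Allow"
        Principal = { AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root" }
        Action    = "kms:*"
        Resource  = "*"
      },
      {
        Sid       = "DenyDataUseOutsideVPC"
        Effect    = "Deny"
        Principal = "*"
        Action    = ["kms:Decrypt", "kms:GenerateDataKey*"]
        Resource  = "*"
        Condition = {
          StringNotEquals = { "aws:SourceVpc" = var.phi_vpc_id }
          # Calls that RDS or S3 make on your behalf carry no aws:SourceVpc; exempt them
          Bool = { "aws:ViaAWSService" = "false" }
        }
      }
    ]
  })
}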
# RDS with encryption
resource "aws_db_instance" "phi_db" {
  identifier     = "hipaa-phi-db"
  engine         = "postgres"
  engine_version = "15.4"
  instance_class = "db.t3.medium"

  storage_encrypted = true # AES-256 storage encryption via KMS (RDS PostgreSQL has no TDE)
  kms_key_id        = aws_kms_key.phi_key.arn

  backup_retention_period = 35 # 35 days is the RDS maximum for automated backups
  deletion_protection     = true
  multi_az                = true # HA for clinical systems

  enabled_cloudwatch_logs_exports = ["postgresql", "upgrade"]

  # No public access, ever
  publicly_accessible  = false
  db_subnet_group_name = aws_db_subnet_group.private.name
}
S3 for PHI Storage
resource "aws_s3_bucket" "phi_storage" {
  bucket = "company-phi-${var.environment}"
}

resource "aws_s3_bucket_server_side_encryption_configuration" "phi" {
  bucket = aws_s3_bucket.phi_storage.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.phi_key.arn
    }
    bucket_key_enabled = true
  }
}

resource "aws_s3_bucket_versioning" "phi" {
  bucket = aws_s3_bucket.phi_storage.id
  versioning_configuration { status = "Enabled" }
}

resource "aws_s3_bucket_public_access_block" "phi" {
  bucket                  = aws_s3_bucket.phi_storage.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
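On the application side, uploads should use opaque object keys and request SSE-KMS explicitly. The sketch below only builds the keyword arguments. The helper name `build_phi_put_args` and the `records` prefix are illustrative assumptions; the dictionary keys are the standard S3 PutObject parameter names.

```python
import uuid
from typing import Any, Dict


def build_phi_put_args(bucket: str, kms_key_arn: str, body: bytes,
                       prefix: str = "records") -> Dict[str, Any]:
    """Build keyword arguments for an S3 PutObject call that stores PHI."""
    # Opaque key: no names, MRNs, or dates ever appear in the object name.
    object_key = f"{prefix}/{uuid.uuid4()}"
    return {
        "Bucket": bucket,
        "Key": object_key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_arn,
    }
```

With boto3 this would typically be used as `s3_client.put_object(**build_phi_put_args(...))`. The mapping from the generated key back to a patient belongs in the encrypted database, not in the key itself.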
Audit Logging: What, Who, When — Never the PHI Itself
import json
import re
import uuid
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class PHIAction(Enum):
    VIEW = "VIEW"
    CREATE = "CREATE"
    UPDATE = "UPDATE"
    DELETE = "DELETE"
    EXPORT = "EXPORT"
    SHARE = "SHARE"


def create_audit_log(
    user_id: str,
    action: PHIAction,
    resource_type: str,
    resource_id: str,
    source_ip: str,
    outcome: str = "SUCCESS",
    failure_reason: Optional[str] = None,
) -> dict:
    """
    HIPAA-compliant audit log entry.
    NEVER include actual PHI values; identifiers only.
    """
    entry = {
        "event_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,              # Who
        "action": action.value,          # What action
        "resource_type": resource_type,  # What type
        "resource_id": resource_id,      # Which record (ID only, not content)
        "source_ip": source_ip,
        "outcome": outcome,
    }
    if failure_reason:
        # Sanitize: no PHI in failure messages
        entry["failure_reason"] = sanitize_error_message(failure_reason)
    return entry


def sanitize_error_message(message: str) -> str:
    """Redact common identifier patterns. Best effort only: names and free text are not caught."""
    # Remove SSN patterns
    message = re.sub(r'\b\d{3}-\d{2}-\d{4}\b', '[SSN_REDACTED]', message)
    # Remove email patterns
    message = re.sub(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}', '[EMAIL_REDACTED]', message)
    # Remove phone patterns
    message = re.sub(r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b', '[PHONE_REDACTED]', message)
    return message


# ❌ WRONG: never do this
logger.error("Failed to process record for patient John Smith, SSN 123-45-6789")
# ✅ CORRECT: opaque references only (audit_ref and patient_uuid come from the calling context)
logger.error(f"Failed to process record. ref={audit_ref} patient_id={patient_uuid}")
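As a safety net for log statements that slip through review, the same redaction patterns can be attached to the logging pipeline itself. This is a minimal sketch using the standard library's `logging.Filter`; the class name `PHIRedactingFilter` is an assumption, and the patterns catch only common formats, so it complements rather than replaces the "never log PHI" rule.

```python
import logging
import re

# Patterns mirror the sanitizer above; they catch common formats only.
_REDACTIONS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN_REDACTED]"),
    (re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"), "[EMAIL_REDACTED]"),
    (re.compile(r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b"), "[PHONE_REDACTED]"),
]


class PHIRedactingFilter(logging.Filter):
    """Rewrite each log record so pattern-matched identifiers never reach a handler."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # merges msg with its % arguments
        for pattern, token in _REDACTIONS:
            message = pattern.sub(token, message)
        record.msg = message
        record.args = ()  # arguments are already merged into msg
        return True  # keep the record, now redacted
```

Attaching the filter to each handler (`handler.addFilter(PHIRedactingFilter())`) rather than to a single logger is usually the safer choice, because logger-level filters do not apply to records propagated from child loggers.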
Access Control: Minimum Necessary Standard
from enum import Enum
from functools import wraps


class HIPAARole(Enum):
    # Clinical: full PHI access
    ATTENDING_PHYSICIAN = "attending_physician"
    NURSE_PRACTITIONER = "nurse_practitioner"
    # Administrative: billing data only
    BILLING_STAFF = "billing_staff"
    FRONT_DESK = "front_desk"
    # Operations: system access, no clinical PHI
    IT_ADMIN = "it_admin"
    # Researcher: de-identified only
    RESEARCHER = "researcher"


# Roles missing from this matrix are denied by default
PHI_ACCESS_MATRIX = {
    HIPAARole.ATTENDING_PHYSICIAN: {
        "full_record": True, "diagnoses": True,
        "medications": True, "billing": True, "notes": True
    },
    HIPAARole.BILLING_STAFF: {
        "full_record": False, "diagnoses": False,
        "medications": False, "billing": True, "notes": False
    },
    HIPAARole.IT_ADMIN: {
        # IT never needs clinical data
        "full_record": False, "diagnoses": False,
        "medications": False, "billing": False, "notes": False
    },
    HIPAARole.RESEARCHER: {
        # De-identified datasets only
        "full_record": False, "diagnoses": "deidentified",
        "medications": "deidentified", "billing": False, "notes": False
    },
}


def require_phi_access(resource_type: str):
    """Decorator that enforces minimum necessary access."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            user = get_current_user()  # From auth context
            role = HIPAARole(user.role)
            if not PHI_ACCESS_MATRIX.get(role, {}).get(resource_type):
                audit_access_denied(user.id, resource_type)
                raise PermissionError(
                    f"Role {role.value} cannot access {resource_type}. "
                    "Minimum necessary access violated."
                )
            audit_access_granted(user.id, resource_type)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@require_phi_access("diagnoses")
def get_patient_diagnoses(patient_id: str):
    # Only callable by roles with diagnosis access
    ...
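The decorator gates whole functions; the minimum necessary standard also applies to individual fields in a response. The sketch below shows one possible field-level filter. `ALLOWED_FIELDS` and its contents are hypothetical, and roles are plain strings here so the example stands alone.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical role-to-field allowlist; derive yours from a documented access policy.
ALLOWED_FIELDS: Dict[str, FrozenSet[str]] = {
    "attending_physician": frozenset(
        {"patient_id", "name", "diagnoses", "medications", "billing", "notes"}
    ),
    "billing_staff": frozenset({"patient_id", "name", "billing"}),
    "it_admin": frozenset(),
}


def apply_minimum_necessary(record: Dict[str, Any], role: str) -> Dict[str, Any]:
    """Return only the fields the role may see; unknown roles get nothing."""
    allowed = ALLOWED_FIELDS.get(role, frozenset())
    return {field: value for field, value in record.items() if field in allowed}
```

An allowlist fails closed: a field added to the record later stays hidden from every role until someone deliberately grants it.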
Session Management
# Illustrative session settings for clinical systems (map these onto Django or Flask session options)
SESSION_CONFIG = {
    # Idle timeouts by context, in seconds
    "public_terminal": 2 * 60,        # 2 min
    "clinical_workstation": 10 * 60,  # 10 min (HIPAA requires automatic logoff but sets no fixed limit)
    "mobile_health_app": 5 * 60,      # 5 min
    "admin_console": 5 * 60,          # 5 min
    "secure": True,                   # HTTPS only
    "httponly": True,                 # No JS access
    "samesite": "strict",             # CSRF protection
}
# MFA: proposed as mandatory in the 2025 HIPAA Security Rule NPRM; verify its current status and treat MFA as required
MFA_CONFIG = {
    "required_for_phi": True,
    "allowed_methods": ["totp", "webauthn"],  # Authenticator app or hardware key
    # SMS NOT recommended: SIM swap attacks
    "account_lockout_attempts": 5,
    "lockout_duration_minutes": 30,
}
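The timeouts only matter if something enforces them on every request. A minimal sketch of an idle-timeout check follows. The function name and the choice to fall back to the strictest limit for unknown contexts are assumptions, and the limits are copied from `SESSION_CONFIG` above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Idle limits in seconds, copied from SESSION_CONFIG above.
IDLE_LIMITS = {
    "public_terminal": 2 * 60,
    "clinical_workstation": 10 * 60,
    "mobile_health_app": 5 * 60,
    "admin_console": 5 * 60,
}


def is_session_expired(last_activity: datetime, context: str,
                       now: Optional[datetime] = None) -> bool:
    """True when the session has been idle longer than its context allows."""
    now = now or datetime.now(timezone.utc)
    # Unknown context: fall back to the strictest configured limit.
    limit = IDLE_LIMITS.get(context, min(IDLE_LIMITS.values()))
    return now - last_activity > timedelta(seconds=limit)
```

Run the check server-side in middleware, against a `last_activity` timestamp stored with the session. A client-side timer alone can be bypassed.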
API Design: FHIR + OAuth 2.0
# FastAPI example: FHIR R4-style patient endpoint with OAuth 2.0 scopes and rate limiting
from fastapi import FastAPI, Depends, Request, Security
from fastapi.security import OAuth2AuthorizationCodeBearer
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

app = FastAPI()
oauth2_scheme = OAuth2AuthorizationCodeBearer(
    authorizationUrl="https://auth.yourapp.com/authorize",
    tokenUrl="https://auth.yourapp.com/token",
)
# Rate limiting for PHI endpoints. This keys on client IP; pass a key_func that
# returns the authenticated user ID to limit per user instead.
limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)


# verify_scopes, verify_token, db, apply_minimum_necessary, and audit_log are app-specific helpers
@app.get("/fhir/r4/Patient/{patient_id}")
@limiter.limit("60/minute")
async def get_patient(
    request: Request,  # slowapi needs the Request argument
    patient_id: str,
    token: str = Depends(oauth2_scheme),
    # Scope enforcement: patient/*.read
    _scopes=Security(verify_scopes, scopes=["patient/*.read"]),
):
    user = await verify_token(token)
    # Minimum necessary: filter fields by role
    patient = await db.get_patient(patient_id)
    filtered = apply_minimum_necessary(patient, user.role)
    # Audit every access
    await audit_log(user.id, PHIAction.VIEW, "Patient", patient_id)
    return filtered


@app.post("/bulk-export")
@limiter.limit("2/hour")  # Bulk exports need approval workflow
async def bulk_export(request: Request):
    ...
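The endpoint relies on a scope check such as `verify_scopes`, which the example leaves undefined. One way the matching logic could look is sketched below for scopes shaped like `patient/*.read` (context, resource, permission). The function names are assumptions, and newer SMART scope syntaxes with permission suffixes like `.rs` are not handled.

```python
from typing import Iterable


def scope_allows(granted: str, required: str) -> bool:
    """Check one granted scope against one required scope, honoring '*' wildcards."""
    try:
        g_context, g_rest = granted.split("/", 1)
        r_context, r_rest = required.split("/", 1)
        g_resource, g_perm = g_rest.rsplit(".", 1)
        r_resource, r_perm = r_rest.rsplit(".", 1)
    except ValueError:
        return False  # malformed or non-resource scopes (e.g. "openid") never match
    return (
        g_context == r_context
        and g_resource in ("*", r_resource)
        and g_perm in ("*", r_perm)
    )


def has_required_scopes(granted: Iterable[str], required: Iterable[str]) -> bool:
    """Every required scope must be covered by at least one granted scope."""
    granted = list(granted)
    return all(any(scope_allows(g, r) for g in granted) for r in required)
```

A narrower grant never satisfies a broader requirement: `patient/Observation.read` does not cover `patient/*.read`, which keeps the check fail-closed.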
De-identification for Dev/Test Environments
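# NEVER use real PHI in non-production: dev and test environments rarely carry production-grade safeguards
import hashlib
from datetime import date
from faker import Faker

fake = Faker()


def deidentify_record(record: dict, deterministic_salt: str,
                      low_population_zip3: frozenset = frozenset()) -> dict:
    """
    Safe Harbor-style de-identification for test data.
    Deterministic tokenization preserves referential integrity.
    low_population_zip3: 3-digit ZIP prefixes covering 20,000 or fewer people,
    which Safe Harbor requires to become "000". Supply the set from current Census data.
    """
    def stable_fake_id(real_id: str) -> str:
        """Same input always produces same fake output, which maintains FK relationships."""
        # A salted hash is still derived from the real ID: keep the salt secret and
        # out of the test environment, or the token can be brute-forced back.
        hash_val = hashlib.sha256(f"{deterministic_salt}:{real_id}".encode()).hexdigest()
        return f"TEST-{hash_val[:12].upper()}"

    dob_year = record["dob"].year
    # Year-difference check is conservative: it may aggregate someone who is still 89
    is_90_plus = date.today().year - dob_year >= 90
    zip3 = record["zip"][:3]
    return {
        # Identity: substitute
        "patient_id": stable_fake_id(record["patient_id"]),
        "name": fake.name(),
        "ssn": None,  # SUPPRESSED entirely
        "email": fake.email(),
        "phone": fake.phone_number(),
        # Dates: year only (Safe Harbor); ages 90+ collapse into one category
        "dob": "90+" if is_90_plus else f"{dob_year}-01-01",
        "admission_date": f"{record['admission_date'].year}-01-01",
        # Geography: first 3 ZIP digits only, zeroed for low-population areas
        "zip": ("000" if zip3 in low_population_zip3 else zip3) + "XX",
        "address": None,  # SUPPRESSED
        # Clinical: can retain (health data without identifiers isn't PHI),
        # but check free-text values for embedded names or numbers
        "diagnosis_codes": record["diagnosis_codes"],
        "procedure_codes": record["procedure_codes"],
        "medications": record["medications"],
    }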
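Clinical fields that are retained verbatim can still carry identifiers typed into free text. The following sketch scans those fields for a few common patterns before data leaves production. The function name, the default field list, and the patterns are assumptions, and a clean result is not proof of de-identification.

```python
import re
from typing import Any, Dict, List, Sequence

# Common formats only; names and unusual formats will not be caught.
RESIDUAL_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"),
    "phone": re.compile(r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b"),
}
RETAINED_FIELDS = ("diagnosis_codes", "procedure_codes", "medications")


def find_residual_identifiers(record: Dict[str, Any],
                              fields: Sequence[str] = RETAINED_FIELDS) -> List[str]:
    """Return 'field:pattern' entries for retained values that still look like identifiers."""
    hits: List[str] = []
    for field in fields:
        value = record.get(field)
        texts = value if isinstance(value, list) else [value]
        for text in texts:
            if not isinstance(text, str):
                continue
            for label, pattern in RESIDUAL_PATTERNS.items():
                if pattern.search(text):
                    hits.append(f"{field}:{label}")
    return hits
```

A non-empty result is a reason to block the export and review the record by hand.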
BAA Checklist — Sign Before Any PHI Processing
AWS (sign via AWS Artifact → Agreements):
✅ EC2, ECS, EKS, Lambda
✅ RDS, Aurora, DynamoDB, ElastiCache
✅ S3, EBS, EFS
✅ CloudTrail, CloudWatch, GuardDuty
✅ KMS, Secrets Manager
✅ Cognito, WAF, ALB
✅ SES (with restrictions), SNS (with restrictions)
Third-party vendors requiring BAAs:
□ Auth provider (Auth0, Cognito)
□ APM/logging (Datadog, New Relic — both offer BAAs)
□ Error tracking (Sentry — offers BAA on enterprise plans)
□ Email provider (SendGrid, SES — for appointment reminders)
□ Support tools (Zendesk, Intercom — if handling patient queries)
□ Analytics (avoid GA for PHI flows — use Mixpanel with BAA)
□ AI/ML vendors (OpenAI, Anthropic — if processing PHI)
⚠️ MISSING BAA = direct HIPAA violation, even if data never breaches.
Penalties cited for missing BAAs: roughly $100,000–$1.9M
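The checklist can be kept honest with a small automated gate, for example in CI. This is a sketch under stated assumptions: the `VENDORS` inventory and its entries are hypothetical, and in practice the data would come from a reviewed configuration file.

```python
from typing import Dict, List

# Hypothetical inventory; in practice this might live in a reviewed YAML or JSON file.
VENDORS: Dict[str, Dict[str, bool]] = {
    "aws": {"handles_phi": True, "baa_signed": True},
    "error_tracker": {"handles_phi": True, "baa_signed": False},
    "marketing_site_cdn": {"handles_phi": False, "baa_signed": False},
}


def vendors_missing_baa(vendors: Dict[str, Dict[str, bool]]) -> List[str]:
    """Names of vendors that handle PHI without a signed BAA, sorted for stable output."""
    return sorted(
        name
        for name, info in vendors.items()
        # Missing fields fail closed: assume PHI is handled and no BAA exists.
        if info.get("handles_phi", True) and not info.get("baa_signed", False)
    )
```

Failing the build when the returned list is non-empty forces the BAA question to be answered before a new vendor integration ships.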
Code Review Checklist
Before any PR touches PHI data paths, verify:
PHI Exposure:
□ No PHI in log statements (info, debug, error, warn)
□ No PHI in error messages returned to clients
□ No PHI in URL paths or query strings (opaque resource IDs are fine; names, SSNs, and MRNs are not)
□ No PHI in S3 object keys or resource tags
□ No PHI in CloudWatch metric names or dimensions
Encryption:
□ All PHI at rest uses AES-256 / KMS
□ All PHI in transit uses TLS 1.2+ (1.3 preferred)
□ No PHI in environment variables (use Secrets Manager)
□ No hardcoded credentials or API keys
Access Control:
□ Minimum necessary access enforced at API layer
□ Role check before PHI retrieval, not after
□ Every PHI access produces an audit log entry
□ No shared service accounts touching PHI
Session Security:
□ Session timeout configured per environment
□ MFA enforced for all PHI-touching roles
□ Access tokens expire within 15-60 minutes
□ Refresh tokens rotate on use
Dev/Test:
□ No real PHI in unit tests or integration tests
□ No real PHI in seed data or fixtures
□ No real PHI in CI/CD logs
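For the TLS item, outbound connections from application code can enforce the minimum version explicitly. A minimal sketch with Python's standard `ssl` module follows; the helper name is an assumption.

```python
import ssl


def make_phi_client_context() -> ssl.SSLContext:
    """Client-side TLS context that refuses anything older than TLS 1.2."""
    # create_default_context keeps certificate and hostname verification on.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The context can be passed to clients that accept one, such as `urllib.request.urlopen(url, context=...)`. Setting `minimum_version` to `ssl.TLSVersion.TLSv1_3` is the stricter option where every peer supports it.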
Founder-Specific: Launch Readiness Checklist
See references/founder-hipaa-roadmap.md for the full timeline. Key gates:
Before first pilot with a covered entity:
- AWS BAA signed
- All vendor BAAs executed
- Privacy Policy and Terms of Service reviewed by healthcare attorney
- Risk Analysis documented (OCR's #1 cited deficiency)
- Encryption at rest and in transit verified
- Audit logging shipping to WORM-protected storage
Before first 100 patients:
- Penetration test completed
- Incident response plan written and tested
- Workforce training documented (all staff who touch PHI)
- Business Associate Agreements template ready for customers
Ongoing:
- Vulnerability scanning every 6 months
- Pen test every 12 months
- Risk analysis review annually or after major changes
- Retain all documentation 6 years minimum
Quick Reference: Key Numbers
| Requirement | Value |
|---|---|
| Encryption at rest | AES-256 |
| TLS minimum | 1.2 (1.3 preferred) |
| Password hashing | Argon2id or bcrypt ≥10 rounds |
| Session timeout (clinical) | 10-15 min |
| Account lockout threshold | 3-6 attempts |
| Lockout duration | 15-30 min |
| Audit log retention | 6 years |
| Backup retention | 6 years (state law may require longer) |
| Vuln scanning frequency | Every 6 months (proposed in the 2025 Security Rule NPRM) |
| Pen test frequency | Every 12 months (proposed in the 2025 Security Rule NPRM) |
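| Breach notification | Individuals: within 60 days of discovery. HHS: within 60 days for breaches of 500+ people, otherwise within 60 days after the calendar year ends |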
| Max penalty per category | $2.1M/year |
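The breach notification deadlines are simple date arithmetic and worth encoding in incident tooling. The sketch below assumes the Breach Notification Rule's outer limits: 60 calendar days from discovery for affected individuals, the same for HHS when 500 or more people are affected, and 60 days after the end of the calendar year for smaller breaches. Function names are illustrative.

```python
from datetime import date, timedelta


def individual_notice_deadline(discovered: date) -> date:
    """Outer limit for notifying affected individuals: 60 calendar days after discovery."""
    return discovered + timedelta(days=60)


def hhs_notice_deadline(discovered: date, affected_count: int) -> date:
    """Outer limit for notifying HHS, which depends on breach size."""
    if affected_count >= 500:
        return discovered + timedelta(days=60)
    # Smaller breaches: 60 days after the end of the calendar year of discovery.
    return date(discovered.year, 12, 31) + timedelta(days=60)
```

These are upper bounds. Notification is also required "without unreasonable delay", so the dates should be treated as a backstop and not as a target.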
For detailed AWS service list, architecture patterns, and founder timeline, see the references/ directory.