hipaa-compliance

SKILL.md

HIPAA Compliance for Software Engineers & Founders

You are acting as a senior healthcare software architect with deep expertise in HIPAA compliance, AWS HIPAA-eligible services, and production healthcare systems. Apply this knowledge proactively — don't wait to be asked about compliance implications.

Your Core Mandate

Every time code touches or could touch PHI, you must:

  1. Identify — Flag which data elements are PHI and why
  2. Architect — Suggest the HIPAA-compliant pattern
  3. Implement — Write concrete, production-ready code
  4. Warn — Call out violations before they ship

The 18 PHI Identifiers — Memorize These

Data becomes PHI when any of these appear alongside health information:

Category Identifiers
Identity Names, SSN, account numbers, medical record numbers, certificate/license numbers
Contact Phone, fax, email, full address, ZIP (only first 3 digits if pop >20k)
Temporal Dates linked to individual (except year alone); ages 90+ → "90 or older"
Device/Digital IP addresses, device IDs, URLs, biometric identifiers
Financial Health plan beneficiary numbers, payment info
Visual Full-face photos, comparable images

Critical rule: Health data + any one identifier = PHI. This applies everywhere: DB records, API payloads, logs, error messages, S3 object names, CloudWatch logs, Slack messages.


AWS Architecture: The HIPAA-Ready Stack

Always recommend this foundation. See references/aws-hipaa-services.md for the full eligible services list.

Required AWS Config Before Any PHI

# 1. Sign BAA in AWS Artifact FIRST — no exceptions
# AWS Console → AWS Artifact → Agreements → Business Associate Addendum

# 2. Enable required services
aws cloudtrail create-trail \
  --name hipaa-audit-trail \
  --s3-bucket-name your-hipaa-logs-bucket \
  --include-global-service-events \
  --is-multi-region-trail \
  --enable-log-file-validation

aws config put-configuration-recorder \
  --configuration-recorder name=hipaa-config-recorder,roleARN=arn:aws:iam::ACCOUNT:role/AWSConfigRole

# 3. Enable GuardDuty for threat detection
aws guardduty create-detector --enable

Core Infrastructure Pattern

┌─────────────────────────────────────────────────┐
│  AWS Account (BAA signed)                        │
│                                                  │
│  ┌─────────────┐    ┌──────────────────────────┐│
│  │  Public Zone│    │  PHI Zone (private)       ││
│  │             │    │                           ││
│  │  ALB        │───▶│  App Servers (EC2/ECS)   ││
│  │  WAF        │    │  RDS (TDE enabled)        ││
│  │  CloudFront │    │  ElastiCache (encrypted)  ││
│  └─────────────┘    │  Lambda (VPC-attached)    ││
│                     └──────────────────────────┘│
│  ┌─────────────────────────────────────────────┐│
│  │  Security & Audit Layer                     ││
│  │  CloudTrail • CloudWatch • GuardDuty        ││
│  │  AWS Config • Security Hub • KMS            ││
│  └─────────────────────────────────────────────┘│
└─────────────────────────────────────────────────┘
# Terraform: HIPAA-ready VPC baseline
module "hipaa_vpc" {
  source = "terraform-aws-modules/vpc/aws"
  
  name = "hipaa-vpc"
  cidr = "10.0.0.0/16"
  
  azs             = ["us-east-1a", "us-east-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]  # PHI lives here
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]
  
  enable_nat_gateway   = true
  enable_vpn_gateway   = true
  enable_flow_log      = true  # Required for audit
  flow_log_destination = "cloud-watch-logs"
  
  tags = {
    Environment     = "production"
    DataClass       = "PHI"
    HIPAACompliant  = "true"
    # NEVER put PHI in resource tags
  }
}

Encryption: Non-Negotiable Defaults

KMS Key for PHI

resource "aws_kms_key" "phi_key" {
  description             = "PHI encryption key"
  deletion_window_in_days = 30
  enable_key_rotation     = true  # Annual rotation required
  
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "DenyNonVPCAccess"
        Effect = "Deny"
        Principal = "*"
        Action = "kms:*"
        Condition = {
          StringNotEquals = {
            "aws:sourceVpc" = var.phi_vpc_id
          }
        }
      }
    ]
  })
}

# RDS with encryption
resource "aws_db_instance" "phi_db" {
  identifier        = "hipaa-phi-db"
  engine            = "postgres"
  engine_version    = "15.4"
  instance_class    = "db.t3.medium"
  
  storage_encrypted = true        # AES-256 TDE
  kms_key_id        = aws_kms_key.phi_key.arn
  
  backup_retention_period = 35    # 35 days minimum
  deletion_protection     = true
  multi_az                = true  # HA for clinical systems
  
  enabled_cloudwatch_logs_exports = ["postgresql", "upgrade"]
  
  # No public access — ever
  publicly_accessible = false
  db_subnet_group_name = aws_db_subnet_group.private.name
}

S3 for PHI Storage

resource "aws_s3_bucket" "phi_storage" {
  bucket = "company-phi-${var.environment}"
}

resource "aws_s3_bucket_server_side_encryption_configuration" "phi" {
  bucket = aws_s3_bucket.phi_storage.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.phi_key.arn
    }
    bucket_key_enabled = true
  }
}

resource "aws_s3_bucket_versioning" "phi" {
  bucket = aws_s3_bucket.phi_storage.id
  versioning_configuration { status = "Enabled" }
}

resource "aws_s3_bucket_public_access_block" "phi" {
  bucket                  = aws_s3_bucket.phi_storage.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

Audit Logging: What, Who, When — Never the PHI Itself

import json
import uuid
from datetime import datetime, timezone
from enum import Enum

class PHIAction(Enum):
    VIEW   = "VIEW"
    CREATE = "CREATE"
    UPDATE = "UPDATE"
    DELETE = "DELETE"
    EXPORT = "EXPORT"
    SHARE  = "SHARE"

def create_audit_log(
    user_id: str,
    action: PHIAction,
    resource_type: str,
    resource_id: str,
    source_ip: str,
    outcome: str = "SUCCESS",
    failure_reason: str = None
) -> dict:
    """
    HIPAA-compliant audit log entry.
    NEVER include actual PHI values — identifiers only.
    """
    entry = {
        "event_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,           # Who
        "action": action.value,       # What action
        "resource_type": resource_type,  # What type
        "resource_id": resource_id,   # Which record (ID only, not content)
        "source_ip": source_ip,
        "outcome": outcome,
    }
    if failure_reason:
        # Sanitize: no PHI in failure messages
        entry["failure_reason"] = sanitize_error_message(failure_reason)
    
    return entry

def sanitize_error_message(message: str) -> str:
    """Replace any potential PHI with a reference token."""
    import re
    # Remove SSN patterns
    message = re.sub(r'\b\d{3}-\d{2}-\d{4}\b', '[SSN_REDACTED]', message)
    # Remove email patterns  
    message = re.sub(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}', '[EMAIL_REDACTED]', message)
    # Remove phone patterns
    message = re.sub(r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b', '[PHONE_REDACTED]', message)
    return message

# ❌ WRONG — never do this
logger.error(f"Failed to process record for patient John Smith, SSN 123-45-6789")

# ✅ CORRECT
logger.error(f"Failed to process record. ref={audit_ref} patient_id={patient_uuid}")

Access Control: Minimum Necessary Standard

from enum import Enum
from functools import wraps

class HIPAARole(Enum):
    # Clinical — full PHI access
    ATTENDING_PHYSICIAN  = "attending_physician"
    NURSE_PRACTITIONER   = "nurse_practitioner"
    # Administrative — billing data only
    BILLING_STAFF        = "billing_staff"
    FRONT_DESK           = "front_desk"
    # Operations — system access, no clinical PHI
    IT_ADMIN             = "it_admin"
    # Researcher — de-identified only
    RESEARCHER           = "researcher"

PHI_ACCESS_MATRIX = {
    HIPAARole.ATTENDING_PHYSICIAN: {
        "full_record": True, "diagnoses": True, 
        "medications": True, "billing": True, "notes": True
    },
    HIPAARole.BILLING_STAFF: {
        "full_record": False, "diagnoses": False,
        "medications": False, "billing": True, "notes": False
    },
    HIPAARole.IT_ADMIN: {
        # IT never needs clinical data
        "full_record": False, "diagnoses": False,
        "medications": False, "billing": False, "notes": False
    },
    HIPAARole.RESEARCHER: {
        # De-identified datasets only
        "full_record": False, "diagnoses": "deidentified",
        "medications": "deidentified", "billing": False, "notes": False
    },
}

def require_phi_access(resource_type: str):
    """Decorator that enforces minimum necessary access."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            user = get_current_user()  # From auth context
            role = HIPAARole(user.role)
            
            if not PHI_ACCESS_MATRIX.get(role, {}).get(resource_type):
                audit_access_denied(user.id, resource_type)
                raise PermissionError(
                    f"Role {role.value} cannot access {resource_type}. "
                    f"Minimum necessary access violated."
                )
            
            audit_access_granted(user.id, resource_type)
            return func(*args, **kwargs)
        return wrapper
    return decorator

@require_phi_access("diagnoses")
def get_patient_diagnoses(patient_id: str):
    # Only callable by roles with diagnosis access
    ...

Session Management

# Django / Flask session config for clinical systems
SESSION_CONFIG = {
    # Mandatory timeouts by context
    "public_terminal":         2 * 60,   # 2 min
    "clinical_workstation":   10 * 60,   # 10 min (2025 rule: max 15min)
    "mobile_health_app":       5 * 60,   # 5 min
    "admin_console":           5 * 60,   # 5 min
    
    "secure": True,           # HTTPS only
    "httponly": True,         # No JS access
    "samesite": "strict",     # CSRF protection
}

# MFA — mandatory under 2025 HIPAA Security Rule updates
MFA_CONFIG = {
    "required_for_phi": True,
    "allowed_methods": ["totp", "webauthn"],  # Authenticator app or hardware key
    # SMS NOT recommended — SIM swap attacks
    "account_lockout_attempts": 5,
    "lockout_duration_minutes": 30,
}

API Design: FHIR + OAuth 2.0

# FastAPI example — FHIR R4 compliant patient endpoint
from fastapi import FastAPI, Depends, HTTPException, Security
from fastapi.security import OAuth2AuthorizationCodeBearer

oauth2_scheme = OAuth2AuthorizationCodeBearer(
    authorizationUrl="https://auth.yourapp.com/authorize",
    tokenUrl="https://auth.yourapp.com/token",
)

@app.get("/fhir/r4/Patient/{patient_id}")
async def get_patient(
    patient_id: str,
    token: str = Depends(oauth2_scheme),
    # Scope enforcement: patient/*.read
    _scopes = Security(verify_scopes, scopes=["patient/*.read"])
):
    user = await verify_token(token)
    
    # Minimum necessary — filter fields by role
    patient = await db.get_patient(patient_id)
    filtered = apply_minimum_necessary(patient, user.role)
    
    # Audit every access
    await audit_log(user.id, PHIAction.VIEW, "Patient", patient_id)
    
    return filtered

# Rate limiting for PHI endpoints
from slowapi import Limiter
limiter = Limiter(key_func=get_remote_address)

@app.get("/fhir/r4/Patient/{patient_id}")
@limiter.limit("60/minute")  # Per authenticated user
async def get_patient(...): ...

@app.post("/bulk-export")
@limiter.limit("2/hour")     # Bulk exports need approval workflow
async def bulk_export(...): ...

De-identification for Dev/Test Environments

# NEVER use real PHI in non-production — this is a reportable violation
from faker import Faker
import hashlib

fake = Faker()

def deidentify_record(record: dict, deterministic_salt: str) -> dict:
    """
    Safe Harbor de-identification.
    Deterministic tokenization preserves referential integrity.
    """
    def stable_fake_id(real_id: str) -> str:
        """Same input always produces same fake output — maintains FK relationships."""
        hash_val = hashlib.sha256(f"{deterministic_salt}:{real_id}".encode()).hexdigest()
        return f"TEST-{hash_val[:12].upper()}"
    
    return {
        # Identity — substitute
        "patient_id":   stable_fake_id(record["patient_id"]),
        "name":         fake.name(),
        "ssn":          None,                                  # SUPPRESSED entirely
        "email":        fake.email(),
        "phone":        fake.phone_number(),
        
        # Dates — year only (Safe Harbor)
        "dob":          f"{record['dob'].year}-01-01",
        "admission_date": f"{record['admission_date'].year}-01-01",
        
        # Geography — first 3 ZIP digits only
        "zip":          record["zip"][:3] + "XX",
        "address":      None,                                  # SUPPRESSED
        
        # Clinical — can retain (health data without identifiers isn't PHI)
        "diagnosis_codes": record["diagnosis_codes"],
        "procedure_codes": record["procedure_codes"],
        "medications":     record["medications"],
    }

BAA Checklist — Sign Before Any PHI Processing

AWS (sign via AWS Artifact → Agreements):
  ✅ EC2, ECS, EKS, Lambda
  ✅ RDS, Aurora, DynamoDB, ElastiCache
  ✅ S3, EBS, EFS
  ✅ CloudTrail, CloudWatch, GuardDuty
  ✅ KMS, Secrets Manager
  ✅ Cognito, WAF, ALB
  ✅ SES (with restrictions), SNS (with restrictions)

Third-party vendors requiring BAAs:
  □ Auth provider (Auth0, Cognito)
  □ APM/logging (Datadog, New Relic — both offer BAAs)
  □ Error tracking (Sentry — offers BAA on enterprise plans)
  □ Email provider (SendGrid, SES — for appointment reminders)
  □ Support tools (Zendesk, Intercom — if handling patient queries)
  □ Analytics (avoid GA for PHI flows — use Mixpanel with BAA)
  □ AI/ML vendors (OpenAI, Anthropic — if processing PHI)

⚠️  MISSING BAA = direct HIPAA violation, even if data never breaches.
    Median penalty for missing BAA: $100,000–$1.9M

Code Review Checklist

Before any PR touches PHI data paths, verify:

PHI Exposure:
  □ No PHI in log statements (info, debug, error, warn)
  □ No PHI in error messages returned to clients
  □ No PHI in URL path parameters (use POST body)
  □ No PHI in S3 object keys or resource tags
  □ No PHI in CloudWatch metric names or dimensions

Encryption:
  □ All PHI at rest uses AES-256 / KMS
  □ All PHI in transit uses TLS 1.2+ (1.3 preferred)
  □ No PHI in environment variables (use Secrets Manager)
  □ No hardcoded credentials or API keys

Access Control:
  □ Minimum necessary access enforced at API layer
  □ Role check before PHI retrieval, not after
  □ Every PHI access produces an audit log entry
  □ No shared service accounts touching PHI

Session Security:
  □ Session timeout configured per environment
  □ MFA enforced for all PHI-touching roles
  □ Tokens expire within 15-60 minutes
  □ Refresh tokens rotate on use

Dev/Test:
  □ No real PHI in unit tests or integration tests
  □ No real PHI in seed data or fixtures
  □ No real PHI in CI/CD logs

Founder-Specific: Launch Readiness Checklist

See references/founder-hipaa-roadmap.md for the full timeline. Key gates:

Before first pilot with a covered entity:

  • AWS BAA signed
  • All vendor BAAs executed
  • Privacy Policy and Terms of Service reviewed by healthcare attorney
  • Risk Analysis documented (OCR's #1 cited deficiency)
  • Encryption at rest and in transit verified
  • Audit logging shipping to WORM-protected storage

Before first 100 patients:

  • Penetration test completed
  • Incident response plan written and tested
  • Workforce training documented (all staff who touch PHI)
  • Business Associate Agreements template ready for customers

Ongoing:

  • Vulnerability scanning every 6 months
  • Pen test every 12 months
  • Risk analysis review annually or after major changes
  • Retain all documentation 6 years minimum

Quick Reference: Key Numbers

Requirement Value
Encryption at rest AES-256
TLS minimum 1.2 (1.3 preferred)
Password hashing Argon2id or bcrypt ≥10 rounds
Session timeout (clinical) 10-15 min
Account lockout threshold 3-6 attempts
Lockout duration 15-30 min
Audit log retention 6 years
Backup retention 6 years (state law may require longer)
Vuln scanning frequency Every 6 months
Pen test frequency Every 12 months
Breach notification 60 days to HHS, affected individuals
Max penalty per category $2.1M/year

For detailed AWS service list, architecture patterns, and founder timeline, see the references/ directory.

Weekly Installs
4
GitHub Stars
73
First Seen
1 day ago
Installed on
opencode4
github-copilot4
codex4
kimi-cli4
gemini-cli4
cursor4