privacy-assessment-rails
This skill contains shell command directives (!`command`) that may execute system commands. Review carefully before installing.
Privacy Assessment — Full Codebase Analysis
Prerequisite
Dependency path: !test -d "${CLAUDE_SKILL_DIR}/../privacy-by-design-rails" && echo "${CLAUDE_SKILL_DIR}/../privacy-by-design-rails" || echo "NOT_FOUND"
If the dependency path above is NOT_FOUND, stop and tell the user:
This skill depends on privacy-by-design-rails, which is not installed. Install it with:
npx skills add codeminer42/skills --skill privacy-by-design-rails (add `-g` for global installation)
Do not proceed until the dependency is installed.
Before starting, warn the user: "This assessment is token-intensive — it runs a scanner, then reads and analyzes many files across your codebase. Are you sure you want to proceed?" Wait for confirmation before continuing.
Step 0: Context Gathering
Before running any analysis, ask the user:
Before I begin the assessment, are there any privacy particularities I should know about this application? For example:
- Data that would normally be considered PII but is required by regulation (e.g., storing full IP addresses for legal compliance, retaining financial records for tax purposes)
- Special categories of data the app handles (health data, biometric data, data from minors)
- Third-party services that are contractually authorized to receive certain PII
- Known trade-offs or accepted risks the team has already evaluated
- Any other domain-specific context that might affect the assessment
- What markets/countries does this application serve? (e.g., "Brazil only", "EU and US", "Global", specific country names). This determines which privacy regulations and government document types are relevant for PII classification.
Wait for the user's response. Store their answers as context — this may affect severity ratings, whether certain findings are flagged, and what recommendations are appropriate.
The generated report must begin with the following note, placed before the Disclaimer:
Note: This report was generated by an AI assistant. You can discuss any finding with the agent — ask for more context, challenge a severity rating, provide additional information about your use case, or request that findings be re-evaluated. The agent can also help you prioritize and implement the recommended fixes.
The generated report must also include the following disclaimer, placed before the Executive Summary:
Disclaimer: This report is a technical codebase analysis produced by an automated tool and AI assistant. It does not constitute legal advice, has no legal validity, and does not bind Codeminer42 or any party to its findings. This report should not be used in court or as evidence of compliance or non-compliance. All legal matters related to data protection, GDPR, LGPD, or any other privacy regulation should be discussed with a qualified lawyer.
Step 1: Scanner Baseline
If ruby is not found when running the commands below, ask the user how to run Ruby in their environment instead of trying to resolve it yourself.
Run the scanner to get a mechanical baseline. Include the --markets flag with the markets the user provided in Step 0 (e.g., br for Brazil, eu for EU, us for US — comma-separated for multiple):
ruby ${CLAUDE_SKILL_DIR}/../privacy-by-design-rails/scripts/scanner.rb --markets <markets>
The scanner outputs JSON. Parse it to extract:
- `findings` — pre-filled findings with severity, location, and confidence. Flag any medium-confidence findings for verification in Step 2.
- `inventory` — the scanner enumerates models, mailers, mailer templates, jobs, controllers, services, initializers, schema tables with PII columns, ransackable models, audit declarations, external API calls, and JSON endpoints. Use this inventory as the starting point for A3 in Step 2a — only supplement with manual discovery if you notice gaps.
- `checklist` — boolean pass/fail/null for each privacy check. You will update these during Step 2.
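As a sketch, parsing the scanner output might look like this in Ruby (the key names and finding fields below are assumptions about the JSON shape, not the scanner's documented schema):

```ruby
require "json"

# Hypothetical scanner output — "findings", "inventory", "checklist", and
# "confidence" mirror the description above but the exact shape is assumed.
raw = <<~JSON
  {
    "findings": [
      {"rule": "unencrypted-pii", "severity": "critical",
       "location": "app/models/user.rb:3", "confidence": "high"},
      {"rule": "ransackable-pii", "severity": "medium",
       "location": "app/models/admin_user.rb:10", "confidence": "medium"}
    ],
    "inventory": {"models": ["User", "AdminUser"], "mailers": ["UserMailer"]},
    "checklist": {"filter_parameters_complete": false}
  }
JSON

report = JSON.parse(raw)

# Medium-confidence findings get flagged for manual verification in Step 2.
needs_verification = report["findings"].select { |f| f["confidence"] == "medium" }
needs_verification.each { |f| puts "VERIFY: #{f['rule']} at #{f['location']}" }
```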
Step 2: Deep-Dive Analysis (Two-Pass)
Read ${CLAUDE_SKILL_DIR}/assessment-reference.md. The analysis proceeds in two mandatory passes:
Step 2a: Discovery Pass
Process checklist items A1 through A4 only. This builds your complete codebase inventory and verification matrix.
- Execute EVERY enumeration command in A3. Do not skip any.
- Build the verification matrix in A4 with a row for every PII model.
- Do NOT write any findings yet. Do NOT skip ahead to analysis.
- Use ultrathink for extended reasoning about PII classification decisions, applying the Mandatory Classification Rules from `references/pii-definition.md`.
When A4 is complete, verify: does every model from the A3 enumeration that has PII columns have a row in the matrix? If not, go back and add the missing rows.
Step 2b: Analysis Pass
Process checklist items B1 through I2 sequentially. For each item:
- Record the outcome: FINDING (with severity), PASS, or N/A
- Update the verification matrix columns as you go (B1 updates `encrypts?`, B3 updates `filter_attributes?`, F1 updates `ransackable reviewed?`)
- Do not move to the next item until the current one is resolved
- Use ultrathink for extended reasoning about what is being checked
Additionally:
- User context: Factor in the user's answers from Step 0. If they said certain data must be stored for regulatory reasons, note it as an accepted trade-off rather than a finding.
- Read the rules in `${CLAUDE_SKILL_DIR}/../privacy-by-design-rails/rules/` and references in `${CLAUDE_SKILL_DIR}/../privacy-by-design-rails/references/` for correct patterns and context.
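For orientation, a model that would pass both the B1 (encryption) and B3 (filter_attributes) checks might look like the following — a hypothetical sketch using standard Rails 7+ APIs, not code from the target app:

```ruby
# app/models/patient.rb (illustrative only — model and column names invented)
class Patient < ApplicationRecord
  # B1: PII encrypted at rest via Active Record Encryption
  encrypts :email, deterministic: true # deterministic keeps the column queryable
  encrypts :diagnosis

  # B3: PII redacted from #inspect and console output
  self.filter_attributes += [:email, :diagnosis]
end
```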
Step 2c: Completeness Gate
Before proceeding to Step 3, verify ALL of the following. If any check fails, go back and fill the gap.
- Every model row in the verification matrix has NO "pending" values
- Every model from A3 with PII columns was checked for: encryption (B1), filter_attributes (B3), ransackable_attributes (F1)
- Every mailer from A3 was checked for: from address (E1), subjects (E2), template bodies (E3)
- Every initializer from A3 was checked for: error reporter config (C1), APM config (C2)
- Every external API call from A3 was checked for PII in payloads (C3)
- Every job/worker from A3 was checked for: log_arguments (D1/D2), PII in perform params (D3)
- Every audit/versioning declaration from A3 was checked for PII in serialized columns (D6)
Only proceed to Step 3 when all checks pass.
Step 3: Assemble Report
Merge scanner findings with your deep-dive findings into a single report.
For every finding, include:
- Rule: rule name from `rules/`
- Location: `file:line`
- Description: what is wrong
- Current code: actual code block from the codebase
- Recommended fix: corrected code block from the rule/reference
- Reference: reference file for deeper context
Important: For every recommended fix that includes code, read the corresponding reference file in ${CLAUDE_SKILL_DIR}/../privacy-by-design-rails/references/ and verify the API calls and configuration patterns are correct. Do not write library API calls from memory — copy patterns from the reference docs.
Finding Granularity
To ensure consistent finding counts across runs, follow these grouping rules:
- Unencrypted PII at rest: One finding per model (not per column). List all affected columns inside the finding. E.g., "User model: email, name, ip_address unencrypted" = 1 finding. Three models with unencrypted PII = 3 findings.
- filter_attributes missing: One finding per model. Three models missing it = 3 findings.
- filter_parameters: Always exactly one finding (the single initializer file), listing all missing fields.
- Error reporters / APM tools: One finding per service (Rollbar = 1, NewRelic = 1).
- External API leaks: One finding per external service (Slack webhooks = 1, OpenAI API = 1).
- Email from:/subject:/body issues: One finding per mailer class (not per method or template). List all affected methods/templates inside the finding.
- Job argument logging: One finding for the base class (`ApplicationJob`), plus one per child that explicitly re-enables. Dismiss children that inherit from a protected base.
- Data minimization (`ransackable_attributes`, strong params): One finding per model. If both AdminUser and Staff expose sensitive `ransackable_attributes`, that's 2 separate findings, not 1 combined finding.
- Audit/log column PII: One finding per audit mechanism (e.g., `audited` gem = 1 finding covering all affected columns).
- Structural safeguards: One finding per missing capability (consent = 1, DSAR = 1, retention = 1).
- Security tooling: One finding per missing tool category (static analysis = 1, dependency scanning = 1, log redaction = 1, IP anonymization = 1).
PII Master List
During Step 2a (Discovery Pass), collect all identified PII fields into a single PII master list. Reuse this exact list consistently across all remediation recommendations — filter_parameters, scrub_fields, filter_attributes, and ransackable_attributes exclusions should all reference the same canonical set of PII field names.
Remediation Defaults
When writing recommended fixes, follow these defaults for consistency:
- `audited` gem: Always recommend an allowlist approach (`audited only: [...]`) over expanding the denylist (`except: [...]`).
- `filter_parameters`: Always list specific field names (e.g., `:email, :name, :cpf`). Never use regex patterns or partial-match wildcards.
- `ransackable_attributes`: Always recommend a minimal allowlist of only the fields genuinely needed for admin search.
- `scrub_fields` (error reporters): Include all fields from the PII master list.
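Taken together, the defaults might be sketched as below — the field names are placeholders for the PII master list built in Step 2a, and the Rollbar initializer assumes that gem is the error reporter in use:

```ruby
# Illustrative snippets only — substitute the actual PII master list.

# config/initializers/filter_parameter_logging.rb — explicit names, no regex
Rails.application.config.filter_parameters += [:email, :name, :cpf]

# app/models/user.rb
class User < ApplicationRecord
  # allowlist the audited columns instead of growing a denylist
  audited only: [:role, :status]

  # minimal allowlist for admin search (Ransack 4+)
  def self.ransackable_attributes(_auth_object = nil)
    %w[id created_at]
  end
end

# config/initializers/rollbar.rb — scrub every field on the PII master list
Rollbar.configure do |config|
  config.scrub_fields |= %i[email name cpf]
end
```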
Authorized External Integrations
When the user confirms an external integration is authorized (Step 0), it must still be a numbered finding — never a separate "Accepted Trade-Offs" or "Context-Dependent" section. Handle as follows:
- The finding exists because PII is being sent to a third party — that fact doesn't change with authorization.
- Severity stays CRITICAL (per the severity table) since PII is leaving the infrastructure. In the description, note that the integration is authorized by the team.
- Focus the recommended fix on data minimization: does the payload include more fields than the service actually needs? Recommend removing unnecessary fields, adding audit logging, and ensuring a Data Processing Agreement (DPA) is in place.
- Do not create ad-hoc sections like "Accepted Trade-Offs", "Context-Dependent Findings", or "Team Evaluation Required". All findings go in the numbered findings list under their severity.
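As a sketch of what the data-minimization recommendation means in practice (the payload shape and notifier are hypothetical):

```ruby
require "json"

# Hypothetical webhook payload: reference the user by an internal ID
# instead of shipping PII to the third party.
user = { id: 42, email: "ana@example.com", name: "Ana" }

bloated   = { text: "New signup: #{user[:name]} <#{user[:email]}>" } # leaks PII
minimized = { text: "New signup", user_ref: user[:id] }              # no email, no name

puts JSON.generate(minimized)
```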
Finding Order
Within each severity level, order findings using these categories. When multiple findings share a category, use alphabetical order by model/service name as the tiebreaker:
- PII to external services (alphabetical by service name)
- Unencrypted PII at rest (alphabetical by model name)
- filter_parameters (always exactly one finding)
- filter_attributes missing (alphabetical by model name)
- Logs/cache/queue — base class first, then children alphabetical, then cache, then session
- Email (alphabetical by mailer class name)
- Audit/versioning columns (alphabetical by mechanism name)
- Structural gaps — consent, then DSAR, then retention (fixed order)
- Tooling gaps — static analysis, dependency scanning, log redaction, IP anonymization (fixed order)
Finding Numbering
Use severity-prefixed sequential numbers: C-1, C-2, ... for critical, H-1, H-2, ... for high, M-1, M-2, ... for medium. Do not use plain sequential numbers (1, 2, 3, ...) or unprefixed labels. Use this format in headings (#### Finding C-1: ...), cross-references, and the recommendations section.
Dismissed findings (false positives from the scanner) should not be counted in the totals. List them in a separate "Dismissed Findings" section after the severity sections. Each dismissed scanner finding must appear as a separate row in the Dismissed Findings table, even if multiple findings share the same dismissal reason. Do not group or summarize dismissed findings — one row per scanner finding.
Report Sections
The final report must use exactly these sections in this order: Header → Note → Disclaimer → Executive Summary → Findings by Severity → Dismissed Findings → Checklist Summary → Recommendations. The Note and Disclaimer (defined in Step 0) are inserted between the Header and Executive Summary. Do not add, rename, or reorganize sections beyond this. Specifically, do not create ad-hoc sections such as "Context-Dependent Findings", "Accepted Trade-Offs", "Team Evaluation Required", "PII Field Inventory", "User-Provided Context", or "Domain Context". User-provided context from Step 0 should inform finding descriptions and recommended fixes, not get its own section. All findings go in the numbered findings list under their severity.
- Executive Summary: X verified findings total (Y critical, Z high, W medium). 2-3 sentences with overall assessment. If false positives were dismissed, add a separate line: "N scanner findings were reviewed and dismissed as false positives — see the Dismissed Findings section for details."
- Checklist Summary: Use these labels for the checklist table: All PII fields encrypted, filter_parameters complete, filter_attributes on models, Job arguments suppressed, No PII in job payloads, No PII in email bodies, No PII in cache, Consent model with PURPOSES, RequiresConsent enforcement, DSAR workflow complete, Exports on-demand (no PII at rest), DSAR vs processing exports separated, force_ssl enabled, logstop configured, IP anonymization, Error reporter scrubbing, Data retention jobs scheduled, Security audit gems present. Update statuses based on your deep-dive (some N/A items may become FAIL; some FAIL items may become PASS after manual verification).
- Recommendations: prioritized action list grouped by severity.
Step 4: Save and Offer Fix Plan
Ask the user where to save the report (default: privacy-assessment.md at project root).
Then offer fix plan strategies:
- Highest priority first — CRITICAL > HIGH > MEDIUM
- Low-hanging fruit — easiest fixes first regardless of severity
- Critical only — just CRITICAL findings
- Everything — all findings ordered by severity
- Custom — user picks specific findings
For encryption changes on existing data, reference ${CLAUDE_SKILL_DIR}/../privacy-by-design-rails/references/encryption.md for brownfield migration phases.
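The reference file defines the authoritative phases; as a generic orientation, the standard Rails mechanism for such a rollout looks roughly like this sketch:

```ruby
# Generic Active Record Encryption brownfield sketch — illustrative only;
# follow references/encryption.md for the skill's actual migration phases.

# Phase 1 (config/application.rb): new writes are encrypted while existing
# plaintext rows remain readable.
config.active_record.encryption.support_unencrypted_data = true

# app/models/user.rb
class User < ApplicationRecord
  encrypts :email, deterministic: true
end

# Phase 2: backfill — re-encrypt existing rows in batches.
User.find_each(&:encrypt) # ActiveRecord::Encryption re-writes attributes encrypted

# Phase 3: once the backfill completes, set support_unencrypted_data = false.
```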