pii-detection-pipeline
Automated PII Detection and Redaction Pipeline
Overview
Automated PII detection is a foundational capability for privacy engineering, enabling organizations to discover, classify, and protect personal data at scale. This skill covers building production-grade PII detection pipelines that combine rule-based pattern matching, machine learning-based Named Entity Recognition (NER), and cloud-native discovery services.
PII Entity Types Catalog
Direct Identifiers
| Entity Type | Examples | Detection Method | Risk Level |
|---|---|---|---|
| PERSON_NAME | "John Smith", "Maria Garcia" | NER model | High |
| EMAIL_ADDRESS | "j.smith@cipherengineeringlabs.com" | Regex pattern | High |
| PHONE_NUMBER | "+1-202-555-0123", "(555) 012-3456" | Regex + validation | High |
| SSN | "123-45-6789" | Regex + structural validation (SSNs have no checksum) | Critical |
| PASSPORT_NUMBER | "AB1234567" | Regex per country format | Critical |
| DRIVER_LICENSE | "D123-4567-8901" | Regex per state/country | Critical |
| CREDIT_CARD | "4111-1111-1111-1111" | Regex + Luhn checksum | Critical |
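
The "regex + validation" approach for direct identifiers can be sketched as follows. This is a minimal illustration, not a production detector: the pattern names, the helper functions (`luhn_valid`, `ssn_plausible`, `find_pii`, `redact`), and the `<ENTITY_TYPE>` placeholder format are all assumptions made for this example. Card candidates are confirmed with the Luhn checksum. SSNs carry no check digit, so the sketch applies structural rules instead (area not 000, 666, or 900-999; group not 00; serial not 0000).

```python
import re

# Hypothetical patterns for illustration; real pipelines usually add
# context checks (nearby keywords, field names) to cut false positives.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
SSN_RE = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13 to 19 digits, optional separators


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def ssn_plausible(area: str, group: str, serial: str) -> bool:
    """Structural SSN check; there is no checksum to verify."""
    if area in ("000", "666") or area.startswith("9"):
        return False
    return group != "00" and serial != "0000"


def find_pii(text: str) -> list[tuple[str, int, int]]:
    """Return (entity_type, start, end) findings sorted by position."""
    findings = []
    for m in EMAIL_RE.finditer(text):
        findings.append(("EMAIL_ADDRESS", m.start(), m.end()))
    for m in SSN_RE.finditer(text):
        if ssn_plausible(*m.groups()):
            findings.append(("SSN", m.start(), m.end()))
    for m in CARD_RE.finditer(text):
        if luhn_valid(m.group()):
            findings.append(("CREDIT_CARD", m.start(), m.end()))
    return sorted(findings, key=lambda f: f[1])


def redact(text: str) -> str:
    """Replace each finding with a typed placeholder such as <SSN>.

    Overlapping findings are not resolved in this sketch.
    """
    # Replace from the end so earlier offsets stay valid.
    for entity, start, end in reversed(find_pii(text)):
        text = text[:start] + f"<{entity}>" + text[end:]
    return text
```

Calling `redact("Card 4111-1111-1111-1111, SSN 123-45-6789")` returns `"Card <CREDIT_CARD>, SSN <SSN>"`. Validating after the regex match, rather than encoding every rule in the pattern, keeps the patterns readable and lets each entity type use its own check.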