prowler-api
When to Use
Use this skill for Prowler-specific patterns:
- Row-Level Security (RLS) / tenant isolation
- RBAC permissions and role checks
- Provider lifecycle and validation
- Celery tasks with tenant context
- Multi-database architecture (4-database setup)
For generic DRF patterns (ViewSets, Serializers, Filters, JSON:API), use django-drf skill.
Critical Rules
- ALWAYS use rls_transaction(tenant_id) when querying outside ViewSet context
- ALWAYS use get_role() before checking permissions (returns FIRST role only)
- ALWAYS use @set_tenant then @handle_provider_deletion decorator order
- ALWAYS use explicit through models for M2M relationships (required for RLS)
- NEVER access Provider.objects without RLS context in Celery tasks
- NEVER bypass RLS by using raw SQL or connection.cursor()
- NEVER use Django's default M2M - RLS requires through models with tenant_id
Note: rls_transaction() accepts both UUID objects and strings - it converts internally via str(value).
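The normalization described in the note can be sketched as follows. This is an illustration only, not Prowler's implementation: `normalize_tenant_id` is a hypothetical helper, and re-parsing the string as a UUID is an added assumption about how malformed input might be rejected.

```python
import uuid
from typing import Union


def normalize_tenant_id(value: Union[uuid.UUID, str]) -> str:
    """Return the canonical string form of a tenant id (hypothetical helper)."""
    text = str(value)  # works for both uuid.UUID objects and plain strings
    # Assumption: re-parse so malformed input fails before it reaches the database.
    return str(uuid.UUID(text))


tenant = uuid.UUID("12345678-1234-5678-1234-567812345678")
from_uuid = normalize_tenant_id(tenant)
from_str = normalize_tenant_id("12345678-1234-5678-1234-567812345678")
```

Both call styles yield the same string, so callers do not need to convert before entering the RLS context.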
Architecture Overview
4-Database Architecture
| Database | Alias | Purpose | RLS |
|---|---|---|---|
| default | prowler_user | Standard API queries | Yes |
| admin | admin | Migrations, auth bypass | No |
| replica | prowler_user | Read-only queries | Yes |
| admin_replica | admin | Admin read replica | No |
# When to use admin (bypasses RLS)
from api.db_router import MainRouter
User.objects.using(MainRouter.admin_db).get(id=user_id) # Auth lookups
# Standard queries use default (RLS enforced)
Provider.objects.filter(connected=True) # Requires rls_transaction context
RLS Transaction Flow
Request → Authentication → BaseRLSViewSet.initial()
                               │
                               ├─ Extract tenant_id from JWT
                               ├─ SET api.tenant_id = 'uuid' (PostgreSQL)
                               └─ All queries now tenant-scoped
Implementation Checklist
When implementing Prowler-specific API features:
| # | Pattern | Reference | Key Points |
|---|---|---|---|
| 1 | RLS Models | api/rls.py | Inherit RowLevelSecurityProtectedModel, add constraint |
| 2 | RLS Transactions | api/db_utils.py | Use rls_transaction(tenant_id) context manager |
| 3 | RBAC Permissions | api/rbac/permissions.py | get_role(), get_providers(), Permissions enum |
| 4 | Provider Validation | api/models.py | validate_<provider>_uid() methods on Provider model |
| 5 | Celery Tasks | tasks/tasks.py, api/decorators.py, config/celery.py | Task definitions, decorators (@set_tenant, @handle_provider_deletion), RLSTask base |
| 6 | RLS Serializers | api/v1/serializers.py | Inherit RLSSerializer to auto-inject tenant_id |
| 7 | Through Models | api/models.py | ALL M2M must use explicit through with tenant_id |
Full file paths: See references/file-locations.md
Decision Trees
Which Base Model?
Tenant-scoped data → RowLevelSecurityProtectedModel
Global/shared data → models.Model + BaseSecurityConstraint (rare)
Partitioned time-series → PostgresPartitionedModel + RowLevelSecurityProtectedModel
Soft-deletable → Add is_deleted + ActiveProviderManager
Which Manager?
Normal queries → Model.objects (excludes deleted)
Include deleted records → Model.all_objects
Celery task context → Must use rls_transaction() first
Which Database?
Standard API queries → default (automatic via ViewSet)
Read-only operations → replica (automatic for GET in BaseRLSViewSet)
Auth/admin operations → MainRouter.admin_db
Cross-tenant lookups → MainRouter.admin_db (use sparingly!)
Celery Task Decorator Order?
@shared_task(base=RLSTask, name="...", queue="...")
@set_tenant(keep_tenant=True)  # First: sets tenant context (keep_tenant=True because my_task takes tenant_id)
@handle_provider_deletion      # Second: handles deleted providers
def my_task(tenant_id, provider_id):
    pass
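Why the order matters can be shown with plain Python decorators. The sketch below uses hypothetical stand-ins (`set_tenant_sketch`, `handle_provider_deletion_sketch`) that only record when they run; they are not Prowler's decorators. The decorator listed first wraps everything beneath it, so its code executes first at call time.

```python
from functools import wraps

calls = []  # records the order in which the wrappers execute


def set_tenant_sketch(func):
    """Stand-in for @set_tenant: outermost wrapper, so it runs first."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        calls.append("set_tenant")
        return func(*args, **kwargs)
    return wrapper


def handle_provider_deletion_sketch(func):
    """Stand-in for @handle_provider_deletion: runs inside the tenant wrapper."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        calls.append("handle_provider_deletion")
        return func(*args, **kwargs)
    return wrapper


@set_tenant_sketch
@handle_provider_deletion_sketch
def my_task(tenant_id, provider_id):
    calls.append("task body")
    return provider_id


result = my_task(tenant_id="t-1", provider_id="p-1")
```

After the call, `calls` lists the tenant wrapper, then the deletion handler, then the task body, which is the sequence the decision tree asks for.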
RLS Model Pattern
from uuid import uuid4
from django.db import models
from api.rls import RowLevelSecurityProtectedModel, RowLevelSecurityConstraint

class MyModel(RowLevelSecurityProtectedModel):
    # tenant FK inherited from parent
    id = models.UUIDField(primary_key=True, default=uuid4, editable=False)
    name = models.CharField(max_length=255)
    inserted_at = models.DateTimeField(auto_now_add=True, editable=False)
    updated_at = models.DateTimeField(auto_now=True, editable=False)

    class Meta(RowLevelSecurityProtectedModel.Meta):
        db_table = "my_models"
        constraints = [
            RowLevelSecurityConstraint(
                field="tenant_id",
                name="rls_on_%(class)s",
                statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
            ),
        ]

    class JSONAPIMeta:
        resource_name = "my-models"
M2M Relationships (MUST use through models)
class Resource(RowLevelSecurityProtectedModel):
    tags = models.ManyToManyField(
        ResourceTag,
        through="ResourceTagMapping",  # REQUIRED for RLS
    )

class ResourceTagMapping(RowLevelSecurityProtectedModel):
    # Through model MUST have tenant_id for RLS
    resource = models.ForeignKey(Resource, on_delete=models.CASCADE)
    tag = models.ForeignKey(ResourceTag, on_delete=models.CASCADE)

    class Meta:
        constraints = [
            RowLevelSecurityConstraint(
                field="tenant_id",
                name="rls_on_%(class)s",
                statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
            ),
        ]
Async Task Response Pattern (202 Accepted)
For long-running operations, return 202 with task reference:
@action(detail=True, methods=["post"], url_name="connection")
def connection(self, request, pk=None):
    with transaction.atomic():
        task = check_provider_connection_task.delay(
            provider_id=pk, tenant_id=self.request.tenant_id
        )
    prowler_task = Task.objects.get(id=task.id)
    serializer = TaskSerializer(prowler_task)
    return Response(
        data=serializer.data,
        status=status.HTTP_202_ACCEPTED,
        headers={"Content-Location": reverse("task-detail", kwargs={"pk": prowler_task.id})},
    )
Providers (11 Supported)
| Provider | UID Format | Example |
|---|---|---|
| AWS | 12 digits | 123456789012 |
| Azure | UUID v4 | a1b2c3d4-e5f6-... |
| GCP | 6-30 chars, lowercase, letter start | my-gcp-project |
| M365 | Valid domain | contoso.onmicrosoft.com |
| Kubernetes | 2-251 chars | arn:aws:eks:... |
| GitHub | 1-39 chars | my-org |
| IaC | Git URL | https://github.com/user/repo.git |
| Oracle Cloud | OCID format | ocid1.tenancy.oc1.. |
| MongoDB Atlas | 24-char hex | 507f1f77bcf86cd799439011 |
| Alibaba Cloud | 16 digits | 1234567890123456 |
Adding new provider: Add to ProviderChoices enum + create validate_<provider>_uid() staticmethod.
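A rough sketch of how a few of the UID formats in the table could be checked. The regular expressions and the provider keys are assumptions derived from the table's wording, not Prowler's actual validate_<provider>_uid() methods, which may be stricter.

```python
import re

# Assumed patterns, written from the "UID Format" column above.
UID_PATTERNS = {
    "aws": re.compile(r"[0-9]{12}"),               # 12 digits
    "gcp": re.compile(r"[a-z][a-z0-9-]{5,29}"),    # 6-30 chars, lowercase, letter start
    "mongodbatlas": re.compile(r"[0-9a-fA-F]{24}"),  # 24-char hex
    "alibabacloud": re.compile(r"[0-9]{16}"),      # 16 digits
}


def is_valid_uid(provider: str, uid: str) -> bool:
    """Return True when uid fully matches the assumed pattern for provider."""
    pattern = UID_PATTERNS.get(provider)
    if pattern is None:
        raise ValueError(f"no validator registered for {provider!r}")
    return pattern.fullmatch(uid) is not None


aws_ok = is_valid_uid("aws", "123456789012")
gcp_ok = is_valid_uid("gcp", "my-gcp-project")
```

Using `fullmatch` rather than `match` avoids accepting a valid prefix followed by trailing garbage.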
RBAC Permissions
| Permission | Controls |
|---|---|
| MANAGE_USERS | User CRUD, role assignments |
| MANAGE_ACCOUNT | Tenant settings |
| MANAGE_BILLING | Billing/subscription |
| MANAGE_PROVIDERS | Provider CRUD |
| MANAGE_INTEGRATIONS | Integration config |
| MANAGE_SCANS | Scan execution |
| UNLIMITED_VISIBILITY | See all providers (bypasses provider_groups) |
RBAC Visibility Pattern
def get_queryset(self):
    user_role = get_role(self.request.user)
    if user_role.unlimited_visibility:
        return Model.objects.filter(tenant_id=self.request.tenant_id)
    else:
        # Filter by provider_groups assigned to role
        return Model.objects.filter(provider__in=get_providers(user_role))
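The same branching can be exercised without a database. The sketch below is a simplified in-memory model: `RoleSketch`, `Row`, and `visible_rows` are hypothetical names, and the explicit tenant filter stands in for what RLS would enforce in the real query.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class RoleSketch:
    """Hypothetical role: either sees everything or only listed providers."""
    unlimited_visibility: bool = False
    provider_ids: Set[str] = field(default_factory=set)


@dataclass
class Row:
    tenant_id: str
    provider_id: str


def visible_rows(rows: List[Row], role: RoleSketch, tenant_id: str) -> List[Row]:
    # Tenant scoping always applies; in the real API, RLS enforces this part.
    in_tenant = [row for row in rows if row.tenant_id == tenant_id]
    if role.unlimited_visibility:
        return in_tenant
    # Limited roles only see rows for providers reachable through their groups.
    return [row for row in in_tenant if row.provider_id in role.provider_ids]


rows = [Row("t1", "p1"), Row("t1", "p2"), Row("t2", "p1")]
admin_view = visible_rows(rows, RoleSketch(unlimited_visibility=True), "t1")
limited_view = visible_rows(rows, RoleSketch(provider_ids={"p2"}), "t1")
```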
Celery Queues
| Queue | Purpose |
|---|---|
| scans | Prowler scan execution |
| overview | Dashboard aggregations (severity, attack surface) |
| compliance | Compliance report generation |
| integrations | External integrations (Jira, S3, Security Hub) |
| deletion | Provider/tenant deletion (async) |
| backfill | Historical data backfill operations |
| scan-reports | Output generation (CSV, JSON, HTML, PDF) |
Task Composition (Canvas)
Use Celery's Canvas primitives for complex workflows:
| Primitive | Use For |
|---|---|
| chain() | Sequential execution: A → B → C |
| group() | Parallel execution: A, B, C simultaneously |
| Combined | Chain with nested groups for complex workflows |
Note: Use .si() (immutable signature) so a task ignores the result of the previous task in the chain. Use .s() when the next task needs that result as its first argument.
Examples: See assets/celery_patterns.py for chain, group, and combined patterns.
Beat Scheduling (Periodic Tasks)
| Operation | Key Points |
|---|---|
| Create schedule | IntervalSchedule.objects.get_or_create(every=24, period=HOURS) |
| Create periodic task | Use task name (not function), kwargs=json.dumps(...) |
| Delete scheduled task | PeriodicTask.objects.filter(name=...).delete() |
| Avoid race conditions | Use countdown=5 to wait for DB commit |
Examples: See assets/celery_patterns.py for schedule_provider_scan pattern.
Advanced Task Patterns
@set_tenant Behavior
| Mode | tenant_id in kwargs | tenant_id passed to function |
|---|---|---|
| @set_tenant (default) | Popped (removed) | NO - function doesn't receive it |
| @set_tenant(keep_tenant=True) | Read but kept | YES - function receives it |
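The two modes in the table can be sketched with a small decorator that supports both bare and parameterized use. `set_tenant_sketch` is a hypothetical stand-in, not the code in api/decorators.py; the `active_tenants` list takes the place of the RLS context a real decorator would open.

```python
from functools import wraps

active_tenants = []  # stand-in for opening an RLS context per call


def set_tenant_sketch(func=None, *, keep_tenant=False):
    """Usable as @set_tenant_sketch or @set_tenant_sketch(keep_tenant=True)."""
    def decorator(inner):
        @wraps(inner)
        def wrapper(*args, **kwargs):
            if "tenant_id" not in kwargs:
                raise KeyError("tenant_id must be passed as a keyword argument")
            if keep_tenant:
                tenant_id = kwargs["tenant_id"]      # read, leave in kwargs
            else:
                tenant_id = kwargs.pop("tenant_id")  # remove before calling inner
            active_tenants.append(tenant_id)
            return inner(*args, **kwargs)
        return wrapper

    if func is None:  # called with arguments: return the real decorator
        return decorator
    return decorator(func)


@set_tenant_sketch
def without_tenant(provider_id):
    return provider_id


@set_tenant_sketch(keep_tenant=True)
def with_tenant(tenant_id, provider_id):
    return (tenant_id, provider_id)


popped = without_tenant(tenant_id="t1", provider_id="p1")
kept = with_tenant(tenant_id="t1", provider_id="p1")
```

In the default mode a function that declares a `tenant_id` parameter would fail with a missing-argument error, which is why the signature and the decorator mode have to agree.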
Key Patterns
| Pattern | Description |
|---|---|
| bind=True | Access self.request.id, self.request.retries |
| get_task_logger(__name__) | Proper logging in Celery tasks |
| SoftTimeLimitExceeded | Catch to save progress before hard kill |
| countdown=30 | Defer execution by N seconds |
| eta=datetime(...) | Execute at specific time |
Examples: See assets/celery_patterns.py for all advanced patterns.
Celery Configuration
| Setting | Value | Purpose |
|---|---|---|
| BROKER_VISIBILITY_TIMEOUT | 86400 (24h) | Prevent re-queue for long tasks |
| CELERY_RESULT_BACKEND | django-db | Store results in PostgreSQL |
| CELERY_TASK_TRACK_STARTED | True | Track when tasks start |
| soft_time_limit | Task-specific | Raises SoftTimeLimitExceeded |
| time_limit | Task-specific | Hard kill (SIGKILL) |
Full config: See assets/celery_patterns.py and actual files at config/celery.py, config/settings/celery.py.
UUIDv7 for Partitioned Tables
Finding and ResourceFindingMapping use UUIDv7 for time-based partitioning:
from uuid6 import uuid7
from api.uuid_utils import uuid7_start, uuid7_end, datetime_to_uuid7
# Partition-aware filtering
start = uuid7_start(datetime_to_uuid7(date_from))
end = uuid7_end(datetime_to_uuid7(date_to), settings.FINDINGS_TABLE_PARTITION_MONTHS)
queryset.filter(id__gte=start, id__lt=end)
Why UUIDv7? Time-ordered UUIDs enable PostgreSQL to prune partitions during range queries.
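To see why time-ordered IDs make range filters possible, the sketch below builds the smallest UUIDv7 for a given instant using the RFC 9562 layout (48-bit millisecond timestamp in the most significant bits). `uuid7_lower_bound` and `uuid7_timestamp_ms` are hypothetical helpers written for illustration; they are not the functions in api.uuid_utils.

```python
import uuid
from datetime import datetime, timezone


def uuid7_lower_bound(moment: datetime) -> uuid.UUID:
    """Smallest UUIDv7 whose embedded timestamp equals `moment` (illustrative)."""
    ms = int(moment.timestamp() * 1000)
    value = (ms & ((1 << 48) - 1)) << 80  # timestamp occupies the top 48 bits
    value |= 0x7 << 76                    # version nibble = 7
    value |= 0b10 << 62                   # RFC 9562 variant bits
    return uuid.UUID(int=value)           # remaining random bits left at zero


def uuid7_timestamp_ms(value: uuid.UUID) -> int:
    """Recover the millisecond timestamp from the top 48 bits."""
    return value.int >> 80


january = datetime(2024, 1, 1, tzinfo=timezone.utc)
february = datetime(2024, 2, 1, tzinfo=timezone.utc)
start = uuid7_lower_bound(january)
end = uuid7_lower_bound(february)
```

Because the timestamp sits in the most significant bits, comparing two such UUIDs compares their timestamps first, so an `id >= start AND id < end` filter maps onto a contiguous time window.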
Batch Operations with RLS
from api.db_utils import batch_delete, create_objects_in_batches, update_objects_in_batches
# Delete in batches (RLS-aware)
batch_delete(tenant_id, queryset, batch_size=1000)
# Bulk create with RLS
create_objects_in_batches(tenant_id, Finding, objects, batch_size=500)
# Bulk update with RLS
update_objects_in_batches(tenant_id, Finding, objects, fields=["status"], batch_size=500)
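The chunking idea behind these helpers can be sketched with the standard library alone. `batched` below is a hypothetical generator, not the code in api/db_utils.py, and it leaves out the RLS transaction each real batch would run inside.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 1200 objects with batch_size=500 produce two full batches and one partial one.
sizes = [len(batch) for batch in batched(range(1200), 500)]
```

Keeping each batch small bounds both memory use and the duration of any single transaction.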
Security Patterns
Full examples: See assets/security_patterns.py
Tenant Isolation Summary
| Pattern | Rule |
|---|---|
| RLS in ViewSets | Automatic via BaseRLSViewSet - tenant_id from JWT |
| RLS in Celery | MUST use @set_tenant + rls_transaction(tenant_id) |
| Cross-tenant validation | Defense-in-depth: verify obj.tenant_id == request.tenant_id |
| Never trust user input | Use request.tenant_id from JWT, never request.data.get("tenant_id") |
| Admin DB bypass | Only for cross-tenant admin ops - exposes ALL tenants' data |
Celery Task Security Summary
| Pattern | Rule |
|---|---|
| Named tasks only | NEVER use dynamic task names from user input |
| Validate arguments | Check UUID format before database queries |
| Safe queuing | Use transaction.on_commit() to enqueue AFTER commit |
| Modern retries | Use autoretry_for, retry_backoff, retry_jitter |
| Time limits | Set soft_time_limit and time_limit to prevent hung tasks |
| Idempotency | Use update_or_create or idempotency keys |
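For the "Modern retries" row, the delay schedule that exponential backoff produces can be approximated as below. This is a sketch of the general technique under stated assumptions (factor 1, doubling per retry, a cap, optional full jitter); `backoff_delay` is a hypothetical function, and Celery's internal calculation may differ in detail.

```python
import random


def backoff_delay(retries, factor=1, maximum=600, jitter=False, rng=random):
    """Seconds to wait before retry number `retries` (0-based), illustrative only."""
    delay = min(maximum, factor * (2 ** retries))  # exponential growth, capped
    if jitter:
        # Full jitter: pick uniformly in [0, delay] to spread out retry storms.
        delay = rng.randrange(delay + 1)
    return delay


# Without jitter, five retries wait 1, 2, 4, 8 and 16 seconds.
schedule = [backoff_delay(n) for n in range(5)]
capped = backoff_delay(20)  # far past the cap, so the maximum applies
```

Jitter matters when many tasks fail at the same moment, for example during a database outage, because it keeps them from retrying in lockstep.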
Quick Reference
# Safe task queuing - task only enqueued after transaction commits
with transaction.atomic():
    provider = Provider.objects.create(**data)
    transaction.on_commit(
        lambda: verify_provider_connection.delay(
            tenant_id=str(request.tenant_id),
            provider_id=str(provider.id),
        )
    )

# Modern retry pattern
@shared_task(
    base=RLSTask,
    bind=True,
    autoretry_for=(ConnectionError, TimeoutError, OperationalError),
    retry_backoff=True,
    retry_backoff_max=600,
    retry_jitter=True,
    max_retries=5,
    soft_time_limit=300,
    time_limit=360,
)
@set_tenant(keep_tenant=True)  # keep_tenant=True: the function takes tenant_id
def sync_provider_data(self, tenant_id, provider_id):
    with rls_transaction(tenant_id):
        # ... task logic
        pass

# Idempotent task - safe to retry
@shared_task(base=RLSTask, acks_late=True)
@set_tenant(keep_tenant=True)  # keep_tenant=True: the function takes tenant_id
def process_finding(tenant_id, finding_uid, data):
    with rls_transaction(tenant_id):
        Finding.objects.update_or_create(uid=finding_uid, defaults=data)
Production Deployment Checklist
Full settings: See references/production-settings.md
Run before every production deployment:
cd api && poetry run python src/backend/manage.py check --deploy
Critical Settings
| Setting | Production Value | Risk if Wrong |
|---|---|---|
| DEBUG | False | Exposes stack traces, settings, SQL queries |
| SECRET_KEY | Env var, rotated | Session hijacking, CSRF bypass |
| ALLOWED_HOSTS | Explicit list | Host header attacks |
| SECURE_SSL_REDIRECT | True | Credentials sent over HTTP |
| SESSION_COOKIE_SECURE | True | Session cookies over HTTP |
| CSRF_COOKIE_SECURE | True | CSRF tokens over HTTP |
| SECURE_HSTS_SECONDS | 31536000 (1 year) | Downgrade attacks |
| CONN_MAX_AGE | 60 or higher | Connection pool exhaustion |
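The table can also be turned into a quick self-check. The sketch below is a hypothetical helper that inspects a flat dictionary of settings; it complements `manage.py check --deploy` and does not replace it. Treating a missing DEBUG as unsafe and rejecting a wildcard host are conservative assumptions, SECRET_KEY handling is left out, and CONN_MAX_AGE is read as a plain integer even though Django configures it per database.

```python
from typing import Any, Dict, List

ONE_YEAR_SECONDS = 31536000


def find_insecure_settings(settings: Dict[str, Any]) -> List[str]:
    """Return names of settings that miss the production values in the table."""
    problems = []
    if settings.get("DEBUG", True):  # assumption: unset is treated as unsafe
        problems.append("DEBUG")
    hosts = settings.get("ALLOWED_HOSTS") or []
    if not hosts or "*" in hosts:    # a wildcard is not an explicit list
        problems.append("ALLOWED_HOSTS")
    for flag in ("SECURE_SSL_REDIRECT", "SESSION_COOKIE_SECURE", "CSRF_COOKIE_SECURE"):
        if settings.get(flag) is not True:
            problems.append(flag)
    if settings.get("SECURE_HSTS_SECONDS", 0) < ONE_YEAR_SECONDS:
        problems.append("SECURE_HSTS_SECONDS")
    if settings.get("CONN_MAX_AGE", 0) < 60:
        problems.append("CONN_MAX_AGE")
    return problems


hardened = {
    "DEBUG": False,
    "ALLOWED_HOSTS": ["api.example.com"],  # placeholder hostname
    "SECURE_SSL_REDIRECT": True,
    "SESSION_COOKIE_SECURE": True,
    "CSRF_COOKIE_SECURE": True,
    "SECURE_HSTS_SECONDS": ONE_YEAR_SECONDS,
    "CONN_MAX_AGE": 60,
}
issues = find_insecure_settings(hardened)
```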
Commands
# Development
cd api && poetry run python src/backend/manage.py runserver
cd api && poetry run python src/backend/manage.py shell
# Celery
cd api && poetry run celery -A config.celery worker -l info -Q scans,overview
cd api && poetry run celery -A config.celery beat -l info
# Testing
cd api && poetry run pytest -x --tb=short
# Production checks
cd api && poetry run python src/backend/manage.py check --deploy
Resources
Local References
- File Locations: See references/file-locations.md
- Modeling Decisions: See references/modeling-decisions.md
- Configuration: See references/configuration.md
- Production Settings: See references/production-settings.md
- Security Patterns: See assets/security_patterns.py
Related Skills
- Generic DRF Patterns: Use django-drf skill
- API Testing: Use prowler-test-api skill
Context7 MCP (Recommended)
Prerequisite: Install Context7 MCP server for up-to-date documentation lookup.
When implementing or debugging Prowler-specific patterns, query these libraries via mcp_context7_query-docs:
| Library | Context7 ID | Use For |
|---|---|---|
| Celery | /websites/celeryq_dev_en_stable | Task patterns, queues, error handling |
| django-celery-beat | /celery/django-celery-beat | Periodic task scheduling |
| Django | /websites/djangoproject_en_5_2 | Models, ORM, constraints, indexes |
Example queries:
mcp_context7_query-docs(libraryId="/websites/celeryq_dev_en_stable", query="shared_task decorator retry patterns")
mcp_context7_query-docs(libraryId="/celery/django-celery-beat", query="periodic task database scheduler")
mcp_context7_query-docs(libraryId="/websites/djangoproject_en_5_2", query="model constraints CheckConstraint UniqueConstraint")
Note: Use mcp_context7_resolve-library-id first if you need to find the correct library ID.