groq-enterprise-rbac

SKILL.md

Groq Enterprise RBAC

Overview

Manage access to Groq's ultra-fast LPU inference API through API key scoping and organization-level controls. Groq's per-token pricing is extremely low (orders of magnitude cheaper than GPU-based providers), but its speed makes runaway usage easy.

Prerequisites

  • Groq Cloud account at console.groq.com
  • Organization created with billing configured
  • At least one API key with organization admin scope

Instructions

Step 1: Create Rate-Limited API Keys per Team

set -euo pipefail
# Key for the chatbot team (high RPM, small model)
curl -X POST https://api.groq.com/openai/v1/api-keys \
  -H "Authorization: Bearer $GROQ_ADMIN_KEY" \
  -d '{
    "name": "chatbot-prod",
    "allowed_models": ["llama-3.3-70b-versatile", "llama-3.1-8b-instant"],
    "requests_per_minute": 500,  # HTTP 500 Internal Server Error
    "tokens_per_minute": 100000  # 100000 = configured value
  }'

# Key for batch processing (lower RPM but higher token limit)
curl -X POST https://api.groq.com/openai/v1/api-keys \
  -H "Authorization: Bearer $GROQ_ADMIN_KEY" \
  -d '{
    "name": "batch-processor",
    "allowed_models": ["llama-3.1-8b-instant"],
    "requests_per_minute": 60,
    "tokens_per_minute": 500000  # 500000 = configured value
  }'

Step 2: Implement Model-Level Access Control

// groq-gateway.ts - Enforce model restrictions before forwarding
const TEAM_MODEL_ACCESS: Record<string, string[]> = {
  chatbot:    ['llama-3.3-70b-versatile', 'llama-3.1-8b-instant'],
  analytics:  ['llama-3.1-8b-instant'],  // Cheapest model only
  research:   ['llama-3.3-70b-versatile', 'mixtral-8x7b-32768', 'gemma2-9b-it'],  # 32768 = configured value
};

function validateModelAccess(team: string, model: string): boolean {
  return TEAM_MODEL_ACCESS[team]?.includes(model) ?? false;
}

Step 3: Set Organization Spending Limits

In the Groq Console > Organization > Billing:

  • Set monthly spending cap (e.g., $500/month)
  • Configure alerts at $100, $250, $400 thresholds
  • Enable auto-pause when cap is reached (prevents surprise bills)

Step 4: Monitor Key Usage

set -euo pipefail
# Check usage across all API keys
curl https://api.groq.com/openai/v1/usage \
  -H "Authorization: Bearer $GROQ_ADMIN_KEY" | \
  jq '.usage_by_key[] | {key_name, requests_today, tokens_today, estimated_cost_usd}'

Step 5: Rotate Keys with Zero Downtime

set -euo pipefail
# 1. Create replacement key with same permissions
# 2. Deploy new key to services
# 3. Monitor for 24h to confirm no traffic on old key
# 4. Delete old key
curl -X DELETE "https://api.groq.com/openai/v1/api-keys/OLD_KEY_ID" \
  -H "Authorization: Bearer $GROQ_ADMIN_KEY"

Error Handling

Issue Cause Solution
429 rate_limit_exceeded RPM or TPM cap hit Groq rate limits are strict; add exponential backoff
401 invalid_api_key Key deleted or expired Generate new key in Groq Console
model_not_available Model not in key's allowed list Create key with broader model access
Spending cap paused API Monthly budget reached Increase cap or wait for billing cycle

Examples

Basic usage: Apply groq enterprise rbac to a standard project setup with default configuration options.

Advanced scenario: Customize groq enterprise rbac for production environments with multiple constraints and team-specific requirements.

Output

  • Configuration files or code changes applied to the project
  • Validation report confirming correct implementation
  • Summary of changes made and their rationale

Resources

  • Official Groq documentation
  • Community best practices and patterns
  • Related skills in this plugin pack
Weekly Installs
20
GitHub Stars
1.6K
First Seen
Jan 25, 2026
Installed on
codex19
cline18
gemini-cli18
antigravity18
cursor18
opencode18