Adobe Observability

Overview

Set up comprehensive observability for Adobe API integrations covering four pillars: metrics (Prometheus), traces (OpenTelemetry), logs (structured JSON), and alerts. Each Adobe API has different latency profiles requiring specific monitoring.

Prerequisites

Prometheus or compatible metrics backend
OpenTelemetry SDK (@opentelemetry/api)
Grafana or similar dashboarding tool
AlertManager or PagerDuty for alerts

Instructions

Step 1: Define Key Metrics by API

Metric	Type	Labels	Description
`adobe_ims_token_requests_total`	Counter	`status`	Token generation attempts
`adobe_api_requests_total`	Counter	`api,operation,status`	API calls by type
`adobe_api_duration_seconds`	Histogram	`api,operation`	Latency per operation
`adobe_api_errors_total`	Counter	`api,error_code`	Errors by code (401,403,429,500)
`adobe_job_poll_count`	Histogram	`api`	Polls before async job completes
`adobe_rate_limit_retries_total`	Counter	`api`	429 retries
`adobe_pdf_transactions_used`	Gauge	—	Monthly PDF Services usage

Step 2: Instrumented Adobe Client

import { Counter, Histogram, Gauge, Registry } from 'prom-client';

const registry = new Registry();

const apiRequests = new Counter({
  name: 'adobe_api_requests_total',
  help: 'Total Adobe API requests',
  labelNames: ['api', 'operation', 'status'] as const,
  registers: [registry],
});

const apiDuration = new Histogram({
  name: 'adobe_api_duration_seconds',
  help: 'Adobe API request duration in seconds',
  labelNames: ['api', 'operation'] as const,
  buckets: [0.5, 1, 2, 5, 10, 20, 30, 60], // Adobe APIs are slow
  registers: [registry],
});

const apiErrors = new Counter({
  name: 'adobe_api_errors_total',
  help: 'Adobe API errors by code',
  labelNames: ['api', 'error_code'] as const,
  registers: [registry],
});

export async function instrumentedAdobeCall<T>(
  api: string,
  operation: string,
  fn: () => Promise<T>
): Promise<T> {
  const timer = apiDuration.startTimer({ api, operation });
  try {
    const result = await fn();
    apiRequests.inc({ api, operation, status: 'success' });
    return result;
  } catch (error: any) {
    const errorCode = error.status || error.httpStatus || 'unknown';
    apiRequests.inc({ api, operation, status: 'error' });
    apiErrors.inc({ api, error_code: String(errorCode) });
    throw error;
  } finally {
    timer();
  }
}

// Usage
const image = await instrumentedAdobeCall('firefly', 'generate', () =>
  generateImage({ prompt: 'sunset landscape' })
);

Step 3: OpenTelemetry Distributed Tracing

import { trace, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('adobe-integration');

export async function tracedAdobeCall<T>(
  api: string,
  operation: string,
  fn: () => Promise<T>
): Promise<T> {
  return tracer.startActiveSpan(`adobe.${api}.${operation}`, async (span) => {
    span.setAttribute('adobe.api', api);
    span.setAttribute('adobe.operation', operation);
    span.setAttribute('adobe.client_id', process.env.ADOBE_CLIENT_ID!);

    try {
      const result = await fn();
      span.setStatus({ code: SpanStatusCode.OK });
      return result;
    } catch (error: any) {
      span.setStatus({ code: SpanStatusCode.ERROR, message: error.message });
      span.setAttribute('adobe.error_code', error.status || 'unknown');
      span.recordException(error);
      throw error;
    } finally {
      span.end();
    }
  });
}

Step 4: Structured Logging

import pino from 'pino';

const logger = pino({
  name: 'adobe',
  level: process.env.LOG_LEVEL || 'info',
  redact: ['clientSecret', 'accessToken', 'req.headers.authorization'],
});

export function logAdobeOperation(entry: {
  api: string;
  operation: string;
  durationMs: number;
  status: 'success' | 'error';
  httpStatus?: number;
  jobId?: string;
  error?: string;
}) {
  if (entry.status === 'error') {
    logger.error(entry, `Adobe ${entry.api}.${entry.operation} failed`);
  } else {
    logger.info(entry, `Adobe ${entry.api}.${entry.operation} completed`);
  }
}

Step 5: Alert Rules

# prometheus/adobe-alerts.yml
groups:
  - name: adobe_alerts
    rules:
      - alert: AdobeAuthFailure
        expr: increase(adobe_api_errors_total{error_code="401"}[5m]) > 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Adobe authentication failure — credentials may be expired or revoked"

      - alert: AdobeRateLimited
        expr: rate(adobe_api_errors_total{error_code="429"}[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Adobe API rate limited — reduce throughput or upgrade tier"

      - alert: AdobeHighLatency
        expr: |
          histogram_quantile(0.95,
            rate(adobe_api_duration_seconds_bucket{api="firefly"}[5m])
          ) > 30
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Adobe Firefly P95 latency > 30s"

      - alert: AdobeApiDown
        expr: |
          rate(adobe_api_errors_total{error_code=~"5.."}[5m]) /
          rate(adobe_api_requests_total[5m]) > 0.1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Adobe API server error rate > 10%"

      - alert: AdobePdfQuotaLow
        expr: adobe_pdf_transactions_used > 450
        labels:
          severity: warning
        annotations:
          summary: "PDF Services: < 50 free tier transactions remaining"

Metrics Endpoint

app.get('/metrics', async (req, res) => {
  res.set('Content-Type', registry.contentType);
  res.send(await registry.metrics());
});

Output

Prometheus metrics for all Adobe API calls (latency, errors, rate limits)
OpenTelemetry traces with Adobe-specific span attributes
Structured JSON logging with credential redaction
Alert rules for auth failures, rate limiting, latency, and quota

Error Handling

Issue	Cause	Solution
High cardinality metrics	Too many label values	Use fixed set of operation names
Alert storms	Thresholds too sensitive	Increase `for` duration
Missing traces	No OTel propagation	Verify context propagation setup
Redacted data in logs	Over-aggressive redaction	Whitelist safe fields

Resources

Next Steps

For incident response, see adobe-incident-runbook.

adobe-observability