Evernote Incident Runbook

Overview

Step-by-step procedures for responding to Evernote integration incidents, including API outages, rate limit escalations, authentication failures, data sync issues, and quota exhaustion.

Prerequisites

  • Access to monitoring dashboards and production logs
  • Production Evernote API credentials
  • Communication channels for escalation (Slack, PagerDuty)

Instructions

Incident Classification

| Severity | Symptoms | Response Time |
|---|---|---|
| P1 - Critical | All Evernote API calls failing, data loss risk | 15 minutes |
| P2 - High | Persistent rate limits, auth failures for multiple users | 1 hour |
| P3 - Medium | Intermittent errors, degraded sync performance | 4 hours |
| P4 - Low | Single user issues, non-critical feature affected | Next business day |

Step 1: Triage

Check Evernote's status page first. If Evernote is down, activate the circuit breaker and wait.

# Check Evernote service status
curl -sf https://status.evernote.com/api/v2/status.json | jq '.status'

# Check reachability of your NoteStore endpoint
# (s1 is an example shard; substitute the shard for your account)
curl -sf -H "Authorization: Bearer $EVERNOTE_TOKEN" \
  https://www.evernote.com/shard/s1/notestore | head -20

# Count EDAMSystemException entries in the app log
# (this counts the whole file; filter by timestamp to isolate the last 15 min)
grep -c 'EDAMSystemException' /var/log/evernote-app.log
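
The grep above counts every match in the file, not only recent ones. The following is a minimal sketch of a 15-minute window count. It assumes each log line starts with an ISO 8601 timestamp followed by a space, which is a hypothetical format; adjust the parsing to whatever your application actually writes. The function name and sample lines are illustrative only.

```python
from datetime import datetime, timedelta


def count_recent_errors(lines, now, window=timedelta(minutes=15),
                        marker="EDAMSystemException"):
    """Count lines containing `marker` whose timestamp lies within `window` of `now`.

    Assumes a leading ISO 8601 timestamp such as 2026-04-04T11:50:00
    (hypothetical log format, not taken from the Evernote SDK).
    """
    cutoff = now - window
    count = 0
    for line in lines:
        stamp, _, rest = line.partition(" ")
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # skip lines without a parseable timestamp, e.g. stack traces
        if cutoff <= when <= now and marker in rest:
            count += 1
    return count


sample = [
    "2026-04-04T11:40:00 ERROR EDAMSystemException RATE_LIMIT_REACHED",
    "2026-04-04T11:50:00 ERROR EDAMSystemException RATE_LIMIT_REACHED",
    "2026-04-04T11:55:00 INFO sync ok",
    "    at some stack frame",
]
recent = count_recent_errors(sample, now=datetime(2026, 4, 4, 12, 0, 0))
```

Passing `now` explicitly keeps the function easy to test; in production you would pass the current time and iterate over the open log file.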

Step 2: Rate Limit Escalation

If rate limits are persistent: reduce API call frequency, increase delays between batch operations, and contact Evernote developer support for a rate limit increase.
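
One way to reduce call frequency is to wait for the period the server asks for before retrying. The sketch below assumes the rate limit error exposes a wait time in seconds, as `EDAMSystemException.rateLimitDuration` does in the Evernote SDK. `RateLimitError` and `call_with_rate_limit_retry` are hypothetical stand-ins so the example runs without the SDK.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the SDK's rate limit exception."""

    def __init__(self, rate_limit_duration):
        super().__init__("rate limit reached")
        self.rate_limit_duration = rate_limit_duration  # seconds to wait


def call_with_rate_limit_retry(func, max_attempts=3, sleep=time.sleep):
    """Call func(); on RateLimitError, wait the advertised duration and retry.

    Re-raises the error once max_attempts calls have failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError as err:
            if attempt == max_attempts:
                raise
            sleep(err.rate_limit_duration)


# Demonstration with a fake call that is rate limited twice, then succeeds.
waits = []
calls = {"n": 0}


def flaky_call():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError(rate_limit_duration=2)
    return "ok"


result = call_with_rate_limit_retry(flaky_call, sleep=waits.append)
```

The `sleep` parameter is injected so the demonstration records the waits instead of pausing; real callers can leave the default.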

Step 3: Authentication Failure

For auth failures: verify tokens are not expired (edam_expires), check that production credentials match the production endpoint (sandbox: false), and test with a fresh Developer Token to isolate the issue.
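
The expiry check can be sketched as follows, assuming `edam_expires` is stored as milliseconds since the Unix epoch (verify this against how your application persists the value). The function name, the status labels, and the seven-day warning window are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def token_status(edam_expires_ms, now, warn_within=timedelta(days=7)):
    """Classify a token as 'expired', 'expiring_soon', or 'valid'.

    Assumes edam_expires_ms is epoch milliseconds (an assumption to verify).
    """
    expires_at = datetime.fromtimestamp(edam_expires_ms / 1000, tz=timezone.utc)
    if expires_at <= now:
        return "expired"
    if expires_at - now <= warn_within:
        return "expiring_soon"
    return "valid"


def to_ms(moment):
    """Convert an aware datetime to epoch milliseconds."""
    return int(moment.timestamp() * 1000)


now = datetime(2026, 4, 4, tzinfo=timezone.utc)
statuses = [
    token_status(to_ms(now - timedelta(days=1)), now),
    token_status(to_ms(now + timedelta(days=3)), now),
    token_status(to_ms(now + timedelta(days=90)), now),
]
```

Running this over the stored tokens during a mass auth failure shows quickly whether expiry explains the incident or whether the cause lies elsewhere, such as mismatched credentials.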

Step 4: Sync Failure Recovery

For sync issues: compare the local USN with the server USN returned by getSyncState(). If the gap is too large, reset to a full sync from USN 0. Verify data integrity after the re-sync.
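
The decision above can be sketched as a small function. It assumes the server USN is the `updateCount` from the sync state; the `max_gap` threshold of 5000 and the mode names are hypothetical and should be tuned to your account sizes.

```python
def choose_sync_mode(local_usn, server_usn, max_gap=5000):
    """Pick a sync strategy from the local and server update sequence numbers.

    max_gap is a hypothetical threshold for "gap is too large".
    """
    if local_usn > server_usn:
        return "full"  # local state is ahead of the server: inconsistent, start over
    if local_usn == server_usn:
        return "none"  # already up to date
    if local_usn == 0 or server_usn - local_usn > max_gap:
        return "full"  # never synced, or too far behind for an incremental sync
    return "incremental"


modes = [
    choose_sync_mode(120, 120),
    choose_sync_mode(120, 180),
    choose_sync_mode(120, 9000),
    choose_sync_mode(0, 50),
    choose_sync_mode(200, 150),
]
```

Treating a local USN that exceeds the server's as a full sync case is a defensive choice: it usually indicates corrupted local state rather than a recoverable gap.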

Step 5: Mitigation Strategies

  • Circuit breaker: Disable Evernote API calls after N consecutive failures. Retry after cooldown period.
  • Graceful degradation: Serve cached data when API is unavailable. Queue writes for retry.
  • Failover: Switch to polling-based sync if webhooks stop arriving.
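
A minimal sketch of the circuit breaker described above, with assumed defaults of 5 consecutive failures and a 60-second cooldown. The class and method names are hypothetical, and this is not a drop-in for any particular library. After the cooldown the breaker lets a trial request through; a success closes it, and a failure restarts the cooldown.

```python
import time


class CircuitBreaker:
    """Block calls after N consecutive failures; allow a trial call after a cooldown."""

    def __init__(self, failure_threshold=5, cooldown_seconds=60.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def allow_request(self):
        if self._opened_at is None:
            return True
        # Open circuit: permit a trial request only once the cooldown has elapsed.
        return self._clock() - self._opened_at >= self.cooldown_seconds

    def record_success(self):
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()  # open, or restart the cooldown


# Demonstration with a fake clock so no real waiting is needed.
fake_now = [0.0]
breaker = CircuitBreaker(failure_threshold=3, cooldown_seconds=30,
                         clock=lambda: fake_now[0])
for _ in range(3):
    breaker.record_failure()
blocked = breaker.allow_request()
fake_now[0] = 31.0
trial_allowed = breaker.allow_request()
breaker.record_success()
closed_again = breaker.allow_request()
```

Callers would check `allow_request()` before each Evernote call, serve cached data when it returns `False`, and report the outcome with `record_success()` or `record_failure()`.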

Post-Incident

  • Document root cause and timeline
  • Update runbook with new failure modes discovered
  • Adjust alert thresholds if the incident exposed a false positive or a missed detection
  • Review and improve circuit breaker settings

For the complete diagnostic scripts, mitigation implementations, and communication templates, see Implementation Guide.

Output

  • Incident severity classification table
  • Triage diagnostic commands for quick assessment
  • Rate limit, auth, and sync failure response procedures
  • Circuit breaker and graceful degradation patterns
  • Post-incident review checklist

Error Handling

| Incident Type | Diagnostic | Mitigation |
|---|---|---|
| API outage | Check status.evernote.com | Activate circuit breaker, serve cached data |
| Rate limit storm | Check evernote_rate_limits_total metric | Reduce batch sizes, increase delays |
| Mass auth failure | Verify token expiration dates in DB | Trigger re-auth flow for affected users |
| Sync data loss | Compare local vs server note counts | Full re-sync from USN 0 |

Next Steps

For data handling best practices, see evernote-data-handling.

Examples

API outage response: Alert fires, on-call checks status page, confirms Evernote outage, activates circuit breaker, posts status update to internal Slack, monitors for recovery, then gradually re-enables API calls.

Rate limit recovery: Persistent RATE_LIMIT_REACHED errors detected. Reduce batch size from 100 to 10, increase delay to 500ms, clear the request queue, and contact Evernote support if limits continue after 1 hour.
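
The reduced batch size and added delay from the rate limit example can be sketched like this. `send_in_batches` and its `send_batch` callback are hypothetical names; the defaults mirror the values above (batches of 10, 500 ms between batches).

```python
import time


def send_in_batches(items, send_batch, batch_size=10, delay_seconds=0.5,
                    sleep=time.sleep):
    """Send items in fixed-size batches, pausing between consecutive batches.

    Returns the number of batches sent.
    """
    batches_sent = 0
    for start in range(0, len(items), batch_size):
        if batches_sent:
            sleep(delay_seconds)  # pause before every batch except the first
        send_batch(items[start:start + batch_size])
        batches_sent += 1
    return batches_sent


# Demonstration: 25 items become batches of 10, 10, and 5 with two pauses.
sent = []
pauses = []
total = send_in_batches(list(range(25)), sent.append, sleep=pauses.append)
```

Keeping the batch size and delay as parameters lets on-call engineers tighten or relax them during an incident without a code change, for example by reading them from configuration.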
