incident-response
Installation
SKILL.md
Incident Response
Handle security incidents effectively with structured response procedures.
When to Use This Skill
Use this skill when:
- Responding to an active security incident (breach, malware, unauthorized access)
- Building incident response playbooks and runbooks
- Conducting IR tabletop exercises and drills
- Setting up evidence collection and forensic capabilities
- Establishing communication protocols for security events
- Performing post-incident reviews and process improvements
Prerequisites
- IR team roster with on-call rotation and escalation paths
- Secure communication channel (separate from production systems)
- Forensic workstation with analysis tools installed
- Evidence storage with chain-of-custody controls
- Legal counsel contact information
- Pre-authorized incident response actions documented
Incident Response Phases
phases:
1_preparation:
- IR team roster and 24/7 contact info
- Tools and privileged access ready
- Playbooks documented and tested
- Evidence collection kit prepared
- Communication templates drafted
2_detection:
- Alert triage and validation
- Initial assessment and scoping
- Severity classification
- Incident ticket creation
3_containment:
- Short-term containment (stop bleeding)
- Evidence preservation (before changes)
- System isolation (network/host level)
- Credential rotation if needed
4_eradication:
- Root cause analysis
- Remove threat actor access
- Patch exploited vulnerabilities
- Clean compromised systems
5_recovery:
- System restoration from clean backups
- Enhanced monitoring deployment
- Phased return to production
- Business continuity verification
6_lessons_learned:
- Post-incident review (within 72 hours)
- Timeline reconstruction
- Documentation update
- Process and detection improvements
Severity Classification
| Level | Impact | Response Time | Examples |
|---|---|---|---|
| Critical (P1) | Active data breach, full outage, ransomware | Immediate (< 15 min) | Data exfiltration in progress, ransomware spreading |
| High (P2) | Service degraded, potential breach | < 1 hour | Unauthorized admin access, malware detected |
| Medium (P3) | Limited impact, contained | < 4 hours | Phishing compromise (single user), policy violation |
| Low (P4) | Minimal impact | Next business day | Failed brute force, blocked scanning activity |
Evidence Collection Scripts
Linux Evidence Collection
#!/bin/bash
# linux-evidence-collect.sh - Collect forensic evidence from a Linux host
# Run with sudo. Preserves evidence with timestamps and hashes.
set -euo pipefail
EVIDENCE_DIR="/evidence/$(hostname)-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$EVIDENCE_DIR"
LOGFILE="$EVIDENCE_DIR/collection.log"
log() { echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] $*" | tee -a "$LOGFILE"; }
log "Starting evidence collection on $(hostname)"
log "Collector: $(whoami)"
log "System time: $(date -u)"
# System information
log "Collecting system information..."
uname -a > "$EVIDENCE_DIR/uname.txt"
cat /etc/os-release > "$EVIDENCE_DIR/os-release.txt"
uptime > "$EVIDENCE_DIR/uptime.txt"
date -u > "$EVIDENCE_DIR/system-time.txt"
# Running processes (full command line)
log "Collecting process list..."
ps auxwwf > "$EVIDENCE_DIR/processes.txt"
ps -eo pid,ppid,user,args --sort=-pcpu > "$EVIDENCE_DIR/processes-by-cpu.txt"
# Network connections
log "Collecting network state..."
ss -tulnp > "$EVIDENCE_DIR/listening-ports.txt"
ss -anp > "$EVIDENCE_DIR/all-connections.txt"
ip addr show > "$EVIDENCE_DIR/ip-addresses.txt"
ip route show > "$EVIDENCE_DIR/routes.txt"
iptables -L -n -v > "$EVIDENCE_DIR/iptables.txt" 2>&1 || true
cat /etc/resolv.conf > "$EVIDENCE_DIR/dns-config.txt"
# User activity
log "Collecting user activity..."
last -a > "$EVIDENCE_DIR/login-history.txt"
lastb > "$EVIDENCE_DIR/failed-logins.txt" 2>&1 || true
who > "$EVIDENCE_DIR/currently-logged-in.txt"
w > "$EVIDENCE_DIR/user-activity.txt"
cat /etc/passwd > "$EVIDENCE_DIR/passwd.txt"
cat /etc/shadow > "$EVIDENCE_DIR/shadow.txt" 2>/dev/null || true
cat /etc/group > "$EVIDENCE_DIR/group.txt"
# Scheduled tasks
log "Collecting scheduled tasks..."
for user in $(cut -f1 -d: /etc/passwd); do
crontab -u "$user" -l 2>/dev/null >> "$EVIDENCE_DIR/crontabs.txt" && \
echo "--- $user ---" >> "$EVIDENCE_DIR/crontabs.txt"
done
ls -la /etc/cron.* > "$EVIDENCE_DIR/cron-dirs.txt" 2>&1
# File system state
log "Collecting filesystem state..."
find /tmp /var/tmp /dev/shm -type f -ls > "$EVIDENCE_DIR/temp-files.txt" 2>/dev/null
find / -name "*.sh" -mtime -7 -ls > "$EVIDENCE_DIR/recent-scripts.txt" 2>/dev/null
find / -perm -4000 -type f -ls > "$EVIDENCE_DIR/suid-files.txt" 2>/dev/null
find /home -name ".*history" -ls > "$EVIDENCE_DIR/history-files.txt" 2>/dev/null
# Loaded kernel modules
log "Collecting kernel modules..."
lsmod > "$EVIDENCE_DIR/kernel-modules.txt"
# Open files
log "Collecting open files..."
lsof -n > "$EVIDENCE_DIR/open-files.txt" 2>/dev/null
# Systemd services
log "Collecting service state..."
systemctl list-units --type=service --all > "$EVIDENCE_DIR/services.txt"
systemctl list-timers --all > "$EVIDENCE_DIR/timers.txt"
# Log preservation
log "Preserving system logs..."
tar czf "$EVIDENCE_DIR/var-log.tar.gz" /var/log/ 2>/dev/null
# Docker containers (if present)
if command -v docker &>/dev/null; then
log "Collecting Docker state..."
docker ps -a > "$EVIDENCE_DIR/docker-containers.txt"
docker images > "$EVIDENCE_DIR/docker-images.txt"
docker network ls > "$EVIDENCE_DIR/docker-networks.txt"
fi
# Kubernetes (if kubectl available)
if command -v kubectl &>/dev/null; then
log "Collecting Kubernetes state..."
kubectl get pods --all-namespaces > "$EVIDENCE_DIR/k8s-pods.txt" 2>/dev/null
kubectl get events --all-namespaces --sort-by=.lastTimestamp > "$EVIDENCE_DIR/k8s-events.txt" 2>/dev/null
fi
# Hash all evidence files
log "Computing evidence hashes..."
find "$EVIDENCE_DIR" -type f ! -name "checksums.sha256" -exec sha256sum {} \; > "$EVIDENCE_DIR/checksums.sha256"
log "Evidence collection complete: $EVIDENCE_DIR"
echo "Total files collected: $(find "$EVIDENCE_DIR" -type f | wc -l)"
Memory Acquisition
#!/bin/bash
# memory-capture.sh - Capture volatile memory for forensic analysis
EVIDENCE_DIR="/evidence/memory-$(hostname)-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$EVIDENCE_DIR"
# Using LiME (Linux Memory Extractor)
if [ -f /lib/modules/$(uname -r)/extra/lime.ko ]; then
insmod /lib/modules/$(uname -r)/extra/lime.ko "path=$EVIDENCE_DIR/memory.lime format=lime"
echo "Memory captured with LiME"
fi
# Alternative: /proc/kcore (partial, but always available)
cp /proc/kcore "$EVIDENCE_DIR/kcore" 2>/dev/null
# Capture /proc/meminfo for context
cat /proc/meminfo > "$EVIDENCE_DIR/meminfo.txt"
# Hash the memory dump
sha256sum "$EVIDENCE_DIR"/* > "$EVIDENCE_DIR/checksums.sha256"
AWS Evidence Collection
#!/bin/bash
# aws-evidence-collect.sh - Collect evidence from compromised AWS resources
INCIDENT_ID="${1:?Usage: $0 <incident-id>}"
INSTANCE_ID="${2:?Usage: $0 <incident-id> <instance-id>}"
EVIDENCE_BUCKET="s3://incident-evidence-${AWS_ACCOUNT_ID}"
EVIDENCE_PREFIX="${INCIDENT_ID}/$(date +%Y%m%d-%H%M%S)"
echo "=== AWS Evidence Collection ==="
echo "Incident: $INCIDENT_ID"
echo "Instance: $INSTANCE_ID"
# Snapshot EBS volumes
echo "Creating EBS snapshots..."
VOLUMES=$(aws ec2 describe-volumes \
--filters "Name=attachment.instance-id,Values=${INSTANCE_ID}" \
--query 'Volumes[].VolumeId' --output text)
for vol in $VOLUMES; do
SNAP_ID=$(aws ec2 create-snapshot \
--volume-id "$vol" \
--description "IR Evidence - ${INCIDENT_ID} - ${vol}" \
--tag-specifications "ResourceType=snapshot,Tags=[{Key=IncidentId,Value=${INCIDENT_ID}},{Key=Purpose,Value=forensic-evidence}]" \
--query 'SnapshotId' --output text)
echo " Snapshot created: $SNAP_ID for volume $vol"
done
# Capture instance metadata
echo "Capturing instance metadata..."
aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
> "/tmp/${INCIDENT_ID}-instance-describe.json"
aws s3 cp "/tmp/${INCIDENT_ID}-instance-describe.json" \
"${EVIDENCE_BUCKET}/${EVIDENCE_PREFIX}/instance-describe.json"
# Capture security group rules
SG_IDS=$(aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
--query 'Reservations[].Instances[].SecurityGroups[].GroupId' --output text)
for sg in $SG_IDS; do
aws ec2 describe-security-group-rules --filters "Name=group-id,Values=${sg}" \
> "/tmp/${INCIDENT_ID}-sg-${sg}.json"
aws s3 cp "/tmp/${INCIDENT_ID}-sg-${sg}.json" \
"${EVIDENCE_BUCKET}/${EVIDENCE_PREFIX}/sg-${sg}.json"
done
# Collect CloudTrail events for the instance
echo "Collecting CloudTrail events..."
aws cloudtrail lookup-events \
--lookup-attributes "AttributeKey=ResourceName,AttributeValue=${INSTANCE_ID}" \
--start-time "$(date -d '7 days ago' -u +%Y-%m-%dT%H:%M:%SZ)" \
> "/tmp/${INCIDENT_ID}-cloudtrail.json"
aws s3 cp "/tmp/${INCIDENT_ID}-cloudtrail.json" \
"${EVIDENCE_BUCKET}/${EVIDENCE_PREFIX}/cloudtrail.json"
# Collect VPC flow logs
echo "Collecting VPC flow logs..."
ENI_ID=$(aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
--query 'Reservations[].Instances[].NetworkInterfaces[0].NetworkInterfaceId' --output text)
aws ec2 describe-flow-logs --filter "Name=resource-id,Values=${ENI_ID}" \
> "/tmp/${INCIDENT_ID}-flow-logs.json"
aws s3 cp "/tmp/${INCIDENT_ID}-flow-logs.json" \
"${EVIDENCE_BUCKET}/${EVIDENCE_PREFIX}/flow-logs-config.json"
# Isolate the instance (move to quarantine security group)
echo "Isolating instance..."
QUARANTINE_SG=$(aws ec2 create-security-group \
--group-name "quarantine-${INCIDENT_ID}" \
--description "Quarantine SG for incident ${INCIDENT_ID}" \
--vpc-id "$(aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
--query 'Reservations[].Instances[].VpcId' --output text)" \
--query 'GroupId' --output text)
# Quarantine SG: deny all inbound, allow outbound only to evidence bucket
aws ec2 modify-instance-attribute \
--instance-id "$INSTANCE_ID" \
--groups "$QUARANTINE_SG"
echo "Instance isolated with quarantine SG: $QUARANTINE_SG"
echo "Evidence stored at: ${EVIDENCE_BUCKET}/${EVIDENCE_PREFIX}/"
Forensics Commands Reference
# --- Disk forensics ---
# Create forensic image of a disk
dd if=/dev/sda of=/evidence/disk.img bs=4M status=progress
sha256sum /evidence/disk.img > /evidence/disk.img.sha256
# Mount forensic image read-only
mount -o ro,loop,noexec /evidence/disk.img /mnt/forensic
# Find recently modified files
find /mnt/forensic -type f -mtime -3 -ls | sort -k11
# Find files by owner
find /mnt/forensic -user www-data -type f -newer /tmp/reference-time -ls
# --- Log analysis ---
# Search auth logs for brute force
grep "Failed password" /var/log/auth.log | awk '{print $11}' | sort | uniq -c | sort -rn | head -20
# Search for privilege escalation
grep -E "(sudo|su\[)" /var/log/auth.log | grep -v "session opened"
# Search web logs for attack patterns
grep -iE "(union.*select|<script|\.\.\/|%00)" /var/log/nginx/access.log
# Timeline analysis with find
find / -newermt "2025-01-15 00:00" ! -newermt "2025-01-16 00:00" -ls 2>/dev/null | sort -k9
# --- Network forensics ---
# Capture network traffic
tcpdump -i eth0 -w /evidence/capture.pcap -c 100000
# Analyze pcap for suspicious connections
tcpdump -r /evidence/capture.pcap -nn 'dst port 4444 or dst port 8888 or dst port 1337'
# Check for DNS tunneling
tcpdump -r /evidence/capture.pcap -nn 'udp port 53' | awk '{print $NF}' | sort | uniq -c | sort -rn | head -20
# --- Malware analysis ---
# Check file for known malware hashes
sha256sum suspicious_file
# Compare against VirusTotal: https://www.virustotal.com
# Strings analysis
strings suspicious_file | grep -iE "(http|ftp|ssh|password|key|token)"
# Check for packed/obfuscated binaries
file suspicious_file
readelf -h suspicious_file 2>/dev/null
Communication Templates
Initial Notification (Internal)
## Security Incident Notification
**Incident ID:** INC-YYYY-NNNN
**Severity:** [Critical/High/Medium/Low]
**Status:** Active - Investigating
**Time Detected:** YYYY-MM-DD HH:MM UTC
**Reported By:** [Name/System]
### Summary
[1-2 sentence description of what was detected]
### Impact Assessment
- **Systems affected:** [list]
- **Data at risk:** [type and scope]
- **Users impacted:** [count/scope]
- **Business impact:** [description]
### Current Actions
- [ ] Evidence preservation in progress
- [ ] Containment measures being applied
- [ ] IR team assembled
### Next Update
Expected at: YYYY-MM-DD HH:MM UTC
### Incident Commander
[Name] - [Contact info]
Stakeholder Update
## Incident Update - INC-YYYY-NNNN
**Update #:** N
**Time:** YYYY-MM-DD HH:MM UTC
**Severity:** [unchanged/upgraded/downgraded]
**Status:** [Investigating/Contained/Eradicating/Recovering/Resolved]
### Progress Since Last Update
- [Bullet points of actions taken]
### Current Understanding
- **Root cause:** [Known/Under investigation]
- **Scope:** [Expanded/Unchanged/Reduced]
- **Threat actor:** [If applicable]
### Active Containment Measures
- [List of measures in place]
### Next Steps
- [Planned actions with ETA]
### Decisions Needed
- [If any decisions required from leadership]
External Breach Notification (if required)
## Notice of Data Security Incident
Dear [Customer/Partner],
We are writing to inform you of a security incident that we detected on
[date]. Upon discovery, we immediately activated our incident response
procedures and engaged external cybersecurity experts.
### What Happened
[Brief, factual description]
### What Information Was Involved
[Types of data affected]
### What We Are Doing
[Remediation steps taken and planned]
### What You Can Do
[Recommended actions for affected parties]
### Contact Information
For questions, please contact: [dedicated contact/hotline]
[Company Name]
[Date]
IR Playbook: Compromised Credentials
playbook: compromised-credentials
trigger: "Alert indicating credential theft, brute force success, or credential dump"
steps:
1_validate:
- Confirm the alert is not a false positive
- Identify which credentials are compromised
- Determine scope (single user, service account, API key)
2_contain:
- Disable compromised accounts immediately
- Revoke active sessions and tokens
- Rotate API keys and service account credentials
- Block source IP if identified
commands:
- "aws iam update-login-profile --user-name USER --password-reset-required"
- "aws iam delete-access-key --user-name USER --access-key-id AKIAXXXX"
- "aws iam deactivate-mfa-device --user-name USER --serial-number ARN"
- "kubectl delete secret compromised-secret -n NAMESPACE"
3_investigate:
- Review CloudTrail/audit logs for the compromised identity
- Identify all actions taken with compromised credentials
- Check for persistence (new keys, roles, backdoors)
- Determine initial compromise vector (phishing, leak, breach)
4_eradicate:
- Remove any backdoors or persistence mechanisms
- Rotate all credentials that may have been exposed
- Update access policies to enforce MFA
- Patch credential storage if vault/secret manager was compromised
5_recover:
- Issue new credentials with MFA enforced
- Restore access with least-privilege review
- Monitor new credentials for abnormal usage
6_improve:
- Add detection for initial compromise vector
- Review credential management policies
- Update security awareness training if phishing was involved
Troubleshooting
| Problem | Cause | Solution |
|---|---|---|
| Evidence collection script fails | Insufficient permissions | Run with sudo/root; pre-authorize IR accounts |
| Cannot access compromised system | System encrypted by ransomware | Use offline disk imaging; restore from backups |
| Logs are missing or tampered | Attacker cleared logs | Check centralized log aggregator; restore from log backups |
| Cannot determine incident scope | Insufficient logging | Enable CloudTrail, VPC flow logs, audit logging for future |
| Stakeholders demanding immediate answers | Pressure to resolve quickly | Follow IR process; provide regular updates; avoid speculation |
| False positive triggered full IR | Detection rules too sensitive | Tune alerting thresholds; add validation step before escalation |
| Evidence integrity questioned | No chain of custody | Hash all evidence immediately; document who accessed what and when |
Best Practices
- Pre-define and practice playbooks with tabletop exercises quarterly
- Maintain separate, secure communication channels for IR (not email or Slack on corporate infra)
- Always preserve evidence before making changes to compromised systems
- Establish chain of custody for all collected evidence
- Engage legal counsel early in any potential data breach
- Conduct blameless post-incident reviews within 72 hours
- Update detection rules and playbooks based on lessons learned
- Pre-authorize common IR actions so responders can act without delay
- Keep an IR "go bag" with tools, credentials, and documentation ready
- Test backup restoration procedures regularly (not just backup creation)
Related Skills
- audit-logging - Log analysis
- alerting-oncall - Alert management
- security-automation - Automated response workflows
- threat-modeling - Proactive threat identification
Weekly Installs
35
Repository
bagelhole/devop…t-skillsGitHub Stars
18
First Seen
4 days ago
Security Audits