Performing Steganography Detection

When to Use

  • When suspecting covert data hiding in images, audio, or video files
  • During investigations involving suspected data exfiltration via media files
  • For analyzing files in espionage or insider threat investigations
  • When standard file analysis reveals anomalies in media file properties
  • For detecting communication channels using steganographic techniques

Prerequisites

  • StegDetect, zsteg, StegSolve, stegoveritas, binwalk for analysis
  • steghide, OpenStego for extraction attempts
  • ExifTool and ImageMagick (identify) for metadata analysis
  • Python 3 with Pillow, NumPy, and SciPy for custom analysis
  • Understanding of common steganographic techniques (LSB, DCT, spread spectrum)
  • Sample files for comparison and statistical analysis

Workflow

Step 1: Initial File Assessment and Metadata Analysis

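# Install steganography detection tools (Debian/Ubuntu package names)
sudo apt-get install steghide stegsnow binwalk libimage-exiftool-perl imagemagick ruby
pip install stegoveritas pillow numpy scipy
gem install zsteg  # zsteg is a Ruby gem (PNG/BMP analysis), not a pip package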

# Examine file metadata for anomalies
exiftool /cases/case-2024-001/media/suspect_image.jpg | tee /cases/case-2024-001/analysis/metadata.txt

# Check for unusual file size (larger than expected for resolution/format)
identify -verbose /cases/case-2024-001/media/suspect_image.jpg | head -30

# Verify file type matches extension
file /cases/case-2024-001/media/suspect_image.jpg
# Confirm JPEG signature vs actual content

# Check for appended data after file footer
python3 << 'PYEOF'
import os

filepath = '/cases/case-2024-001/media/suspect_image.jpg'
filesize = os.path.getsize(filepath)

with open(filepath, 'rb') as f:
    data = f.read()

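# JPEG files end with the EOI marker FF D9. rfind() locates the last
# occurrence, so an appended payload that itself contains FF D9 can mask
# part of the trailing data; the binwalk scan in Step 2 cross-checks this.
jpeg_end = data.rfind(b'\xff\xd9')
if jpeg_end == -1:
    print("No JPEG end marker found (file may be truncated or not a JPEG)")
else:
    trailing_bytes = filesize - jpeg_end - 2
    if trailing_bytes > 0:
        print(f"WARNING: {trailing_bytes} bytes of data after JPEG end marker!")
        print(f"  File size: {filesize} bytes")
        print(f"  JPEG data: {jpeg_end + 2} bytes")
        print(f"  Trailing data: {trailing_bytes} bytes")
        # Extract trailing data for separate analysis
        with open('/cases/case-2024-001/analysis/trailing_data.bin', 'wb') as out:
            out.write(data[jpeg_end + 2:])
    else:
        print("No trailing data detected after JPEG end marker")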

# Check for embedded ZIP/RAR archives
zip_offset = data.find(b'PK\x03\x04')
rar_offset = data.find(b'Rar!\x1a\x07')
if zip_offset > 0:
    print(f"ZIP archive found at offset {zip_offset}")
if rar_offset > 0:
    print(f"RAR archive found at offset {rar_offset}")
PYEOF
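
The trailing-data check above is specific to JPEG. For PNG files, where the equivalent end marker is the IEND chunk, a similar check can walk the chunk list instead of searching for a byte pattern. The following is a minimal sketch, not part of the original workflow: the function name `png_trailing_bytes` is hypothetical, and chunk CRCs are deliberately not validated.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_trailing_bytes(data: bytes) -> int:
    """Return the number of bytes after the IEND chunk.

    Returns -1 if the data is not a PNG or ends before an IEND chunk.
    Hypothetical helper for illustration; CRCs are not checked.
    """
    if not data.startswith(PNG_SIGNATURE):
        return -1
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        offset += 8 + length + 4
        if offset > len(data):
            return -1  # chunk claims more data than the file contains
        if chunk_type == b"IEND":
            return len(data) - offset
    return -1
```

Reading the file with `open(path, 'rb').read()` and passing the bytes to this function yields 0 for a PNG that ends cleanly at IEND; any positive value is the size of the appended region and is worth extracting for inspection.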

Step 2: Run Automated Steganalysis Tools

# Use binwalk to detect embedded files and data
binwalk /cases/case-2024-001/media/suspect_image.jpg | tee /cases/case-2024-001/analysis/binwalk_scan.txt

# Extract embedded files
binwalk --extract --directory /cases/case-2024-001/analysis/binwalk_extracted/ \
   /cases/case-2024-001/media/suspect_image.jpg

# Use zsteg for PNG and BMP analysis (LSB detection)
zsteg /cases/case-2024-001/media/suspect_image.png | tee /cases/case-2024-001/analysis/zsteg_results.txt

# zsteg with all checks
zsteg -a /cases/case-2024-001/media/suspect_image.png

# Use stegoveritas for comprehensive analysis
stegoveritas /cases/case-2024-001/media/suspect_image.jpg \
   -out /cases/case-2024-001/analysis/stegoveritas/

# Stegoveritas performs:
# - Metadata extraction
# - LSB analysis (multiple bit planes)
# - Color map analysis
# - Trailing data detection
# - Embedded file extraction
# - Image transformation analysis

# Use steghide for JPEG/BMP/WAV/AU extraction attempts
# Try with empty password
steghide extract -sf /cases/case-2024-001/media/suspect_image.jpg -p "" \
   -xf /cases/case-2024-001/analysis/steghide_extract.bin 2>&1

# Try with common passwords
for pwd in password secret hidden stego test 123456 admin; do
    result=$(steghide extract -sf /cases/case-2024-001/media/suspect_image.jpg \
       -p "$pwd" -xf "/cases/case-2024-001/analysis/steghide_$pwd.bin" 2>&1)
    if echo "$result" | grep -q "extracted"; then
        echo "SUCCESS with password: $pwd"
    fi
done

Step 3: Perform LSB (Least Significant Bit) Analysis

# Custom LSB analysis with Python
python3 << 'PYEOF'
from PIL import Image
import numpy as np

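img = Image.open('/cases/case-2024-001/media/suspect_image.png')
# Normalize to 8-bit RGB so palette, grayscale, and RGBA images all yield an
# (H, W, 3) array. This drops the alpha channel and palette indices, which
# can themselves carry LSB data; zsteg covers those cases.
pixels = np.array(img.convert('RGB'))

# Extract LSB from each color channel
for channel, name in enumerate(['Red', 'Green', 'Blue']):
    lsb_data = pixels[:, :, channel] & 1

    # Count distribution. An encrypted or compressed payload pushes the split
    # toward 50/50, but many clean images are also close to 50/50, so the
    # ratio alone is not conclusive.
    zeros = np.sum(lsb_data == 0)
    ones = np.sum(lsb_data == 1)
    total = zeros + ones
    ratio = ones / total

    print(f"{name} channel LSB: 0s={zeros} ({zeros/total*100:.1f}%), 1s={ones} ({ones/total*100:.1f}%)")
    if abs(ratio - 0.5) < 0.01:
        print(f"  NEUTRAL - Close to random (could be stego or natural)")
    elif abs(ratio - 0.5) > 0.05:
        print(f"  ANOMALY - Significant deviation from expected distribution")
    else:
        print(f"  INCONCLUSIVE - Mild deviation; compare against known-clean images")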

# Extract LSB data as bytes
lsb_bits = (pixels[:, :, 0] & 1).flatten()
lsb_bytes = np.packbits(lsb_bits)

# Check if extracted data has structure
with open('/cases/case-2024-001/analysis/lsb_extracted.bin', 'wb') as f:
    f.write(lsb_bytes.tobytes())

# Check for known file signatures in extracted data
header = bytes(lsb_bytes[:16])
print(f"\nLSB extracted header (hex): {header.hex()}")
if header[:4] == b'PK\x03\x04':
    print("  DETECTED: ZIP archive in LSB data!")
elif header[:3] == b'GIF':
    print("  DETECTED: GIF image in LSB data!")
elif header[:4] == b'\x89PNG':
    print("  DETECTED: PNG image in LSB data!")
elif header[:2] == b'\xff\xd8':
    print("  DETECTED: JPEG image in LSB data!")

# Generate LSB visualization (uses the last channel processed in the loop above)
lsb_img = Image.fromarray((lsb_data * 255).astype(np.uint8))
lsb_img.save('/cases/case-2024-001/analysis/lsb_visualization.png')
print("\nLSB visualization saved to lsb_visualization.png")
PYEOF
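
The script above extracts bits from the Red channel only. Some embedding tools instead spread the payload across R, G, and B in pixel order, so a single-channel extraction yields noise even when data is present. The sketch below is one possible generalization, under the assumption of row-major pixel order with channels interleaved per pixel; `extract_lsb_bytes` and its parameters are hypothetical names, not part of any tool listed here.

```python
import numpy as np

def extract_lsb_bytes(pixels: np.ndarray, channels=(0, 1, 2), bit: int = 0) -> bytes:
    """Pack one bit plane of the selected channels into bytes.

    Bits are read row by row, pixel by pixel, with the selected channels
    interleaved inside each pixel, and packed most-significant-bit first.
    """
    if pixels.ndim == 2:
        # Grayscale input: treat it as a single-channel image
        pixels = pixels[:, :, np.newaxis]
        channels = (0,)
    selected = pixels[:, :, list(channels)]
    bits = ((selected >> bit) & 1).astype(np.uint8).flatten()
    return np.packbits(bits).tobytes()
```

Calling it with `channels=(0,)`, `(1,)`, `(2,)`, and `(0, 1, 2)`, and with `bit=0` and `bit=1`, then checking each result for file signatures, covers the most common orderings. zsteg automates a much wider search over orderings and bit planes.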

Step 4: Analyze Audio and Video Steganography

# Spectral analysis of audio files
python3 << 'PYEOF'
import wave
import numpy as np

# Analyze WAV file for audio steganography
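with wave.open('/cases/case-2024-001/media/suspect_audio.wav', 'rb') as wav:
    # The analysis below assumes 16-bit PCM; other sample widths need a different dtype
    if wav.getsampwidth() != 2:
        raise SystemExit(f"Unsupported sample width: {wav.getsampwidth()} bytes (expected 2)")
    frames = wav.readframes(wav.getnframes())
    # Little-endian signed 16-bit samples; multi-channel audio is interleaved
    samples = np.frombuffer(frames, dtype='<i2')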

    # LSB analysis of audio samples
    lsb = samples & 1
    zeros = np.sum(lsb == 0)
    ones = np.sum(lsb == 1)
    total = len(lsb)

    print(f"Audio LSB Analysis:")
    print(f"  Samples: {total}")
    print(f"  LSB 0s: {zeros} ({zeros/total*100:.1f}%)")
    print(f"  LSB 1s: {ones} ({ones/total*100:.1f}%)")

    # Extract LSB data
    lsb_bytes = np.packbits(lsb)
    with open('/cases/case-2024-001/analysis/audio_lsb.bin', 'wb') as f:
        f.write(lsb_bytes.tobytes())

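    # Chi-square goodness-of-fit test against a uniform 50/50 LSB split.
    # Caveat: with millions of samples even clean recordings can reject
    # uniformity, while an encrypted payload pushes the LSBs toward 50/50.
    # Treat the result as a lead and compare against known-clean audio.
    from scipy import stats
    chi2, p_value = stats.chisquare([zeros, ones])
    print(f"  Chi-square: {chi2:.4f}, p-value: {p_value:.4f}")
    if p_value < 0.05:
        print(f"  ANOMALY: LSB split deviates from 50/50 (possible stego; verify against clean samples)")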
PYEOF

# Use steghide on audio files
steghide info /cases/case-2024-001/media/suspect_audio.wav

# Analyze with sonic-visualiser or audacity for spectral anomalies
# (Check spectrogram for hidden images encoded in frequency domain)

Step 5: Generate Steganalysis Report

# Compile findings
python3 << 'PYEOF'
import os, json

report = {
    "case": "2024-001",
    "files_analyzed": [],
    "findings": []
}

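media_dir = '/cases/case-2024-001/media/'
analysis_dir = '/cases/case-2024-001/analysis/'
report["files_analyzed"] = sorted(os.listdir(media_dir))

# Only tool output saved as .txt (e.g. via tee) is scanned. To include the
# DETECTED/WARNING/ANOMALY lines printed by the Python steps, pipe those
# steps through tee into the analysis directory as well.
keywords = ('DETECTED', 'SUCCESS', 'WARNING', 'ANOMALY')
for f in sorted(os.listdir(analysis_dir)):
    if f.endswith('.txt'):
        with open(os.path.join(analysis_dir, f), errors='replace') as fh:
            content = fh.read()
        if any(k in content for k in keywords):
            report["findings"].append({
                "source": f,
                "content": content[:500]
            })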

with open('/cases/case-2024-001/analysis/steg_report.json', 'w') as f:
    json.dump(report, f, indent=2)

print("Steganalysis report generated")
print(f"Total findings: {len(report['findings'])}")
PYEOF

Key Concepts

| Concept | Description |
|---|---|
| LSB (Least Significant Bit) | Embedding data in the lowest-order bits of pixel or sample values |
| DCT steganography | Hiding data in JPEG discrete cosine transform coefficients |
| Spread spectrum | Distributing hidden data across the entire carrier signal |
| Steganalysis | The science of detecting the presence of hidden information |
| Chi-square attack | Statistical test that detects the evened-out value-pair histogram left by LSB replacement |
| Cover medium | The original file used to carry hidden data (image, audio, video) |
| Stego medium | The resulting file after hidden data has been embedded |
| Capacity | Maximum amount of data that can be hidden without perceptible distortion |
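
To make the chi-square attack concrete: in the formulation usually attributed to Westfeld and Pfitzmann, LSB replacement is expected to equalize the histogram counts of each value pair (2k, 2k+1), so the test measures how close the even-valued counts are to the pair averages. The sketch below assumes 8-bit values and uses SciPy; the function name `pov_chi_square` and the `min_expected` cutoff are illustrative choices rather than a reference implementation.

```python
import numpy as np
from scipy import stats

def pov_chi_square(values: np.ndarray, min_expected: float = 5.0) -> float:
    """Pairs-of-values chi-square test on 8-bit data.

    Returns a value in [0, 1]; values near 1 mean the pairs (2k, 2k+1) are
    almost equalized, which is what LSB replacement tends to produce.
    Returns NaN when fewer than two pairs have enough samples.
    """
    hist = np.bincount(values.ravel(), minlength=256).astype(np.float64)
    even, odd = hist[0::2], hist[1::2]
    expected = (even + odd) / 2.0
    # Skip sparsely populated pairs, where the chi-square approximation is poor
    mask = expected >= min_expected
    if mask.sum() < 2:
        return float("nan")
    statistic = np.sum((even[mask] - expected[mask]) ** 2 / expected[mask])
    dof = int(mask.sum()) - 1
    return float(stats.chi2.sf(statistic, dof))
```

Running this on growing prefixes of the pixel data (the first 1%, 2%, and so on) gives a rough idea of where a sequentially embedded message ends, since the value typically drops once the prefix extends past the payload. Payloads scattered pseudo-randomly across the image, as steghide does, largely evade this test.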

Tools & Systems

| Tool | Purpose |
|---|---|
| steghide | Embed/extract data in JPEG, BMP, WAV, AU files |
| zsteg | Detect LSB steganography in PNG and BMP files |
| binwalk | Detect embedded files and data within binary files |
| stegoveritas | Comprehensive steganalysis tool with multiple detection methods |
| StegSolve | Java GUI tool for image bit plane and filter analysis |
| OpenStego | Open-source steganography and watermarking tool |
| ExifTool | Metadata extraction and analysis for media files |
| stegseek | Fast steghide password cracker for JPEG stego extraction |

Common Scenarios

Scenario 1: Covert Communication Investigation

  • Examine images exchanged between suspects via messaging platforms
  • Run stegoveritas and zsteg on all PNG/BMP files
  • Attempt steghide extraction with known passwords on JPEG files
  • Analyze LSB distributions for statistical anomalies
  • Extract and decode any hidden messages

Scenario 2: Data Exfiltration via Image Upload

  • Monitor images uploaded to cloud services for unusual file sizes
  • Compare image metadata with expected camera/device profiles
  • Run binwalk to detect embedded archives
  • Analyze JPEG quantization tables for steghide signatures
  • Extract and examine any hidden payloads

Scenario 3: Malware Command and Control

  • Analyze images downloaded by malware for embedded commands
  • Check for data appended after file end markers
  • Examine DNS query responses for base64-encoded data in TXT records
  • Analyze PNG IDAT chunks for anomalous compressed data sizes

Scenario 4: Intellectual Property Theft via Audio Files

  • Analyze audio files for embedded documents in LSB
  • Check spectrograms for visual patterns hidden in the frequency domain
  • Compare audio file sizes with expected sizes for bitrate and duration
  • Extract and analyze any hidden data payloads
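
The size comparison mentioned in Scenario 4 can be approximated for uncompressed WAV files using only the standard library: the header declares how many frames of audio the file contains, so the payload size is predictable. This is a rough sketch; `wav_size_discrepancy` is a hypothetical helper, and it covers PCM WAV only, not compressed formats.

```python
import os
import wave

def wav_size_discrepancy(path: str) -> int:
    """Return the file size minus the PCM payload size declared in the header.

    A WAV file with a minimal header and no extra chunks has a discrepancy
    of 44 bytes. Larger values mean additional chunks (often benign
    metadata) or data appended after the audio.
    """
    with wave.open(path, "rb") as wav:
        payload = wav.getnframes() * wav.getnchannels() * wav.getsampwidth()
    return os.path.getsize(path) - payload
```

A discrepancy well above a few hundred bytes is a reason to inspect the file with binwalk or a hex viewer. It is not proof of hidden data, because editors routinely add LIST or other metadata chunks.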

Output Format

Steganalysis Summary:
  Files Analyzed: 45 (32 images, 8 audio, 5 video)

  Detection Results:
    suspect_image_03.png:
      zsteg: Text detected in R channel LSB
      Content: "Meet at location B, Tuesday 1400"
      Method: LSB embedding in Red channel

    suspect_photo_17.jpg:
      steghide: Data extracted with password "secret123"
      Hidden file: confidential_report.pdf (234 KB)
      Method: DCT coefficient modification

    profile_pic.png:
      binwalk: ZIP archive embedded at offset 45678
      Contents: 3 spreadsheet files with financial data
      Method: Data appended after PNG IEND marker

    recording_05.wav:
      LSB analysis: Non-random distribution (p < 0.001)
      Extracted: 12 KB binary payload (further analysis needed)
      Method: Audio LSB embedding

  Clean Files: 41 (no steganographic indicators)
  Suspicious Files: 4 (data extracted)

  Report: /cases/case-2024-001/analysis/steg_report.json