Extracting Session Data Skill

Core Responsibility

Provide raw access to Claude Code session logs stored in ~/.claude/projects/{project-dir}/{session-id}.jsonl.

Key Principle: This skill extracts data only - return raw data to calling skills for analysis. Do not analyze or interpret within this skill.

Available Scripts

All scripts are located in the scripts/ subdirectory relative to this skill.

1. locate-logs.sh

Find log directory or specific session file path.

# Get logs directory for current working directory
scripts/locate-logs.sh

# Get logs directory for specific project
scripts/locate-logs.sh /path/to/project

# Get specific session log file path
scripts/locate-logs.sh /path/to/project abc123-session-id

Use when: Building dynamic paths, verifying logs exist before processing.
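A caller that wants to confirm a logs directory actually holds session files before processing could use a check like the one below. This is a minimal sketch: has_session_logs is a hypothetical helper, not part of the skill, and it assumes locate-logs.sh prints the directory path on stdout.

```shell
# Hypothetical helper (not part of the skill): succeed only if the
# directory exists and contains at least one .jsonl session file.
has_session_logs() {
    local dir=$1 f
    [ -d "$dir" ] || return 1
    for f in "$dir"/*.jsonl; do
        # An unmatched glob stays literal, so test that the file exists.
        if [ -e "$f" ]; then
            return 0
        fi
    done
    return 1
}
```

Under the stdout assumption above, usage might look like: has_session_logs "$(scripts/locate-logs.sh /path/to/project)".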

2. list-sessions.sh

Enumerate all sessions with metadata (ID, size, lines, date, branch).

# List all sessions (table format)
scripts/list-sessions.sh

# JSON output
scripts/list-sessions.sh --format json

# Sort by size or lines
scripts/list-sessions.sh --sort size
scripts/list-sessions.sh --sort lines

# Specific project
scripts/list-sessions.sh /path/to/project

Output formats: table, json, csv

Sort options: date, size, lines

Use when: Starting retrospective, showing available sessions to user, checking for recent sessions.

3. extract-data.sh

Parse JSONL logs and extract specific data types.

Available extraction types:

  • metadata - Session info (ID, timestamps, branch, working dir)
  • user-prompts - All user messages
  • tool-usage - Tool call statistics
  • errors - Failed tool calls with timestamps
  • thinking - Thinking blocks (if extended thinking enabled)
  • text-responses - Assistant text responses only
  • statistics - Session metrics (message counts, tool calls, errors)
  • all - Combined extraction

# Extract from specific session
scripts/extract-data.sh --type statistics --session SESSION_ID
scripts/extract-data.sh --type errors --session SESSION_ID
scripts/extract-data.sh --type tool-usage --session SESSION_ID

# Extract from all sessions (omit --session)
scripts/extract-data.sh --type statistics

# Limit output
scripts/extract-data.sh --type user-prompts --limit 10

# Different project
scripts/extract-data.sh --type metadata --project /path/to/project

Use when: Need specific data without loading entire log, generating metrics, identifying errors.

4. filter-sessions.sh

Find sessions matching criteria.

Filter options:

  • --since DATE - Sessions modified since date ("2 days ago", "2025-10-20")
  • --until DATE - Sessions modified until date
  • --branch NAME - Sessions on specific git branch
  • --min-size SIZE - Minimum file size ("1M", "500K")
  • --max-size SIZE - Maximum file size
  • --min-lines N - Minimum line count
  • --max-lines N - Maximum line count
  • --has-errors - Only sessions with failed tool calls
  • --keyword WORD - Sessions containing keyword

Output formats: list, paths, json

# Recent sessions
scripts/filter-sessions.sh --since "2 days ago"

# Large sessions with errors
scripts/filter-sessions.sh --min-lines 500 --has-errors

# Sessions on main branch in last week
scripts/filter-sessions.sh --branch main --since "7 days ago"

# Sessions containing keyword
scripts/filter-sessions.sh --keyword "authentication"

# Get paths only (for piping)
scripts/filter-sessions.sh --since "1 day ago" --format paths

Use when: User requests analysis of recent sessions, finding sessions for specific feature/branch, identifying problematic sessions.

Working Process

Single Session Analysis

# 1. Verify session exists and get metadata
scripts/extract-data.sh --type metadata --session SESSION_ID

# 2. Get session statistics (to determine size)
scripts/extract-data.sh --type statistics --session SESSION_ID

# 3. Extract specific data as needed
scripts/extract-data.sh --type errors --session SESSION_ID
scripts/extract-data.sh --type tool-usage --session SESSION_ID

Multiple Session Analysis

# 1. Filter to find relevant sessions
scripts/filter-sessions.sh --since "7 days ago" --branch main

# 2. Extract data from every session in the project (omitting --session
#    processes all sessions, not only the filtered ones)
scripts/extract-data.sh --type statistics

# 3. Or iterate through a filtered subset
#    (relies on word splitting, so paths must not contain whitespace)
SESSIONS=$(scripts/filter-sessions.sh --has-errors --format paths)
for session in $SESSIONS; do
    SESSION_ID=$(basename "$session" .jsonl)
    scripts/extract-data.sh --type errors --session "$SESSION_ID"
done
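The loop above derives each session ID from its log path. If paths may contain spaces, that step can be factored into a filter that reads one path per line. This is a sketch only: session_ids_from_paths is a hypothetical name, and it assumes --format paths emits one path per line.

```shell
# Hypothetical helper (not part of the skill): read one log path per line
# on stdin and print the session ID (file name without .jsonl) for each.
session_ids_from_paths() {
    local path
    while IFS= read -r path; do
        # Skip blank lines.
        [ -n "$path" ] || continue
        basename "$path" .jsonl
    done
}

# Demo with a path that contains a space.
printf '%s\n' "/tmp/my proj/abc123.jsonl" | session_ids_from_paths
```

Reading line by line with IFS= read -r keeps a path with embedded spaces intact, which plain word splitting does not.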

Integration Pattern for Calling Skills

When another skill (like retrospecting) needs session data:

  1. Discovery: Use list-sessions.sh or filter-sessions.sh to find relevant sessions
  2. Size Check: Use extract-data.sh --type statistics to determine session complexity
  3. Targeted Extraction: Use extract-data.sh with specific types for needed data
  4. Return Raw Data: Return extracted data to caller for analysis

Example:

# Get latest session ID
LATEST=$(scripts/list-sessions.sh --format json --sort date | jq -r '.[0].sessionId')

# Check size before processing
STATS=$(scripts/extract-data.sh --type statistics --session "$LATEST")
LINE_COUNT=$(echo "$STATS" | awk '/Total Lines:/ {print $3}')

# Extract based on size
if [ "$LINE_COUNT" -lt 500 ]; then
    # Small session: extract detail
    scripts/extract-data.sh --type errors --session "$LATEST"
    scripts/extract-data.sh --type tool-usage --session "$LATEST"
else
    # Large session: summary only
    scripts/extract-data.sh --type statistics --session "$LATEST"
fi

Context Budget Management

CRITICAL: This skill is designed for context efficiency

Use Bash Processing, Not Read Tool

# GOOD: Extract via bash, stays in bash context
STATS=$(scripts/extract-data.sh --type statistics)
# Process $STATS in bash

# BAD: Reading full log files
Read ~/.claude/projects/-path/session.jsonl
# Loads entire file into context unnecessarily

Check Session Size Before Loading

Never load full session logs into context without checking size first.

# Always check statistics first
scripts/extract-data.sh --type statistics --session SESSION_ID
# Shows total lines, message counts, etc.

# Decision rules:
# - Small (<500 lines): Can extract detail safely
# - Medium (500-2000 lines): Use selective extraction
# - Large (>2000 lines): Statistics only, offer targeted deep-dives
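The decision rules above can be written as a small function so callers apply the same thresholds consistently. This is an illustrative sketch: size_tier is a hypothetical name, and only the 500 and 2000 line thresholds come from this document.

```shell
# Hypothetical helper (not part of the skill): map a line count to the
# size tier from the decision rules. Prints small, medium, or large.
# Returns 1 and prints nothing if the argument is not a whole number.
size_tier() {
    local lines=$1
    case "$lines" in
        ''|*[!0-9]*) return 1 ;;
    esac
    if [ "$lines" -lt 500 ]; then
        echo small
    elif [ "$lines" -le 2000 ]; then
        echo medium
    else
        echo large
    fi
}

size_tier 450    # prints: small
size_tier 2001   # prints: large
```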

Return Raw Data to Caller

This skill should:

  • Execute bash scripts to extract data
  • Return raw text output to calling skill
  • Let calling skill manage context for analysis
  • Avoid interpretation or analysis within this skill

Output Format

Return raw extracted data with minimal formatting:

# Statistics output
Session: abc123-def456-ghi789
  Total Lines: 450
  User Messages: 12
  Assistant Messages: 23
  Tool Calls: 45
  Errors: 2

# Tool usage output
=== Tool Usage: abc123-def456-ghi789 ===
Read                          15
Bash                          12
Edit                          8
Grep                          5
Write                         3

No analysis, no interpretation - just data extraction.
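A calling skill that needs a single number from the statistics output could pull it out by label. This is a sketch under the assumption that the real output matches the "Label: value" sample above. stat_field is a hypothetical helper and would need adjusting if the script's format differs.

```shell
# Hypothetical helper (not part of the skill): read statistics text on
# stdin and print the value that follows the given label.
stat_field() {
    awk -F': *' -v label="$1" '
        { key = $1; sub(/^[ \t]+/, "", key) }   # strip leading indentation
        key == label { print $2; exit }
    '
}

# Demo against the sample statistics output shown above.
printf '%s\n' "Session: abc123-def456-ghi789" "  Total Lines: 450" |
    stat_field "Total Lines"   # prints: 450
```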

Error Handling

All scripts exit with a non-zero status on error and write the error message to stderr.

Check exit status before processing:

if ! scripts/locate-logs.sh /path/to/project &>/dev/null; then
    # Handle: logs directory doesn't exist
    echo "Project has no session logs yet"
fi

if ! scripts/extract-data.sh --type metadata --session abc123 &>/dev/null; then
    # Handle: session doesn't exist
    echo "Session not found"
fi

Common error messages:

  • Error: Logs directory not found: ~/.claude/projects/-path
  • Error: Session file not found: ~/.claude/projects/-path/session-id.jsonl
  • Error: --type is required
  • Error: jq is required but not installed. Install with: brew install jq
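The checks above discard stderr. A caller that wants to keep the script's own error message can capture it instead. This is a minimal sketch: run_or_report is a hypothetical wrapper, and the "extraction failed" wording is illustrative rather than anything the scripts print.

```shell
# Hypothetical wrapper (not part of the skill): run a command, pass its
# stdout through, and on failure re-emit its stderr with the exit status.
run_or_report() {
    local err_file rc
    err_file=$(mktemp) || return 1
    if "$@" 2>"$err_file"; then
        rc=0
    else
        rc=$?
        printf 'extraction failed (exit %s): %s\n' "$rc" "$(cat "$err_file")" >&2
    fi
    rm -f "$err_file"
    return "$rc"
}
```

Usage might look like: run_or_report scripts/extract-data.sh --type metadata --session abc123.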

Path Calculation

Claude Code stores sessions using this pattern:

~/.claude/projects/{project-identifier}/{session-id}.jsonl

Where {project-identifier} is calculated by replacing all / with - in the absolute working directory path:

# Example: /Users/user/project → -Users-user-project
PROJECT_ID=$(echo "${PWD}" | sed 's|/|-|g')
LOGS_DIR="${HOME}/.claude/projects/${PROJECT_ID}"

All scripts use locate-logs.sh internally for consistent path calculation.
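For illustration, the same calculation can be wrapped in functions that take any absolute path rather than only the current directory. This is a sketch of the rule described above. project_id_for and logs_dir_for are hypothetical names, and locate-logs.sh remains the recommended way to get these paths.

```shell
# Hypothetical helper (not part of the skill): derive the project
# identifier by replacing every "/" in an absolute path with "-".
project_id_for() {
    printf '%s\n' "$1" | tr '/' '-'
}

# Hypothetical helper: build the logs directory path for a project.
logs_dir_for() {
    printf '%s/.claude/projects/%s\n' "$HOME" "$(project_id_for "$1")"
}

project_id_for "/Users/user/project"   # prints: -Users-user-project
```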

Anti-Patterns to Avoid

Don't:

  • Load full session logs into context without checking size
  • Parse JSONL manually - use extract-data.sh
  • Hardcode log paths - use locate-logs.sh
  • Analyze or interpret data - return raw data to caller
  • Process large logs synchronously without user awareness

Do:

  • Check session size with --type statistics before processing
  • Use appropriate extraction type for specific needs
  • Filter sessions before extraction for efficiency
  • Stream/pipe data when processing multiple sessions
  • Return raw data for caller to analyze

Success Criteria

Effective use of this skill means:

  1. Efficient Discovery: Quickly find relevant sessions without manual searching
  2. Targeted Extraction: Get exactly the data needed, nothing more
  3. Context Preservation: Avoid loading unnecessary data into context
  4. Raw Data Focus: Return unprocessed data for caller to analyze
  5. Multi-Session Support: Handle analysis across timeframes or branches efficiently

Dependencies

Required:

  • bash (v4.0+)
  • jq (JSON parser)

Scripts check for jq and provide installation instructions if missing:

Error: jq is required but not installed. Install with: brew install jq
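A calling skill can run a similar dependency check before invoking the scripts. This is a sketch only: require_cmd is a hypothetical helper, and its message mirrors the one above without the platform-specific install hint.

```shell
# Hypothetical helper (not part of the skill): fail with a message on
# stderr if the named command is not available on PATH.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "Error: $1 is required but not installed." >&2
        return 1
    fi
}
```

Usage might look like: require_cmd jq || exit 1.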