# Continuous Learning

Part of [Agent Skills™](https://github.com/itallstartedwithaidea/agent-skills) by [googleadsagent.ai™](https://googleadsagent.ai)
## Description
Continuous Learning enables agents to automatically extract successful patterns from completed sessions and codify them into reusable skills, rules, and prompt refinements. Rather than relying on manual skill authoring, a Continuous Learning system treats every agent session as a potential source of new capability. When the agent discovers an effective approach, solves a novel problem, or recovers from an error in a replicable way, the system captures that behavior and integrates it into the agent's skill repertoire.
This skill encodes the learning flywheel built into Buddy™ at googleadsagent.ai™, where cross-session pattern mining has generated dozens of specialized Google Ads analysis techniques that no human engineer explicitly programmed. The system observes which tool sequences produce high-quality outcomes, which prompt modifications improve accuracy, and which error recovery strategies succeed — then packages these observations into structured skills that future sessions can leverage.
The learning pipeline operates in four stages: observation (logging session events with outcome annotations), mining (identifying statistically significant patterns across sessions), validation (testing candidate skills against held-out sessions), and integration (deploying validated skills into the agent's active skill set). This mirrors the scientific method applied to agent behavior: observe, hypothesize, test, deploy.
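The four stages can be sketched as a minimal driver loop. This is an illustrative assumption, not a prescribed API: the `Stage` enum and the `miner`/`validator`/`deployer` parameters are hypothetical names for the components described above.

```python
# Hypothetical sketch of the four-stage pipeline as a simple driver loop.
# Stage names come from the text; everything else is illustrative.
from enum import Enum, auto

class Stage(Enum):
    OBSERVATION = auto()   # log session events with outcome annotations
    MINING = auto()        # find statistically significant patterns
    VALIDATION = auto()    # test candidates against held-out sessions
    INTEGRATION = auto()   # deploy validated skills into the active set

def run_pipeline(sessions, miner, validator, deployer):
    candidates = miner(sessions)                          # MINING
    validated = [c for c in candidates if validator(c)]   # VALIDATION
    for skill in validated:
        deployer(skill)                                   # INTEGRATION
    return validated
```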
## Use When
- You want agent capabilities to improve automatically over time without manual intervention
- The agent performs repetitive domain-specific tasks where patterns emerge across sessions
- New team members need to benefit from patterns discovered by experienced users
- You need to maintain a living knowledge base that reflects actual best practices
- You want to A/B test different agent approaches and promote winners automatically
- You want to reduce reliance on manual prompt engineering by automating skill derivation
## How It Works
```mermaid
graph TD
    A[Session Completion] --> B[Event Logger]
    B --> C[Outcome Annotation]
    C --> D[Pattern Mining Engine]
    D --> E{Significant Pattern?}
    E -->|Yes| F[Skill Template Generator]
    E -->|No| G[Archive for Future Mining]
    F --> H[Candidate Skill]
    H --> I[Validation Suite]
    I -->|Pass| J[Version + Deploy]
    I -->|Fail| K[Refine Hypothesis]
    K --> D
    J --> L[Active Skill Set]
    L --> M[A/B Test Monitor]
    M -->|Winner| N[Promote to Default]
    M -->|Loser| O[Deprecate]
```
The learning loop begins after each session completes. The event logger captures the full execution trace with outcome annotations (success, partial, failure, user satisfaction signals). The pattern mining engine runs periodically across accumulated sessions, searching for statistically significant correlations between agent behaviors and positive outcomes. Candidate patterns are transformed into skill templates — structured SKILL.md files with instructions, examples, and constraints. Validation runs the candidate skill against held-out sessions to verify it improves outcomes. Validated skills are versioned and deployed, with A/B testing monitoring comparative performance.
## Implementation
### Outcome-Annotated Event Logger
```python
from dataclasses import dataclass, asdict

@dataclass
class AnnotatedEvent:
    session_id: str
    event_type: str
    content: dict
    outcome_score: float  # 0.0 = failure, 1.0 = success
    user_feedback: str | None = None

class SessionLogger:
    def __init__(self, store):
        self.store = store
        self.buffer = []

    def log(self, event: AnnotatedEvent):
        self.buffer.append(event)

    async def flush(self, session_outcome: float):
        # Back-fill the session-level outcome onto events that were
        # logged before the outcome was known.
        for event in self.buffer:
            if event.outcome_score == 0.0:
                event.outcome_score = session_outcome
            await self.store.append(event.session_id, asdict(event))
        self.buffer.clear()
```
### Pattern Mining Engine
```python
class PatternMiner:
    def __init__(self, min_occurrences=5, min_success_rate=0.8):
        self.min_occurrences = min_occurrences
        self.min_success_rate = min_success_rate

    async def mine(self, sessions: list[list[AnnotatedEvent]]) -> list[dict]:
        tool_sequences = self.extract_tool_sequences(sessions)
        # extract_prompt_patterns and extract_recovery_patterns follow the
        # same shape as extract_tool_sequences (omitted here for brevity)
        prompt_patterns = self.extract_prompt_patterns(sessions)
        recovery_strategies = self.extract_recovery_patterns(sessions)
        candidates = []
        for pattern_set in [tool_sequences, prompt_patterns, recovery_strategies]:
            for pattern in pattern_set:
                if (pattern["occurrences"] >= self.min_occurrences
                        and pattern["success_rate"] >= self.min_success_rate):
                    candidates.append(pattern)
        return sorted(candidates, key=lambda p: p["success_rate"] * p["occurrences"], reverse=True)

    def extract_tool_sequences(self, sessions):
        sequences = {}
        for session in sessions:
            tools = [e for e in session if e.event_type == "tool_call"]
            for window in range(2, 5):  # sliding windows of length 2-4
                for i in range(len(tools) - window + 1):
                    seq = tuple(t.content.get("name", "") for t in tools[i:i + window])
                    outcome = sum(t.outcome_score for t in tools[i:i + window]) / window
                    if seq not in sequences:
                        sequences[seq] = {"occurrences": 0, "success_sum": 0.0}
                    sequences[seq]["occurrences"] += 1
                    sequences[seq]["success_sum"] += outcome
        return [
            {"type": "tool_sequence", "pattern": seq, "occurrences": s["occurrences"],
             "success_rate": s["success_sum"] / s["occurrences"]}
            for seq, s in sequences.items()
        ]
```
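The `mine` method also calls `extract_recovery_patterns`, which the section does not show. A hypothetical sketch of what it might look like, using plain `(event_type, name, outcome_score)` tuples instead of `AnnotatedEvent` so the example stays self-contained:

```python
# Hypothetical recovery-pattern extractor: pairs each error with the next
# tool call and tracks how often that recovery tool led to a good outcome.
from collections import defaultdict

def extract_recovery_patterns(sessions):
    stats = defaultdict(lambda: {"occurrences": 0, "success_sum": 0.0})
    for session in sessions:
        for i, (etype, name, _) in enumerate(session):
            if etype != "error":
                continue
            # the first tool called after an error is the candidate recovery
            for etype2, name2, score in session[i + 1:]:
                if etype2 == "tool_call":
                    key = (name, name2)  # (error kind, recovery tool)
                    stats[key]["occurrences"] += 1
                    stats[key]["success_sum"] += score
                    break
    return [
        {"type": "recovery", "pattern": k,
         "occurrences": s["occurrences"],
         "success_rate": s["success_sum"] / s["occurrences"]}
        for k, s in stats.items()
    ]
```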
### Skill Template Generator
```python
import json
from datetime import datetime, timezone

SKILL_TEMPLATE = """# {name}

Part of [Agent Skills™](https://github.com/itallstartedwithaidea/agent-skills) by [googleadsagent.ai™](https://googleadsagent.ai)

<!-- AUTO-GENERATED from pattern mining | v{version} | confidence: {confidence:.0%} -->

## Description
{description}

## Use When
{use_when}

## Implementation
{implementation}

## Derived From
- Sessions analyzed: {sessions_analyzed}
- Pattern frequency: {frequency}
- Success rate: {success_rate:.0%}
- Generated: {generated_date}
"""

class SkillGenerator:
    def __init__(self, model):
        self.model = model

    async def generate_skill(self, pattern: dict, examples: list) -> str:
        prompt = f"""Generate a SKILL.md for the following discovered pattern:

Pattern type: {pattern['type']}
Pattern: {pattern['pattern']}
Success rate: {pattern['success_rate']:.0%}
Occurrences: {pattern['occurrences']}

Example sessions where this pattern succeeded:
{json.dumps(examples[:3], indent=2)}

Write a clear, actionable description, use-when conditions, and implementation guidance."""
        content = await self.model.generate(prompt)
        return SKILL_TEMPLATE.format(
            name=pattern.get("name", "Discovered Pattern"),
            version="1.0.0",
            confidence=pattern["success_rate"],
            description=content,
            use_when="(see description)",
            implementation="(see description)",
            sessions_analyzed=pattern.get("sessions_analyzed", "N/A"),
            frequency=pattern["occurrences"],
            success_rate=pattern["success_rate"],
            generated_date=datetime.now(timezone.utc).isoformat(),
        )
```
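The validation stage described earlier (testing candidates against held-out sessions) might look like the following sketch. The `(tool_names, outcome)` session shape and the `min_lift` threshold are assumptions made for illustration.

```python
# Hedged validation sketch: a candidate tool sequence passes only if
# held-out sessions containing it score meaningfully higher than those
# that do not. Session shape and min_lift are illustrative assumptions.
def contains_seq(tools, seq):
    n = len(seq)
    return any(tuple(tools[i:i + n]) == seq for i in range(len(tools) - n + 1))

def validate_candidate(pattern_seq, held_out, min_lift=0.1):
    with_p = [o for tools, o in held_out if contains_seq(tools, pattern_seq)]
    without = [o for tools, o in held_out if not contains_seq(tools, pattern_seq)]
    if not with_p or not without:
        return False  # cannot compare; fail safe
    lift = sum(with_p) / len(with_p) - sum(without) / len(without)
    return lift >= min_lift
```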
### A/B Testing Framework
```typescript
interface SkillVariant {
  id: string;
  skillContent: string;
  version: string;
  metrics: { invocations: number; successes: number; avgScore: number };
}

// Any deterministic string hash works here; a minimal one for illustration.
function simpleHash(s: string): number {
  let h = 0;
  for (let i = 0; i < s.length; i++) {
    h = (h * 31 + s.charCodeAt(i)) | 0;
  }
  return Math.abs(h);
}

class SkillABTest {
  constructor(
    private control: SkillVariant,
    private treatment: SkillVariant,
    private minSamples: number = 50,
  ) {}

  selectVariant(sessionId: string): SkillVariant {
    const hash = simpleHash(sessionId);
    return hash % 2 === 0 ? this.control : this.treatment;
  }

  recordOutcome(variant: SkillVariant, score: number): void {
    variant.metrics.invocations++;
    variant.metrics.successes += score >= 0.8 ? 1 : 0;
    variant.metrics.avgScore =
      (variant.metrics.avgScore * (variant.metrics.invocations - 1) + score) /
      variant.metrics.invocations;
  }

  getWinner(): SkillVariant | null {
    if (this.control.metrics.invocations < this.minSamples) return null;
    if (this.treatment.metrics.invocations < this.minSamples) return null;
    const controlRate = this.control.metrics.successes / this.control.metrics.invocations;
    const treatRate = this.treatment.metrics.successes / this.treatment.metrics.invocations;
    const diff = Math.abs(controlRate - treatRate);
    // Naive fixed threshold; a significance test is sounder in production.
    if (diff > 0.05) {
      return controlRate > treatRate ? this.control : this.treatment;
    }
    return null;
  }
}
```
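The fixed 5% gap in `getWinner` is a naive decision rule. A two-proportion z-test is one statistically sounder alternative, sketched below in Python for brevity (1.96 is the two-sided 95% critical value; the function name is illustrative):

```python
# Two-proportion z-test as a winner check: returns "a", "b", or None
# when the difference is not significant at the chosen level.
import math

def significant_winner(succ_a, n_a, succ_b, n_b, z_crit=1.96):
    p_a, p_b = succ_a / n_a, succ_b / n_b
    pooled = (succ_a + succ_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return None
    z = (p_a - p_b) / se
    if abs(z) < z_crit:
        return None  # no significant difference yet
    return "a" if z > 0 else "b"
```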
## Best Practices
- Annotate outcomes, not just events — raw event logs without success/failure labels are useless for mining; attach outcome scores to every significant event.
- Require statistical significance — a pattern observed 3 times is anecdotal; require at least 5-10 occurrences with consistent success rates before promoting to candidate skill.
- Validate on held-out data — never test a skill only on the sessions it was derived from; use held-out sessions to verify generalization.
- Version every generated skill — auto-generated skills evolve as new data arrives; version them to track improvements and enable rollback.
- A/B test before full deployment — run new skills against the current default on a subset of sessions; promote only after demonstrable improvement.
- Cap auto-generated skill count — too many overlapping skills confuse the agent; limit the active set and deprecate underperformers.
- Human review for high-impact skills — auto-generated skills for critical domains should be reviewed by a domain expert before deployment.
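The "require statistical significance" practice can be made concrete with a Wilson lower confidence bound on the observed success rate, which automatically demands more occurrences before promotion. The 0.8 threshold below mirrors `min_success_rate` in the miner; the helper names are illustrative.

```python
# Promote a pattern only if the Wilson lower bound of its success rate
# clears the threshold; a handful of lucky successes will not pass.
import math

def wilson_lower_bound(successes, n, z=1.96):
    if n == 0:
        return 0.0
    p = successes / n
    denom = 1 + z * z / n
    center = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center - margin) / denom

def promote(successes, n, threshold=0.8):
    return wilson_lower_bound(successes, n) >= threshold
```

Note that 5 successes out of 5 still fails this check: the sample is too small for the lower bound to clear 0.8.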
## Platform Compatibility
| Feature | Claude Code | Cursor | Codex | Gemini CLI |
|---|---|---|---|---|
| Session logging | ✅ Hooks | ✅ Extensions | ✅ Custom | ✅ Custom |
| Pattern mining | ✅ External scripts | ✅ External scripts | ✅ External scripts | ✅ External scripts |
| Skill generation | ✅ SKILL.md compat | ✅ SKILL.md native | ✅ Instructions | ✅ System prompts |
| A/B testing | ✅ Custom | ✅ Custom | ✅ Custom | ✅ Custom |
| Auto-deployment | ✅ File write hooks | ✅ Skill directory | ⚠️ Manual | ⚠️ Manual |
## Related Skills
- Memory Persistence - Session data persistence provides the raw material for cross-session pattern mining
- Knowledge Base Injection - Mined patterns are deployed as injectable knowledge for future agent sessions
- Entity Memory Management - Entity extraction feeds the pattern mining pipeline with structured observations
- PMax Optimization - Auto-discovers optimization patterns from Performance Max campaign data
## Keywords
continuous-learning, pattern-mining, skill-generation, a-b-testing, feedback-loops, auto-improvement, session-analysis, knowledge-derivation, skill-versioning, agent-skills
© 2026 googleadsagent.ai™ | Agent Skills™ | MIT License