NYC
skills/vamseeachanta/workspace-hub/optimization-load-balancer

optimization-load-balancer

SKILL.md

Load Balancing Coordinator Skill

Overview

This skill provides comprehensive load balancing capabilities including work-stealing algorithms, dynamic task distribution, queue management, and adaptive resource allocation for optimal swarm coordination.

When to Use

  • Distributing tasks across multiple agents efficiently
  • Preventing agent overload while maximizing utilization
  • Implementing fair scheduling across different task priorities
  • Optimizing throughput in distributed systems
  • Handling variable workloads with adaptive balancing
  • Migrating tasks from overloaded to underloaded agents

Quick Start

# Initialize load balancer
npx claude-flow agent spawn load-balancer --type coordinator

# Start load balancing
npx claude-flow load-balance --swarm-id <id> --strategy adaptive

# Monitor load distribution
npx claude-flow agent-metrics --type load-balancer

# Adjust balancing parameters
npx claude-flow config-manage --action update --config '{"stealThreshold": 5, "agingBoost": 10}'

Architecture

+-----------------------------------------------------------+
|                Load Balancing Coordinator                  |
+-----------------------------------------------------------+
|  Work Stealer  |  Load Balancer  |  Queue Manager         |
+----------------+-----------------+-------------------------+
        |                |                    |
        v                v                    v
+---------------+  +-----------------+  +------------------+
| Victim Select |  | Agent Capacity  |  | Priority Queues  |
| - Heaviest    |  | - Load Tracking |  | - Critical       |
| - Threshold   |  | - Performance   |  | - High/Normal    |
| - Locality    |  | - Migration     |  | - Low/Background |
+---------------+  +-----------------+  +------------------+
        |                |                    |
        v                v                    v
+-----------------------------------------------------------+
|              Resource Optimization Engine                  |
+-----------------------------------------------------------+

Core Capabilities

1. Work-Stealing Algorithm

Efficiently redistributes work from overloaded agents:

// Work-stealing configuration
const workStealing = {
  stealThreshold: 5,       // Steal when queue > 5 tasks
  stealPercentage: 0.5,    // Take 50% of victim's queue
  victimSelection: 'heaviest',  // Steal from busiest agent
  localityAware: true      // Prefer nearby agents
};

// Victim selection strategies:
// - heaviest: Steal from agent with most tasks
// - random: Random selection for fairness
// - locality: Prefer agents in same topology region

2. Dynamic Load Balancing

Real-time load distribution:

Strategy Description Best For
Round Robin Sequential distribution Uniform tasks
Weighted Based on agent capacity Heterogeneous agents
Least Connections To least loaded agent Variable task duration
Adaptive ML-based optimization Complex workloads

3. Queue Management

Multi-level feedback queue scheduling:

Priority Level Weight Use Case
Critical 40% System-critical tasks
High 30% User-facing operations
Normal 20% Standard processing
Low 10% Background tasks

4. Resource Allocation

Multi-objective optimization:

  • Minimize latency
  • Maximize utilization
  • Balance load
  • Minimize cost

Scheduling Algorithms

Earliest Deadline First (EDF)

// EDF for real-time task scheduling
const edfScheduler = {
  schedule(tasks) {
    return tasks.sort((a, b) => a.deadline - b.deadline);
  },

  // Liu & Layland utilization bound
  admissionControl(newTask, existingTasks) {
    const utilization = [...existingTasks, newTask]
      .reduce((sum, t) => sum + (t.executionTime / t.period), 0);
    return utilization <= 1.0;
  }
};

Completely Fair Scheduler (CFS)

// CFS for fair task distribution
const cfsScheduler = {
  virtualRuntime: new Map(),
  weights: new Map(),

  schedule() {
    // Select task with minimum virtual runtime
    return this.getMinVirtualRuntimeTask();
  },

  updateVirtualRuntime(task, elapsedTime) {
    const weight = this.weights.get(task.id) || 1;
    const vruntime = this.virtualRuntime.get(task.id) || 0;
    this.virtualRuntime.set(task.id, vruntime + (elapsedTime / weight));
  }
};

Weighted Fair Queuing (WFQ)

Proportional bandwidth allocation based on agent weights.

MCP Integration

// MCP load balancing integration
const loadBalancingIntegration = {
  // Real-time metrics collection
  async collectMetrics() {
    const [performance, bottlenecks, tokenUsage] = await Promise.all([
      mcp.performance_report({ format: 'json' }),
      mcp.bottleneck_analyze({}),
      mcp.token_usage({})
    ]);

    return { performance, bottlenecks, tokenUsage, timestamp: Date.now() };
  },

  // Execute load balancing
  async coordinateLoadBalancing(swarmId) {
    const agents = await mcp.agent_list({ swarmId });
    const metrics = await mcp.agent_metrics({});

    const rebalancing = this.calculateRebalancing(agents, metrics);

    if (rebalancing.required) {
      await mcp.load_balance({
        swarmId,
        tasks: rebalancing.taskMigrations
      });
    }

    return rebalancing;
  }
};

Circuit Breaker Pattern

Protect against cascade failures:

const circuitBreaker = {
  state: 'CLOSED',           // CLOSED, OPEN, HALF_OPEN
  failureThreshold: 5,       // Open after 5 failures
  successThreshold: 3,       // Close after 3 successes
  timeout: 60000,            // Recovery timeout (ms)

  async execute(operation, fallback) {
    if (this.state === 'OPEN' && !this.shouldAttemptReset()) {
      return fallback ? await fallback() : null;
    }

    try {
      const result = await operation();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      if (fallback) return fallback();
      throw error;
    }
  }
};

Key Metrics

Performance Indicators

Metric Description Target
Load Distribution Variance Balance across agents < 0.1
Task Migration Rate Work-stealing frequency < 5%
Queue Latency Time in queue < 100ms
Utilization Efficiency Resource usage > 80%
Fairness Index Jain's fairness > 0.9

Benchmarking

// Load balancer benchmarks
const benchmarks = {
  async throughputTest(taskCount, agentCount) {
    const startTime = performance.now();
    await this.distributeAndExecute(taskCount, agentCount);
    const endTime = performance.now();

    return {
      throughput: taskCount / ((endTime - startTime) / 1000),
      averageLatency: (endTime - startTime) / taskCount
    };
  },

  async loadBalanceEfficiency(tasks, agents) {
    const distribution = await this.distributeLoad(tasks, agents);
    const idealLoad = tasks.length / agents.length;

    const variance = distribution.reduce((sum, load) =>
      sum + Math.pow(load - idealLoad, 2), 0) / agents.length;

    return {
      efficiency: 1 / (1 + variance),
      loadVariance: variance
    };
  }
};

Commands Reference

# Real-time load monitoring
npx claude-flow performance-report --format detailed

# Bottleneck analysis
npx claude-flow bottleneck-analyze --component swarm-coordination

# Resource utilization tracking
npx claude-flow metrics-collect --components ["load-balancer", "task-queue"]

# Configure load balancing strategy
npx claude-flow config-manage --action update \
  --config '{"strategy": "adaptive", "threshold": 0.8}'

Integration Points

Integration Purpose
Performance Monitor Real-time metrics for load decisions
Topology Optimizer Coordinate topology changes with load
Resource Allocator Optimize resource distribution
Task Orchestrator Receive load-balanced assignments

Best Practices

  1. Gradual Migration: Move tasks incrementally to avoid oscillation
  2. Locality Awareness: Prefer local task execution to minimize latency
  3. Priority Preservation: Maintain task priorities during migration
  4. Monitoring: Track load balance metrics continuously
  5. Adaptive Thresholds: Adjust thresholds based on workload patterns
  6. Circuit Breakers: Protect against cascade failures

Example: Adaptive Load Balancing

// Adaptive load balancing strategy
const adaptiveBalancer = {
  config: {
    checkInterval: 5000,     // Check every 5 seconds
    migrationThreshold: 0.3, // Migrate if imbalance > 30%
    cooldownPeriod: 30000,   // Wait 30s between migrations
    maxMigrations: 5         // Max 5 migrations per cycle
  },

  async balance(swarm) {
    const loads = await this.getAgentLoads(swarm);
    const average = loads.reduce((a, b) => a + b) / loads.length;

    const overloaded = loads.filter(l => l > average * 1.3);
    const underloaded = loads.filter(l => l < average * 0.7);

    if (overloaded.length > 0 && underloaded.length > 0) {
      await this.migrateTasks(overloaded, underloaded);
    }
  }
};

Related Skills

  • optimization-monitor - Real-time performance monitoring
  • optimization-resources - Resource allocation and scaling
  • optimization-topology - Network topology optimization
  • optimization-benchmark - Performance validation

Version History

  • 1.0.0 (2026-01-02): Initial release - converted from load-balancer agent with work-stealing, queue management, scheduling algorithms, circuit breaker pattern, and adaptive balancing
Weekly Installs
4
First Seen
Jan 24, 2026
Installed on
claude-code3
windsurf2
trae2
opencode2
cursor2
codex2