Load Testing

Purpose

Validate application performance under realistic and peak load conditions, identify scalability bottlenecks, and ensure systems meet performance SLAs.

When to Use

Before production deployment
After performance optimizations
Capacity planning
Validating scalability
Testing under expected peak loads

Key Capabilities

Load Test Design - Create realistic test scenarios and user patterns
Performance Validation - Measure response times, throughput, and resource usage
Scalability Analysis - Identify breaking points and bottlenecks

Approach

Define Performance Requirements
- Target response times (p50, p95, p99)
- Expected throughput (requests/second)
- Concurrent users
- Resource limits (CPU, memory)
Design Test Scenarios
- User workflows (login → browse → checkout)
- Traffic patterns (gradual ramp, spike, sustained)
- Data variations (small/large payloads)
- Think time between requests
Select Tools
- k6: Modern, JavaScript-based, great reporting
- JMeter: Feature-rich, GUI-based
- Locust: Python-based, distributed testing
- Gatling: Scala-based, detailed reports
- ab/wrk: Simple command-line tools
Execute Tests
- Start with baseline (single user)
- Ramp up gradually
- Test at target load
- Spike test (sudden traffic increase)
- Endurance test (sustained load)
Analyze Results
- Response time percentiles
- Error rates
- Throughput degradation
- Resource utilization
- Database connection pool exhaustion

Example

Context: Load testing an e-commerce API

Test Scenario (k6):

// load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate } from 'k6/metrics';

const errorRate = new Rate('errors');

export const options = {
  stages: [
    { duration: '2m', target: 100 },  // Ramp to 100 users
    { duration: '5m', target: 100 },  // Stay at 100
    { duration: '2m', target: 200 },  // Ramp to 200
    { duration: '5m', target: 200 },  // Stay at 200
    { duration: '2m', target: 0 },    // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% under 500ms
    http_req_failed: ['rate<0.01'],    // <1% errors
  },
};

export default function () {
  // Browse products
  let res = http.get('https://api.example.com/products');
  check(res, {
    'products loaded': (r) => r.status === 200,
    'response < 500ms': (r) => r.timings.duration < 500,
  }) || errorRate.add(1);
  
  sleep(1); // Think time
  
  // View product detail
  res = http.get('https://api.example.com/products/123');
  check(res, {
    'product detail loaded': (r) => r.status === 200,
  }) || errorRate.add(1);
  
  sleep(2);
  
  // Add to cart
  res = http.post('https://api.example.com/cart', JSON.stringify({
    product_id: 123,
    quantity: 1
  }), {
    headers: { 'Content-Type': 'application/json' },
  });
  check(res, {
    'added to cart': (r) => r.status === 201,
  }) || errorRate.add(1);
  
  sleep(1);
}

Run Test:

k6 run --out json=results.json load-test.js

Results Analysis:

scenarios: (100.00%) 1 scenario, 200 max VUs, 18m0s max duration

     ✓ products loaded
     ✓ response < 500ms
     ✓ product detail loaded
     ✓ added to cart

     checks.........................: 98.50% ✓ 19700  ✗ 300
     data_received..................: 45 MB  2.8 MB/s
     data_sent......................: 8.5 MB 530 kB/s
     http_req_blocked...............: avg=1.2ms    min=0s     med=0s      max=150ms   p(95)=5ms    p(99)=25ms
     http_req_duration..............: avg=245ms    min=45ms   med=180ms   max=2.5s    p(95)=480ms  p(99)=850ms
     http_req_failed................: 1.50%  ✓ 300   ✗ 19700
     http_reqs......................: 20000  1250/s
     iteration_duration.............: avg=4.8s     min=4s     med=4.5s    max=8s
     iterations.....................: 5000   312.5/s
     vus............................: 200    min=0    max=200
     vus_max........................: 200    min=200  max=200

Analysis:

✅ p95 response time: 480ms (meets <500ms threshold)
⚠️ p99 response time: 850ms (exceeds threshold)
⚠️ Error rate: 1.5% (exceeds <1% threshold)
Throughput: 1250 requests/second
System starts degrading above 200 concurrent users

Bottleneck Investigation:

Errors occur during "add to cart" operation
Database connection pool exhausted at peak load
CPU usage spikes to 95% at 200 users

Recommendations:

Increase database connection pool size
Add caching for product catalog
Optimize cart operations
Consider horizontal scaling

Best Practices

✅ Test in production-like environment
✅ Use realistic user scenarios, not just endpoint hammering
✅ Ramp up gradually (don't spike immediately)
✅ Monitor system metrics during tests (CPU, memory, DB connections)
✅ Test at 2-3x expected peak load
✅ Include think time between requests
✅ Vary request payloads and parameters
✅ Run endurance tests (24+ hours for stability)
❌ Avoid: Testing from same network as application
❌ Avoid: Ignoring failed requests in results
❌ Avoid: Testing only happy paths
❌ Avoid: Testing with empty databases