Performance Testing
Performance Testing Types
Load Testing
Purpose: Verify system performance under expected load
- Simulates expected user traffic and data volume
- Identifies performance bottlenecks under normal conditions
- Establishes performance baselines
- Validates SLA compliance
Key Metrics:
- Response time (average, median, p95, p99)
- Throughput (requests per second, transactions per second)
- Error rate
- Resource utilization (CPU, memory, disk, network)
Stress Testing
Purpose: Identify system breaking points
- Exceeds expected load to find limits
- Tests system recovery after failure
- Identifies failure modes and error handling
- Validates graceful degradation
Key Metrics:
- Maximum concurrent users before failure
- Maximum throughput before failure
- Time to recover after load reduction
- Error patterns and failure modes
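As a rough illustration of "maximum concurrent users before failure", the sketch below scans the results of a stepped stress test and reports the last step that stayed within limits. The tuple layout, the function name `max_sustainable_users`, and the default limits (1% errors, 1000 ms p95) are assumptions made for this example, not values from any tool.

```python
def max_sustainable_users(steps, max_error_pct=1.0, max_p95_ms=1000.0):
    """Return the highest user count that stayed within both limits, or None.

    steps: iterable of (users, error_rate_pct, p95_ms), one tuple per load step.
    The limits are illustrative defaults; pick values that match your own SLAs.
    """
    last_ok = None
    for users, error_pct, p95_ms in sorted(steps):
        if error_pct > max_error_pct or p95_ms > max_p95_ms:
            break  # first failing step: treat everything above it as past the limit
        last_ok = users
    return last_ok


if __name__ == "__main__":
    # Hypothetical results from four load steps.
    results = [(100, 0.0, 180.0), (200, 0.1, 240.0), (400, 0.4, 610.0), (800, 7.5, 2900.0)]
    print(max_sustainable_users(results))
```

Stopping at the first failing step, rather than taking the largest passing step anywhere in the data, avoids reporting a limit that was only met by chance after the system had already degraded.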
Spike Testing
Purpose: Verify the system handles sudden traffic increases
- Simulates sudden traffic spikes (e.g., flash sales, viral content)
- Tests system elasticity and auto-scaling
- Validates queuing and throttling mechanisms
- Identifies race conditions under load
Key Metrics:
- Response time during spike
- Error rate during spike
- Time to stabilize after spike
- Queue depth and processing time
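"Time to stabilize after spike" can be computed from a latency time series once the end of the spike is known. The sketch below is one possible definition, assuming samples of `(seconds, p95_ms)` and treating the system as stable when a run of consecutive samples is back under a threshold. The function name, the threshold, and the window size are all hypothetical choices.

```python
def time_to_stabilize(samples, spike_end_s, threshold_ms, window=3):
    """Seconds from the end of the spike until latency stays at or below threshold_ms.

    samples: list of (elapsed_seconds, p95_ms) in time order.
    window:  number of consecutive in-threshold samples required (an assumption).
    Returns None if the series never stabilizes within the recorded data.
    """
    after = [(t, v) for t, v in samples if t >= spike_end_s]
    for i in range(len(after) - window + 1):
        if all(v <= threshold_ms for _, v in after[i:i + window]):
            return after[i][0] - spike_end_s
    return None


if __name__ == "__main__":
    series = [(0, 200), (10, 900), (20, 800), (30, 600), (40, 300), (50, 250), (60, 240)]
    print(time_to_stabilize(series, spike_end_s=20, threshold_ms=300))
```

Requiring several consecutive good samples keeps a single lucky measurement from being reported as recovery.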
Soak Testing
Purpose: Verify stability over extended periods
- Runs sustained load for hours or days
- Identifies memory leaks and resource exhaustion
- Tests database connection pool stability
- Validates garbage collection efficiency
Key Metrics:
- Memory usage over time
- Response time trends
- Error rate over time
- Resource utilization trends
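One simple way to read "memory usage over time" is to fit a straight line through the samples and look at the slope. The sketch below uses an ordinary least-squares fit. The sample format `(elapsed_hours, memory_mb)` and the 1 MB/hour threshold are assumptions for illustration, and a positive slope is only a hint of a leak, not proof (caches warming up produce the same shape early in a run).

```python
def growth_per_hour(samples):
    """Least-squares slope of memory (MB) against elapsed time (hours)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    variance = sum((x - mean_x) ** 2 for x, _ in samples)
    if variance == 0:
        raise ValueError("all samples share the same timestamp")
    return covariance / variance


def looks_like_leak(samples, max_growth_mb_per_hour=1.0):
    """Flag a run whose memory trend exceeds the (illustrative) growth threshold."""
    return growth_per_hour(samples) > max_growth_mb_per_hour


if __name__ == "__main__":
    run = [(0, 500.0), (1, 510.0), (2, 520.0), (3, 530.0)]
    print(growth_per_hour(run), looks_like_leak(run))
```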
Volume Testing
Purpose: Verify performance with large data volumes
- Tests performance with realistic data sizes
- Identifies database query performance issues
- Tests file system and storage performance
- Validates data migration performance
Key Metrics:
- Query execution time with large datasets
- Index usage and effectiveness
- Storage I/O performance
- Data processing throughput
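"Index usage and effectiveness" can be checked by asking the database for its query plan before and after adding an index. The sketch below does this with Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN`. The `orders` table, its columns, and the index name are made up for the example, and other databases expose plans through their own commands.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a query as one string (the last column holds the detail text)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(str(row[-1]) for row in rows)


def compare_plans(row_count=50_000):
    """Build a hypothetical orders table and return the plan before and after indexing."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        ((i % 1000, i * 0.5) for i in range(row_count)),
    )
    sql = "SELECT * FROM orders WHERE customer_id = ?"
    before = query_plan(conn, sql, (42,))  # expected to be a full table scan
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    after = query_plan(conn, sql, (42,))   # expected to search via the new index
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = compare_plans()
    print("before:", before)
    print("after: ", after)
```

The exact wording of the plan text varies between SQLite versions, so compare on the presence of a scan versus the index name rather than on the full string.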
Performance Testing Tools
JMeter
Best for: Load and stress testing
- Open source, Java-based
- Supports multiple protocols (HTTP, JDBC, JMS, etc.)
- Distributed testing support
- Extensive plugin ecosystem
- GUI and CLI modes
<?xml version="1.0" encoding="UTF-8"?>
<!-- JMeter Test Plan Example (abridged) -->
<jmeterTestPlan>
  <hashTree>
    <TestPlan guiclass="TestPlanGui">
      <stringProp name="TestPlan.comments">Load Test</stringProp>
    </TestPlan>
    <hashTree>
      <ThreadGroup guiclass="ThreadGroupGui">
        <!-- 100 threads started over 10 seconds -->
        <stringProp name="ThreadGroup.num_threads">100</stringProp>
        <stringProp name="ThreadGroup.ramp_time">10</stringProp>
        <!-- duration is only honored when the scheduler is enabled -->
        <boolProp name="ThreadGroup.scheduler">true</boolProp>
        <stringProp name="ThreadGroup.duration">60</stringProp>
      </ThreadGroup>
      <hashTree>
        <HTTPSamplerProxy guiclass="HttpTestSampleGui">
          <stringProp name="HTTPSampler.domain">example.com</stringProp>
          <stringProp name="HTTPSampler.path">/api/users</stringProp>
        </HTTPSamplerProxy>
      </hashTree>
    </hashTree>
  </hashTree>
</jmeterTestPlan>
Gatling
Best for: High-performance load testing
- Scala-based, DSL for test scenarios
- High performance, low resource usage
- Real-time metrics and reporting
- Good for continuous integration
- Supports HTTP, WebSocket, JMS
// Gatling Example
import scala.concurrent.duration._ // provides the 10.seconds syntax

import io.gatling.core.Predef._
import io.gatling.http.Predef._

class LoadTest extends Simulation {
  val httpProtocol = http.baseUrl("https://example.com")

  // Two requests separated by a one-second pause
  val scn = scenario("User Journey")
    .exec(http("Get Users").get("/api/users"))
    .pause(1)
    .exec(http("Get User").get("/api/users/1"))

  setUp(
    scn.inject(
      rampUsers(100).during(10.seconds),         // start 100 users over 10 s
      constantUsersPerSec(50).during(60.seconds) // then 50 new users per second for 60 s
    )
  ).protocols(httpProtocol)
}
k6
Best for: Developer-friendly performance testing
- JavaScript-based, easy to learn
- Modern CLI and cloud integration
- Good for CI/CD pipelines
- Supports HTTP/1.1, HTTP/2, WebSocket
- Grafana integration for visualization
// k6 Example
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '10s', target: 100 }, // ramp up to 100 virtual users
    { duration: '60s', target: 100 }, // hold at 100
    { duration: '10s', target: 0 },   // ramp down
  ],
};

export default function () {
  const res = http.get('https://example.com/api/users');
  check(res, {
    'status was 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}
Locust
Best for: Python-based load testing
- Python-based, easy to write tests
- Web UI for real-time monitoring
- Distributed testing support
- Good for complex user scenarios
- Event-based architecture
# Locust Example
from locust import HttpUser, task, between


class WebsiteUser(HttpUser):
    # Each simulated user waits 1 to 3 seconds between tasks
    wait_time = between(1, 3)

    @task
    def get_users(self):
        self.client.get("/api/users")

    # Weight 2: picked twice as often as get_users
    @task(2)
    def get_user(self):
        self.client.get("/api/users/1")
Key Performance Metrics
Response Time
- Average: Mean response time across all requests
- Median: Middle value, less affected by outliers
- p95: 95th percentile, 95% of requests complete within this time
- p99: 99th percentile, 99% of requests complete within this time
- Min/Max: Fastest and slowest response times
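To make the difference between these statistics concrete, here is a small sketch that summarizes a list of response times. It uses the nearest-rank definition of a percentile, which is one of several common definitions, so load testing tools may report slightly different values for the same data. The function names are illustrative.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples):
    """Summarize response times (any consistent unit, e.g. milliseconds)."""
    return {
        "avg": statistics.mean(samples),
        "median": statistics.median(samples),
        "p95": percentile(samples, 95),
        "p99": percentile(samples, 99),
        "min": min(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    # 98 fast requests, one slow one, and one extreme outlier.
    times_ms = [100] * 98 + [200, 5000]
    print(summarize(times_ms))
```

In this sample the single 5000 ms outlier pulls the average up to 150 ms while the median stays at 100 ms, which is why percentiles are usually reported alongside the mean.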
Throughput
- Requests Per Second (RPS): Number of requests handled per second
- Transactions Per Second (TPS): Number of business transactions per second
- Concurrent Users: Number of simultaneous users
- Hits Per Second: Number of HTTP requests per second
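Throughput and concurrent users are linked by Little's Law: in a closed system, users = throughput × (response time + think time). The sketch below applies that relation to size a test. It assumes a steady state where every user repeats a request-then-wait loop, and the function names are illustrative.

```python
import math


def required_users(target_rps, avg_response_s, think_time_s):
    """Users needed to sustain target_rps when each user loops: request, then think."""
    cycle_s = avg_response_s + think_time_s
    # round() guards against floating-point noise pushing ceil() up by one
    return math.ceil(round(target_rps * cycle_s, 9))


def expected_rps(users, avg_response_s, think_time_s):
    """Throughput that a fixed number of looping users can generate."""
    return users / (avg_response_s + think_time_s)


if __name__ == "__main__":
    # 100 requests per second, 0.5 s responses, 1.5 s think time
    print(required_users(100, 0.5, 1.5))
    print(expected_rps(200, 0.5, 1.5))
```

If response times grow under load, the same number of users produces less throughput, so a closed-model test can understate the pressure a real traffic surge would apply.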
Error Rate
- HTTP Error Rate: Percentage of HTTP errors (4xx, 5xx)
- Application Error Rate: Percentage of application-level errors
- Timeout Rate: Percentage of requests that timed out
- Connection Error Rate: Percentage of connection failures
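The four rates above can be derived from raw request results by sorting each result into one category. The sketch below assumes a hypothetical result record (a dict with `status`, plus optional `timed_out`, `connection_error`, and `app_error` flags); real tools expose this information under their own field names.

```python
from collections import Counter

CATEGORIES = ("ok", "http_error", "application_error", "timeout", "connection_error")


def classify(result):
    """Map one result record (hypothetical shape) to exactly one category."""
    if result.get("connection_error"):
        return "connection_error"
    if result.get("timed_out"):
        return "timeout"
    if result["status"] >= 400:
        return "http_error"
    if result.get("app_error"):
        return "application_error"  # e.g. a 200 response whose body reports a failure
    return "ok"


def error_rates(results):
    """Return the percentage of results in each category."""
    if not results:
        raise ValueError("results must not be empty")
    counts = Counter(classify(r) for r in results)
    return {name: 100 * counts[name] / len(results) for name in CATEGORIES}


if __name__ == "__main__":
    sample = (
        [{"status": 200}] * 6
        + [{"status": 200, "app_error": True}]
        + [{"status": 500}]
        + [{"timed_out": True}]
        + [{"connection_error": True}]
    )
    print(error_rates(sample))
```

Checking transport failures first matters because a timed-out or refused request has no status code to inspect.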
Resource Utilization
- CPU Usage: Processor utilization percentage
- Memory Usage: RAM consumption and availability
- Disk I/O: Read/write operations and latency
- Network I/O: Bandwidth utilization and latency
- Database Connections: Active and idle connection counts
Performance Profiling
Application Profiling
- CPU Profiling: Identify CPU-intensive methods
- Memory Profiling: Detect memory leaks and allocation patterns
- Thread Profiling: Identify thread contention and deadlocks
- Database Profiling: Analyze query performance and execution plans
Tools
- Java: JProfiler, VisualVM, YourKit
- Node.js: built-in V8 profiler (node --prof), Clinic.js
- Python: cProfile, py-spy
- Go: pprof
- .NET: dotTrace, Visual Studio Profiler
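As a minimal CPU profiling example for the Python entry, the sketch below wraps a call in the standard library's `cProfile` and returns the top entries sorted by cumulative time. The helper `profile_call` and the sample workload `slow_sum` are names invented for this illustration.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=5):
    """Run func(*args) under cProfile and return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def slow_sum(n):
    """Deliberately naive workload to give the profiler something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    value, report = profile_call(slow_sum, 200_000)
    print(value)
    print(report)
```

`cProfile` adds noticeable overhead to every function call, so treat its timings as relative rankings rather than production latencies. A sampling profiler such as py-spy is usually a better fit for a running service.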
Bottleneck Identification
- Database: Slow queries, missing indexes, N+1 queries
- Network: Latency, bandwidth limitations, connection pool exhaustion
- Application: Inefficient algorithms, excessive object creation
- External Services: Third-party API latency, rate limiting
- Caching: Cache misses, stale data, cache stampede
Performance Baselines and SLAs
Establishing Baselines
- Run tests in production-like environment
- Collect metrics over multiple runs
- Account for normal variability
- Document test conditions and data
- Store baselines in version control
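One way to "account for normal variability" is to derive a tolerance from the spread of several baseline runs instead of comparing against a single number. The sketch below flags a regression when a new measurement exceeds the baseline mean by more than three standard deviations or 5%, whichever is larger. Both thresholds and the function names are assumptions chosen for illustration.

```python
import statistics


def allowed_maximum(baseline_runs, sigmas=3.0, min_tolerance_pct=5.0):
    """Upper bound for a metric (e.g. p95 in ms) given several baseline measurements."""
    mean = statistics.mean(baseline_runs)
    spread = statistics.stdev(baseline_runs) if len(baseline_runs) > 1 else 0.0
    # The percentage floor avoids false alarms when the baseline runs happen to be nearly identical.
    return mean + max(sigmas * spread, mean * min_tolerance_pct / 100)


def is_regression(baseline_runs, current, **kwargs):
    """True when the current measurement is worse than the baseline allows."""
    return current > allowed_maximum(baseline_runs, **kwargs)


if __name__ == "__main__":
    baseline_p95_ms = [400, 410, 390, 405, 395]
    print(allowed_maximum(baseline_p95_ms))
    print(is_regression(baseline_p95_ms, 420), is_regression(baseline_p95_ms, 450))
```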
SLA Definitions
- Response Time SLAs: Maximum acceptable response times
- Availability SLAs: Minimum uptime requirements (e.g., 99.9%)
- Throughput SLAs: Minimum requests per second
- Error Rate SLAs: Maximum acceptable error rate
Example SLAs
API Response Times:
- p50 < 200ms
- p95 < 500ms
- p99 < 1000ms
Availability: 99.9% (8.76 hours downtime/year)
Error Rate: < 0.1%
Throughput: 1000 RPS
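The example SLAs above can be turned into an automated pass/fail check. The sketch below encodes the same thresholds and also shows where the downtime figure comes from: 0.1% of the 8760 hours in a non-leap year. The metric key names and the shape of the `measured` dict are assumptions, and availability itself is not checked here because it is measured over months rather than within one test run.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years

# Limits copied from the example SLAs; each measured value must stay below its limit.
UPPER_LIMITS = {"p50_ms": 200, "p95_ms": 500, "p99_ms": 1000, "error_rate_pct": 0.1}
MIN_THROUGHPUT_RPS = 1000


def allowed_downtime_hours(availability_pct):
    """Yearly downtime budget implied by an availability target."""
    return HOURS_PER_YEAR * (100 - availability_pct) / 100


def sla_violations(measured):
    """Return a list of human-readable violations (empty when every SLA is met)."""
    violations = []
    for name, limit in UPPER_LIMITS.items():
        if measured[name] >= limit:
            violations.append(f"{name}: {measured[name]} is not below {limit}")
    if measured["throughput_rps"] < MIN_THROUGHPUT_RPS:
        violations.append(
            f"throughput_rps: {measured['throughput_rps']} is below {MIN_THROUGHPUT_RPS}"
        )
    return violations


if __name__ == "__main__":
    run = {"p50_ms": 150, "p95_ms": 650, "p99_ms": 900, "error_rate_pct": 0.05, "throughput_rps": 800}
    print(round(allowed_downtime_hours(99.9), 2))
    for problem in sla_violations(run):
        print(problem)
```

Returning the list of violations, rather than a single boolean, makes it easier to fail a CI job with a message that names the metric that regressed.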
Cloud-Based Performance Testing
Cloud Testing Benefits
- Scalable infrastructure on demand
- Geographic distribution
- Realistic load simulation
- Pay-as-you-go pricing
- Integration with cloud services
Cloud Testing Platforms
- AWS: EC2, Lambda, Fargate for distributed testing
- Google Cloud: Compute Engine, Cloud Functions
- Azure: Virtual Machines, Azure Functions
- Managed Services: BlazeMeter, LoadRunner Cloud, k6 Cloud
Cloud Testing Best Practices
- Use multiple regions for geographic testing
- Leverage auto-scaling for flexible load
- Monitor cloud costs during testing
- Clean up resources after testing
- Use cloud-native monitoring and logging
Performance Test Planning
Test Scenarios
- Define realistic user journeys
- Identify critical paths
- Include happy path and edge cases
- Account for different user types
- Consider peak and off-peak patterns
Load Models
- Constant Load: Steady user count over time
- Ramp-up Load: Gradually increase users
- Spike Load: Sudden increase in users
- Step Load: Incremental increases with plateaus
- Random Load: Variable user patterns
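The load models above differ only in how the target user count changes over time, which the sketch below expresses as functions of elapsed seconds. It is tool-independent and purely illustrative: the function names and parameters are invented here, and each load testing tool has its own way to declare these shapes.

```python
import random


def constant(users):
    """Steady user count for the whole test."""
    return lambda t: users


def ramp_up(peak, ramp_s):
    """Linear increase from 0 to peak over ramp_s seconds, then hold."""
    return lambda t: min(peak, int(peak * t / ramp_s))


def spike(base, peak, start_s, length_s):
    """Base load with a sudden jump to peak for length_s seconds."""
    return lambda t: peak if start_s <= t < start_s + length_s else base


def step(step_users, step_s, max_users):
    """Add step_users every step_s seconds, holding each plateau, capped at max_users."""
    return lambda t: min(max_users, step_users * (int(t // step_s) + 1))


def random_load(low, high, seed=0):
    """Random user count between low and high; seeded so a run can be repeated."""
    rng = random.Random(seed)
    return lambda t: rng.randint(low, high)


if __name__ == "__main__":
    model = step(step_users=25, step_s=60, max_users=100)
    for second in range(0, 301, 60):
        print(second, model(second))
```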
Test Data
- Use realistic data volumes
- Include edge cases and boundary values
- Account for data distribution
- Refresh data between test runs
- Consider data privacy and security
Environment Setup
- Mirror production configuration
- Use production-like data
- Monitor system resources
- Isolate test environment
- Document environment differences