Phase 9.2: Implement Load Testing with k6 and Performance Baselining #112

@artcava

Description

📋 Task Description

Implement comprehensive load testing using k6 to measure system performance under various load conditions. Create test scripts for different scenarios, establish performance baselines, identify bottlenecks, and document performance characteristics. Test with various policy configurations to understand their impact on throughput and latency.

🎯 Objectives

  • Install and configure k6 load testing tool
  • Create k6 test scripts for API endpoints
  • Implement load test scenarios (smoke, load, stress, spike)
  • Test with various policy configurations
  • Establish performance baselines
  • Measure throughput (requests/second)
  • Measure latency (p50, p95, p99)
  • Identify system bottlenecks
  • Test database performance under load
  • Test message broker performance under load
  • Test resilience policy overhead
  • Document performance characteristics
  • Create performance monitoring dashboards

📦 Deliverables

### 1. Install k6

Create `tests/LoadTests/README.md`:

# Load Testing with k6

## Installation

### macOS
```bash
brew install k6
```

### Linux

```bash
sudo gpg -k
sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
sudo apt-get update
sudo apt-get install k6
```

### Windows

```powershell
choco install k6
```

### Docker

```bash
docker pull grafana/k6:latest
```

## Running Tests

```bash
# Run smoke test
k6 run tests/LoadTests/smoke-test.js

# Run load test
k6 run tests/LoadTests/load-test.js

# Run stress test
k6 run tests/LoadTests/stress-test.js

# Run with custom duration
k6 run --duration 5m tests/LoadTests/load-test.js

# Run with custom VUs
k6 run --vus 50 tests/LoadTests/load-test.js

# Output results to file
k6 run --out json=results.json tests/LoadTests/load-test.js
```

### 2. Create Base Test Configuration

Create `tests/LoadTests/config.js`:

```javascript
// Base configuration for k6 tests
export const BASE_URL = __ENV.BASE_URL || 'http://localhost:5000';

export const THRESHOLDS = {
  // HTTP response time thresholds
  http_req_duration: ['p(95)<500', 'p(99)<1000'],
  
  // HTTP request rate
  http_reqs: ['rate>10'],
  
  // HTTP failure rate
  http_req_failed: ['rate<0.01'], // Less than 1% failure
  
  // Checks success rate
  checks: ['rate>0.95'], // 95% success rate
};

export const SCENARIOS = {
  smoke: {
    executor: 'constant-vus',
    vus: 1,
    duration: '30s',
  },
  load: {
    executor: 'ramping-vus',
    startVUs: 0,
    stages: [
      { duration: '1m', target: 10 },
      { duration: '3m', target: 10 },
      { duration: '1m', target: 0 },
    ],
    gracefulRampDown: '30s',
  },
  stress: {
    executor: 'ramping-vus',
    startVUs: 0,
    stages: [
      { duration: '2m', target: 10 },
      { duration: '2m', target: 50 },
      { duration: '2m', target: 100 },
      { duration: '2m', target: 0 },
    ],
    gracefulRampDown: '30s',
  },
  spike: {
    executor: 'ramping-vus',
    startVUs: 0,
    stages: [
      { duration: '10s', target: 5 },
      { duration: '10s', target: 100 }, // Spike
      { duration: '3m', target: 100 },
      { duration: '10s', target: 5 },
      { duration: '10s', target: 0 },
    ],
    gracefulRampDown: '30s',
  },
};
```

### 3. Create Process Creation Load Test

Create `tests/LoadTests/process-creation-test.js`:

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Counter, Trend } from 'k6/metrics';
import { BASE_URL, THRESHOLDS, SCENARIOS } from './config.js';

// Custom metrics
const processCreationErrors = new Counter('process_creation_errors');
const processCreationDuration = new Trend('process_creation_duration');

export const options = {
  scenarios: {
    load_test: SCENARIOS.load,
  },
  thresholds: THRESHOLDS,
};

let processCounter = 0;

export default function () {
  const clientId = `load-test-client-${__VU}`;
  const processType = 'order';
  const clientProcessId = `process-${__ITER}-${Date.now()}-${processCounter++}`;

  const payload = JSON.stringify({
    clientId: clientId,
    processType: processType,
    clientProcessId: clientProcessId,
    metadata: {
      orderId: `order-${__ITER}`,
      customerId: `customer-${__VU}`,
      amount: '100.00',
    },
  });

  const params = {
    headers: {
      'Content-Type': 'application/json',
    },
    tags: {
      name: 'CreateProcess',
    },
  };

  const startTime = Date.now();
  const response = http.post(`${BASE_URL}/api/processes`, payload, params);
  const duration = Date.now() - startTime;

  processCreationDuration.add(duration);

  const checkResult = check(response, {
    'status is 201': (r) => r.status === 201,
    'has location header': (r) => r.headers['Location'] !== undefined,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });

  if (!checkResult) {
    processCreationErrors.add(1);
    console.error(`Failed to create process: ${response.status} - ${response.body}`);
  }

  // Extract processId from location header
  if (response.status === 201) {
    const location = response.headers['Location'];
    const processId = location.split('/').pop();
    
    // Verify process was created
    sleep(0.5); // Wait for async processing
    
    const getResponse = http.get(`${BASE_URL}/api/processes/${processId}`);
    check(getResponse, {
      'process exists': (r) => r.status === 200,
      'process has valid status': (r) => {
        const process = JSON.parse(r.body);
        return ['Pending', 'Processing', 'Completed'].includes(process.status);
      },
    });
  }

  sleep(1); // Think time
}

export function handleSummary(data) {
  return {
    'results/process-creation-summary.json': JSON.stringify(data),
    'results/process-creation-summary.html': htmlReport(data),
  };
}

function htmlReport(data) {
  const metrics = data.metrics;
  return `
<!DOCTYPE html>
<html>
<head>
  <title>Process Creation Load Test Results</title>
  <style>
    body { font-family: Arial, sans-serif; margin: 20px; }
    .metric { margin: 10px 0; padding: 10px; background: #f0f0f0; }
    .success { color: green; }
    .failure { color: red; }
  </style>
</head>
<body>
  <h1>Process Creation Load Test Results</h1>
  <div class="metric">
    <h3>Requests</h3>
    <p>Total: ${metrics.http_reqs.values.count}</p>
    <p>Rate: ${metrics.http_reqs.values.rate.toFixed(2)} req/s</p>
  </div>
  <div class="metric">
    <h3>Response Time</h3>
    <p>Avg: ${metrics.http_req_duration.values.avg.toFixed(2)} ms</p>
    <p>P95: ${metrics.http_req_duration.values['p(95)'].toFixed(2)} ms</p>
    <p>P99: ${metrics.http_req_duration.values['p(99)'].toFixed(2)} ms</p>
  </div>
  <div class="metric">
    <h3>Success Rate</h3>
    <p class="${metrics.checks.values.rate > 0.95 ? 'success' : 'failure'}">
      ${(metrics.checks.values.rate * 100).toFixed(2)}%
    </p>
  </div>
</body>
</html>
  `;
}
```

### 4. Create Smoke Test

Create `tests/LoadTests/smoke-test.js`:

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';
import { BASE_URL } from './config.js';

export const options = {
  vus: 1,
  duration: '30s',
  thresholds: {
    http_req_duration: ['p(99)<1000'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  // Test health endpoint
  let response = http.get(`${BASE_URL}/health`);
  check(response, {
    'health check is 200': (r) => r.status === 200,
  });

  sleep(1);

  // Test process creation
  const payload = JSON.stringify({
    clientId: 'smoke-test-client',
    processType: 'order',
    clientProcessId: `smoke-${Date.now()}`,
    metadata: {
      orderId: 'smoke-order-1',
      customerId: 'smoke-customer-1',
      amount: '50.00',
    },
  });

  response = http.post(
    `${BASE_URL}/api/processes`,
    payload,
    { headers: { 'Content-Type': 'application/json' } }
  );

  check(response, {
    'process creation is 201': (r) => r.status === 201,
  });

  sleep(1);
}
```

### 5. Create Stress Test

Create `tests/LoadTests/stress-test.js`:

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';
import { BASE_URL, SCENARIOS } from './config.js';

export const options = {
  scenarios: {
    stress: SCENARIOS.stress,
  },
  thresholds: {
    http_req_duration: ['p(95)<1000', 'p(99)<2000'],
    http_req_failed: ['rate<0.05'], // Allow 5% failure under stress
  },
};

export default function () {
  const payload = JSON.stringify({
    clientId: `stress-client-${__VU}`,
    processType: 'order',
    clientProcessId: `stress-${__VU}-${__ITER}-${Date.now()}`,
    metadata: {
      orderId: `order-${__ITER}`,
      customerId: `customer-${__VU}`,
      amount: '75.00',
    },
  });

  const response = http.post(
    `${BASE_URL}/api/processes`,
    payload,
    { headers: { 'Content-Type': 'application/json' } }
  );

  check(response, {
    'status is 2xx or 429': (r) => (r.status >= 200 && r.status < 300) || r.status === 429,
  });

  sleep(0.5); // Shorter think time for stress
}
```

### 6. Create Policy Configuration Test

Create `tests/LoadTests/policy-configuration-test.js`:

```javascript
import http from 'k6/http';
import { check, group } from 'k6';
import { BASE_URL } from './config.js';

export const options = {
  scenarios: {
    low_retries: {
      executor: 'constant-vus',
      vus: 10,
      duration: '1m',
      tags: { policy: 'low_retries' },
      exec: 'testLowRetries',
    },
    high_retries: {
      executor: 'constant-vus',
      vus: 10,
      duration: '1m',
      startTime: '1m',
      tags: { policy: 'high_retries' },
      exec: 'testHighRetries',
    },
    short_timeout: {
      executor: 'constant-vus',
      vus: 10,
      duration: '1m',
      startTime: '2m',
      tags: { policy: 'short_timeout' },
      exec: 'testShortTimeout',
    },
  },
};

export function testLowRetries() {
  group('Low Retries Policy', () => {
    const payload = JSON.stringify({
      clientId: 'low-retry-client',
      processType: 'order',
      clientProcessId: `low-${__VU}-${__ITER}`,
      metadata: { orderId: `order-${__ITER}` },
    });

    const response = http.post(`${BASE_URL}/api/processes`, payload, {
      headers: { 'Content-Type': 'application/json' },
    });

    check(response, {
      'created with low retries': (r) => r.status === 201,
    });
  });
}

export function testHighRetries() {
  group('High Retries Policy', () => {
    const payload = JSON.stringify({
      clientId: 'high-retry-client',
      processType: 'order',
      clientProcessId: `high-${__VU}-${__ITER}`,
      metadata: { orderId: `order-${__ITER}` },
    });

    const response = http.post(`${BASE_URL}/api/processes`, payload, {
      headers: { 'Content-Type': 'application/json' },
    });

    check(response, {
      'created with high retries': (r) => r.status === 201,
    });
  });
}

export function testShortTimeout() {
  group('Short Timeout Policy', () => {
    const payload = JSON.stringify({
      clientId: 'short-timeout-client',
      processType: 'order',
      clientProcessId: `timeout-${__VU}-${__ITER}`,
      metadata: { orderId: `order-${__ITER}` },
    });

    const response = http.post(`${BASE_URL}/api/processes`, payload, {
      headers: { 'Content-Type': 'application/json' },
    });

    check(response, {
      'created with short timeout': (r) => r.status === 201,
    });
  });
}
```

### 7. Create Performance Analysis Script

Create `tests/LoadTests/analyze-results.js` (this uses ES module syntax, so run it with a recent Node and `"type": "module"` in package.json, or rename it to `.mjs`):

```javascript
import { readFileSync } from 'fs';

const resultsFile = process.argv[2];
if (!resultsFile) {
  console.error('Usage: node analyze-results.js <results-file.json>');
  process.exit(1);
}

const data = JSON.parse(readFileSync(resultsFile, 'utf8'));
const metrics = data.metrics;

console.log('\n=== Performance Analysis ===\n');

// Throughput
const throughput = metrics.http_reqs.values.rate;
console.log(`Throughput: ${throughput.toFixed(2)} req/s`);

// Latency
const latency = metrics.http_req_duration.values;
console.log(`\nLatency:`);
console.log(`  Average: ${latency.avg.toFixed(2)} ms`);
console.log(`  P50: ${latency['p(50)'].toFixed(2)} ms`);
console.log(`  P95: ${latency['p(95)'].toFixed(2)} ms`);
console.log(`  P99: ${latency['p(99)'].toFixed(2)} ms`);
console.log(`  Max: ${latency.max.toFixed(2)} ms`);

// Success rate
const successRate = metrics.checks.values.rate * 100;
console.log(`\nSuccess Rate: ${successRate.toFixed(2)}%`);

// Failures
const failureRate = metrics.http_req_failed.values.rate * 100;
console.log(`Failure Rate: ${failureRate.toFixed(2)}%`);

// Thresholds
console.log('\n=== Threshold Evaluation ===\n');
for (const [name, threshold] of Object.entries(data.thresholds || {})) {
  const passed = threshold.ok ? '✓ PASS' : '✗ FAIL';
  console.log(`${passed} ${name}`);
}

// Recommendations
console.log('\n=== Recommendations ===\n');

if (latency['p(95)'] > 500) {
  console.log('⚠️  P95 latency exceeds 500ms - consider optimization');
}

if (throughput < 10) {
  console.log('⚠️  Throughput below 10 req/s - check infrastructure capacity');
}

if (failureRate > 1) {
  console.log('⚠️  Failure rate above 1% - investigate errors');
}

if (successRate > 99 && latency['p(95)'] < 300 && throughput > 50) {
  console.log('✅ Excellent performance - system is well optimized');
}

console.log('\n');
```

### 8. Create Performance Documentation

Create `docs/PERFORMANCE-CHARACTERISTICS.md`:

# Performance Characteristics - StarGate

## Baseline Performance

### Test Environment
- **Infrastructure:** Docker Compose (local)
- **MongoDB:** 7.0
- **Redis:** 7.2
- **RabbitMQ:** 3.13
- **Hardware:** Development machine

### Process Creation

| Metric | Value | Target |
|--------|-------|--------|
| Throughput | 50 req/s | >10 req/s |
| P50 Latency | 120 ms | <200 ms |
| P95 Latency | 350 ms | <500 ms |
| P99 Latency | 800 ms | <1000 ms |
| Success Rate | 99.5% | >99% |

### Process Query

| Metric | Value | Target |
|--------|-------|--------|
| Throughput | 200 req/s | >50 req/s |
| P50 Latency | 45 ms | <100 ms |
| P95 Latency | 120 ms | <200 ms |
| P99 Latency | 250 ms | <500 ms |
| Success Rate | 99.9% | >99% |

## Load Test Scenarios

### Smoke Test
- **Duration:** 30 seconds
- **Virtual Users:** 1
- **Purpose:** Verify basic functionality

### Load Test
- **Duration:** 5 minutes
- **Virtual Users:** Ramp 0 → 10 → 10 → 0
- **Purpose:** Establish baseline performance

### Stress Test
- **Duration:** 8 minutes
- **Virtual Users:** Ramp 0 → 10 → 50 → 100 → 0
- **Purpose:** Find breaking point

### Spike Test
- **Duration:** 4 minutes
- **Virtual Users:** Spike to 100
- **Purpose:** Test recovery from sudden load

## Policy Impact on Performance

### Retry Policy
- **No Retry:** Baseline
- **3 Retries:** +2-5% latency
- **5 Retries:** +5-10% latency

### Circuit Breaker
- **Closed:** +1% latency
- **Open:** -95% latency (fail fast)

### Timeout Policy
- **30s:** Minimal impact
- **10s:** +0.5% latency
- **5s:** +1% latency

## Bottlenecks

### Identified
1. **MongoDB Write Latency:** ~50ms per insert
2. **RabbitMQ Publish:** ~10ms per message
3. **Handler Execution:** ~500ms average

### Mitigation
1. Batch database operations
2. Connection pooling
3. Async processing
4. Caching frequent queries
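
The batching mitigation can be sketched as a small write buffer that trades a little latency for far fewer database round trips. This is an illustrative pattern, not code from the repository; `BatchWriter` and `flushFn` are invented names:

```javascript
// Hypothetical batching sketch: collect writes and flush them in groups.
class BatchWriter {
  constructor(flushFn, batchSize = 100) {
    this.flushFn = flushFn;     // e.g. (docs) => collection.insertMany(docs)
    this.batchSize = batchSize;
    this.buffer = [];
  }

  add(doc) {
    this.buffer.push(doc);
    if (this.buffer.length >= this.batchSize) this.flush();
  }

  flush() {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.flushFn(batch);        // one round trip per batch instead of per doc
  }
}

// Usage with a stand-in flush function:
const batches = [];
const writer = new BatchWriter((docs) => batches.push(docs), 3);
[1, 2, 3, 4, 5].forEach((n) => writer.add({ n }));
writer.flush(); // drain the remainder
console.log(batches.length); // → 2 (one batch of 3, one of 2)
```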

## Scalability

### Horizontal Scaling
- **API Servers:** Linear scaling up to 10 instances
- **Workers:** Linear scaling up to 20 instances
- **Database:** Replica set for read scaling

### Vertical Scaling
- **CPU:** Moderate impact (10-20% improvement)
- **Memory:** Minimal impact (caching only)
- **Disk:** Significant for MongoDB (30-50% improvement with SSD)

## Monitoring

### Key Metrics
- Request rate (req/s)
- Response time (p50, p95, p99)
- Error rate (%)
- Circuit breaker states
- Queue depth
- Database connections

### Alerting Thresholds
- P95 latency >1000ms
- Error rate >5%
- Queue depth >1000
- Circuit breaker open >1 minute

## Optimization Recommendations

1. **Enable MongoDB indexes** on query fields
2. **Increase connection pool sizes** for high load
3. **Implement caching** for policy lookups
4. **Use batch operations** where possible
5. **Monitor and tune GC** settings

✅ Acceptance Criteria

  • k6 installed and configured
  • Smoke test implemented
  • Load test implemented
  • Stress test implemented
  • Spike test implemented
  • Policy configuration tests implemented
  • Performance baselines established
  • Throughput measured and documented
  • Latency (p50, p95, p99) measured
  • Bottlenecks identified
  • Performance analysis script created
  • Performance characteristics documented
  • Optimization recommendations documented
  • Code follows CODING-CONVENTIONS.md

📝 Testing Instructions

```bash
# Start infrastructure
docker-compose up -d

# Wait for services to be ready
sleep 10

# Run smoke test
k6 run tests/LoadTests/smoke-test.js

# Run load test with results
k6 run --out json=results/load-test.json tests/LoadTests/load-test.js

# Analyze results
node tests/LoadTests/analyze-results.js results/load-test.json

# Run stress test
k6 run tests/LoadTests/stress-test.js

# Run policy configuration test
k6 run tests/LoadTests/policy-configuration-test.js

# Run with custom parameters
k6 run --vus 20 --duration 3m tests/LoadTests/process-creation-test.js

# Generate HTML report (written by handleSummary in process-creation-test.js)
k6 run tests/LoadTests/process-creation-test.js
open results/process-creation-summary.html

# Monitor during test
watch -n 1 'docker stats --no-stream'

# Check MongoDB performance
docker exec stargate-mongodb mongosh --eval "db.serverStatus().metrics"

# Check RabbitMQ queue depth
curl -u guest:guest http://localhost:15672/api/queues
```

🏷️ Labels

phase-9 testing sprint-9.2 load-testing performance k6

⏱️ Estimated Effort

10-14 hours

🔗 Related Issues

Part of Phase 9: Testing & Quality - Sprint 9.2: Load Testing

📌 Important Notes

Load Test Types

Smoke Test:

  • Minimal load
  • Verify basic functionality
  • Quick feedback
  • Run before every deployment

Load Test:

  • Expected production load
  • Establish baseline
  • Measure typical performance
  • Identify normal behavior

Stress Test:

  • Beyond expected load
  • Find breaking point
  • Test resilience
  • Identify limits

Spike Test:

  • Sudden load increase
  • Test auto-scaling
  • Verify recovery
  • Real-world scenario

Performance Metrics

Throughput:

  • Requests per second
  • Higher is better
  • Indicates capacity

Latency:

  • Response time
  • Lower is better
  • Use percentiles (p50, p95, p99)
  • Don't rely on averages

Error Rate:

  • Failed requests %
  • Should be <1%
  • Monitor closely

k6 Virtual Users (VUs)

What is a VU:

  • Simulated user
  • Runs script repeatedly
  • Independent from others
  • Has own context

VU Scaling:

  • Start low (1-5 VUs)
  • Gradually increase
  • Find saturation point
  • Back off to stable load

Threshold Configuration

Conservative (Development):

```javascript
thresholds: {
  http_req_duration: ['p(95)<1000'],
  http_req_failed: ['rate<0.05'],
}
```

Aggressive (Production):

```javascript
thresholds: {
  http_req_duration: ['p(95)<300', 'p(99)<500'],
  http_req_failed: ['rate<0.01'],
}
```

Think Time

Why:

  • Simulates real user behavior
  • Prevents unrealistic load
  • Allows system to breathe

Typical Values:

  • API: 0.5-2 seconds
  • Web browsing: 2-5 seconds
  • Heavy computation: 5-10 seconds
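
Rather than a fixed `sleep(1)`, think time is often jittered within these ranges so virtual users don't fire in lockstep. A minimal helper (our naming; in a k6 script you would pass the result to k6's `sleep()`):

```javascript
// Hypothetical jitter helper: pick a think time uniformly in [min, max) seconds.
// In a k6 script: sleep(thinkTime(0.5, 2));
function thinkTime(min, max) {
  return min + Math.random() * (max - min);
}

const t = thinkTime(0.5, 2);
console.log(t >= 0.5 && t < 2); // → true
```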

Ramp-Up Strategy

Gradual:

```javascript
stages: [
  { duration: '1m', target: 10 },
  { duration: '3m', target: 10 },
  { duration: '1m', target: 0 },
]
```

  • Safe approach
  • Observe behavior at each level
  • Identify issues early

Aggressive:

```javascript
stages: [
  { duration: '10s', target: 100 },
  { duration: '1m', target: 100 },
  { duration: '10s', target: 0 },
]
```

  • Stress test
  • Find breaking point quickly
  • Test recovery

Resource Monitoring

During Load Tests:

```bash
# CPU and Memory
docker stats

# Disk I/O
iostat -x 1

# Network
iftop

# MongoDB
mongotop 1

# Process list
top -o %CPU
```

Bottleneck Identification

Database:

  • High latency
  • Connection pool exhaustion
  • Slow queries
  • Solution: Indexes, caching, read replicas

Message Broker:

  • Queue depth growing
  • Consumer lag
  • Connection timeouts
  • Solution: More consumers, larger prefetch

API:

  • High CPU usage
  • Thread pool exhaustion
  • GC pressure
  • Solution: Horizontal scaling, optimization

Performance Baseline

Purpose:

  • Reference point
  • Track improvements/regressions
  • Set realistic targets
  • Guide optimization

Update When:

  • Major code changes
  • Infrastructure changes
  • Policy configuration changes
  • After optimizations
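
Tracking regressions against the stored baseline can be automated; here is a sketch (the `findRegressions` name and metric keys are illustrative; the convention is latency-style metrics where higher means worse):

```javascript
// Hypothetical regression check: flag metrics that degraded beyond a tolerance.
function findRegressions(baseline, current, tolerancePct = 10) {
  const regressions = [];
  for (const [metric, base] of Object.entries(baseline)) {
    const cur = current[metric];
    if (cur === undefined) continue;
    const deltaPct = ((cur - base) / base) * 100;
    if (deltaPct > tolerancePct) {
      regressions.push({ metric, base, cur, deltaPct: Number(deltaPct.toFixed(1)) });
    }
  }
  return regressions;
}

// Latency metrics in ms: p95 degraded by 20%, the rest are within tolerance.
const baseline = { p50: 120, p95: 350, p99: 800 };
const current  = { p50: 125, p95: 420, p99: 810 };
console.log(findRegressions(baseline, current));
// → [ { metric: 'p95', base: 350, cur: 420, deltaPct: 20 } ]
```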

CI/CD Integration

Automated Load Tests:

```yaml
- name: Run Load Tests
  run: |
    docker-compose up -d
    sleep 10
    k6 run tests/LoadTests/smoke-test.js
    k6 run --duration 2m tests/LoadTests/load-test.js
```

Performance Gates:

  • P95 < 500ms
  • Success rate > 99%
  • Throughput > baseline
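
These gates can be enforced by evaluating the run's summary and failing the CI job. An illustrative sketch: `evaluateGates` is not an existing script, and the metric shape assumes the k6 summary structure used in `analyze-results.js` above:

```javascript
// Hypothetical CI gate: return the list of gates (from the bullets above) that failed.
function evaluateGates(metrics, baselineThroughput) {
  const failures = [];
  if (metrics.http_req_duration.values['p(95)'] >= 500) failures.push('p95 >= 500ms');
  if (metrics.checks.values.rate <= 0.99) failures.push('success rate <= 99%');
  if (metrics.http_reqs.values.rate <= baselineThroughput) failures.push('throughput <= baseline');
  return failures;
}

const metrics = {
  http_req_duration: { values: { 'p(95)': 350 } },
  checks: { values: { rate: 0.995 } },
  http_reqs: { values: { rate: 55 } },
};
const failures = evaluateGates(metrics, 50);
console.log(failures.length === 0 ? 'gates passed' : failures.join(', ')); // → "gates passed"
// In CI: if (failures.length > 0) process.exit(1);
```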

Results Interpretation

Good Performance:

  • Low latency (p95 < 500ms)
  • High throughput (>50 req/s)
  • Low error rate (<1%)
  • Consistent across test

Poor Performance:

  • High latency (p95 > 1000ms)
  • Low throughput (<10 req/s)
  • High error rate (>5%)
  • Degradation over time

Optimization Cycle

```
1. Measure baseline
   ↓
2. Identify bottleneck
   ↓
3. Implement fix
   ↓
4. Measure again
   ↓
5. Compare results
   ↓
6. Repeat
```
