Phase 9.2: Implement Load Testing with k6 and Performance Baselining #112

@artcava

Description

📋 Task Description

Implement comprehensive load testing using k6 to measure system performance under various load conditions. Create test scripts for different scenarios, establish performance baselines, identify bottlenecks, and document performance characteristics. Test with various policy configurations to understand their impact on throughput and latency.

🎯 Objectives

  • Install and configure k6 load testing tool
  • Create k6 test scripts for API endpoints
  • Implement load test scenarios (smoke, load, stress, spike)
  • Test with various policy configurations
  • Establish performance baselines
  • Measure throughput (requests/second)
  • Measure latency (p50, p95, p99)
  • Identify system bottlenecks
  • Test database performance under load
  • Test message broker performance under load
  • Test resilience policy overhead
  • Document performance characteristics
  • Create performance monitoring dashboards

📦 Deliverables

### 1. Install k6

Create `tests/LoadTests/README.md`:

# Load Testing with k6

## Installation

### macOS
```bash
brew install k6
```

### Linux

```bash
sudo gpg -k
sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
sudo apt-get update
sudo apt-get install k6
```

### Windows

```powershell
choco install k6
```

### Docker

```bash
docker pull grafana/k6:latest
```

## Running Tests

```bash
# Run smoke test
k6 run tests/LoadTests/smoke-test.js

# Run load test
k6 run tests/LoadTests/load-test.js

# Run stress test
k6 run tests/LoadTests/stress-test.js

# Run with custom duration
k6 run --duration 5m tests/LoadTests/load-test.js

# Run with custom VUs
k6 run --vus 50 tests/LoadTests/load-test.js

# Output results to file
k6 run --out json=results.json tests/LoadTests/load-test.js
```

### 2. Create Base Test Configuration

Create `tests/LoadTests/config.js`:

```javascript
// Base configuration for k6 tests
export const BASE_URL = __ENV.BASE_URL || 'http://localhost:5000';

export const THRESHOLDS = {
  // HTTP response time thresholds
  http_req_duration: ['p(95)<500', 'p(99)<1000'],
  
  // HTTP request rate
  http_reqs: ['rate>10'],
  
  // HTTP failure rate
  http_req_failed: ['rate<0.01'], // Less than 1% failure
  
  // Checks success rate
  checks: ['rate>0.95'], // 95% success rate
};

export const SCENARIOS = {
  smoke: {
    executor: 'constant-vus',
    vus: 1,
    duration: '30s',
  },
  load: {
    executor: 'ramping-vus',
    startVUs: 0,
    stages: [
      { duration: '1m', target: 10 },
      { duration: '3m', target: 10 },
      { duration: '1m', target: 0 },
    ],
    gracefulRampDown: '30s',
  },
  stress: {
    executor: 'ramping-vus',
    startVUs: 0,
    stages: [
      { duration: '2m', target: 10 },
      { duration: '2m', target: 50 },
      { duration: '2m', target: 100 },
      { duration: '2m', target: 0 },
    ],
    gracefulRampDown: '30s',
  },
  spike: {
    executor: 'ramping-vus',
    startVUs: 0,
    stages: [
      { duration: '10s', target: 5 },
      { duration: '10s', target: 100 }, // Spike
      { duration: '3m', target: 100 },
      { duration: '10s', target: 5 },
      { duration: '10s', target: 0 },
    ],
    gracefulRampDown: '30s',
  },
};
```

### 3. Create Process Creation Load Test

Create `tests/LoadTests/process-creation-test.js`:

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Counter, Trend } from 'k6/metrics';
import { BASE_URL, THRESHOLDS, SCENARIOS } from './config.js';

// Custom metrics
const processCreationErrors = new Counter('process_creation_errors');
const processCreationDuration = new Trend('process_creation_duration');

export const options = {
  scenarios: {
    load_test: SCENARIOS.load,
  },
  thresholds: THRESHOLDS,
};

let processCounter = 0;

export default function () {
  const clientId = `load-test-client-${__VU}`;
  const processType = 'order';
  const clientProcessId = `process-${__ITER}-${Date.now()}-${processCounter++}`;

  const payload = JSON.stringify({
    clientId: clientId,
    processType: processType,
    clientProcessId: clientProcessId,
    metadata: {
      orderId: `order-${__ITER}`,
      customerId: `customer-${__VU}`,
      amount: '100.00',
    },
  });

  const params = {
    headers: {
      'Content-Type': 'application/json',
    },
    tags: {
      name: 'CreateProcess',
    },
  };

  const startTime = Date.now();
  const response = http.post(`${BASE_URL}/api/processes`, payload, params);
  const duration = Date.now() - startTime;

  processCreationDuration.add(duration);

  const checkResult = check(response, {
    'status is 201': (r) => r.status === 201,
    'has location header': (r) => r.headers['Location'] !== undefined,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });

  if (!checkResult) {
    processCreationErrors.add(1);
    console.error(`Failed to create process: ${response.status} - ${response.body}`);
  }

  // Extract processId from location header
  if (response.status === 201) {
    const location = response.headers['Location'];
    const processId = location.split('/').pop();
    
    // Verify process was created
    sleep(0.5); // Wait for async processing
    
    const getResponse = http.get(`${BASE_URL}/api/processes/${processId}`);
    check(getResponse, {
      'process exists': (r) => r.status === 200,
      'process has valid status': (r) => {
        const process = JSON.parse(r.body);
        return ['Pending', 'Processing', 'Completed'].includes(process.status);
      },
    });
  }

  sleep(1); // Think time
}

export function handleSummary(data) {
  return {
    'results/process-creation-summary.json': JSON.stringify(data),
    'results/process-creation-summary.html': htmlReport(data),
  };
}

function htmlReport(data) {
  const metrics = data.metrics;
  return `
<!DOCTYPE html>
<html>
<head>
  <title>Process Creation Load Test Results</title>
  <style>
    body { font-family: Arial, sans-serif; margin: 20px; }
    .metric { margin: 10px 0; padding: 10px; background: #f0f0f0; }
    .success { color: green; }
    .failure { color: red; }
  </style>
</head>
<body>
  <h1>Process Creation Load Test Results</h1>
  <div class="metric">
    <h3>Requests</h3>
    <p>Total: ${metrics.http_reqs.values.count}</p>
    <p>Rate: ${metrics.http_reqs.values.rate.toFixed(2)} req/s</p>
  </div>
  <div class="metric">
    <h3>Response Time</h3>
    <p>Avg: ${metrics.http_req_duration.values.avg.toFixed(2)} ms</p>
    <p>P95: ${metrics.http_req_duration.values['p(95)'].toFixed(2)} ms</p>
    <p>P99: ${metrics.http_req_duration.values['p(99)'].toFixed(2)} ms</p>
  </div>
  <div class="metric">
    <h3>Success Rate</h3>
    <p class="${metrics.checks.values.rate > 0.95 ? 'success' : 'failure'}">
      ${(metrics.checks.values.rate * 100).toFixed(2)}%
    </p>
  </div>
</body>
</html>
  `;
}
```

### 4. Create Smoke Test

Create `tests/LoadTests/smoke-test.js`:

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';
import { BASE_URL } from './config.js';

export const options = {
  vus: 1,
  duration: '30s',
  thresholds: {
    http_req_duration: ['p(99)<1000'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  // Test health endpoint
  let response = http.get(`${BASE_URL}/health`);
  check(response, {
    'health check is 200': (r) => r.status === 200,
  });

  sleep(1);

  // Test process creation
  const payload = JSON.stringify({
    clientId: 'smoke-test-client',
    processType: 'order',
    clientProcessId: `smoke-${Date.now()}`,
    metadata: {
      orderId: 'smoke-order-1',
      customerId: 'smoke-customer-1',
      amount: '50.00',
    },
  });

  response = http.post(
    `${BASE_URL}/api/processes`,
    payload,
    { headers: { 'Content-Type': 'application/json' } }
  );

  check(response, {
    'process creation is 201': (r) => r.status === 201,
  });

  sleep(1);
}
```

### 5. Create Stress Test

Create `tests/LoadTests/stress-test.js`:

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';
import { BASE_URL, SCENARIOS } from './config.js';

export const options = {
  scenarios: {
    stress: SCENARIOS.stress,
  },
  thresholds: {
    http_req_duration: ['p(95)<1000', 'p(99)<2000'],
    http_req_failed: ['rate<0.05'], // Allow 5% failure under stress
  },
};

export default function () {
  const payload = JSON.stringify({
    clientId: `stress-client-${__VU}`,
    processType: 'order',
    clientProcessId: `stress-${__VU}-${__ITER}-${Date.now()}`,
    metadata: {
      orderId: `order-${__ITER}`,
      customerId: `customer-${__VU}`,
      amount: '75.00',
    },
  });

  const response = http.post(
    `${BASE_URL}/api/processes`,
    payload,
    { headers: { 'Content-Type': 'application/json' } }
  );

  check(response, {
    'status is 2xx or 429': (r) => (r.status >= 200 && r.status < 300) || r.status === 429,
  });

  sleep(0.5); // Shorter think time for stress
}
```

### 6. Create Policy Configuration Test

Create `tests/LoadTests/policy-configuration-test.js`:

```javascript
import http from 'k6/http';
import { check, group } from 'k6';
import { BASE_URL } from './config.js';

export const options = {
  scenarios: {
    low_retries: {
      executor: 'constant-vus',
      vus: 10,
      duration: '1m',
      tags: { policy: 'low_retries' },
      exec: 'testLowRetries',
    },
    high_retries: {
      executor: 'constant-vus',
      vus: 10,
      duration: '1m',
      startTime: '1m',
      tags: { policy: 'high_retries' },
      exec: 'testHighRetries',
    },
    short_timeout: {
      executor: 'constant-vus',
      vus: 10,
      duration: '1m',
      startTime: '2m',
      tags: { policy: 'short_timeout' },
      exec: 'testShortTimeout',
    },
  },
};

export function testLowRetries() {
  group('Low Retries Policy', () => {
    const payload = JSON.stringify({
      clientId: 'low-retry-client',
      processType: 'order',
      clientProcessId: `low-${__VU}-${__ITER}`,
      metadata: { orderId: `order-${__ITER}` },
    });

    const response = http.post(`${BASE_URL}/api/processes`, payload, {
      headers: { 'Content-Type': 'application/json' },
    });

    check(response, {
      'created with low retries': (r) => r.status === 201,
    });
  });
}

export function testHighRetries() {
  group('High Retries Policy', () => {
    const payload = JSON.stringify({
      clientId: 'high-retry-client',
      processType: 'order',
      clientProcessId: `high-${__VU}-${__ITER}`,
      metadata: { orderId: `order-${__ITER}` },
    });

    const response = http.post(`${BASE_URL}/api/processes`, payload, {
      headers: { 'Content-Type': 'application/json' },
    });

    check(response, {
      'created with high retries': (r) => r.status === 201,
    });
  });
}

export function testShortTimeout() {
  group('Short Timeout Policy', () => {
    const payload = JSON.stringify({
      clientId: 'short-timeout-client',
      processType: 'order',
      clientProcessId: `timeout-${__VU}-${__ITER}`,
      metadata: { orderId: `order-${__ITER}` },
    });

    const response = http.post(`${BASE_URL}/api/processes`, payload, {
      headers: { 'Content-Type': 'application/json' },
    });

    check(response, {
      'created with short timeout': (r) => r.status === 201,
    });
  });
}
```

### 7. Create Performance Analysis Script

Create `tests/LoadTests/analyze-results.js` (this uses ES module syntax, so run it with a recent Node and `"type": "module"` in package.json, or rename it to `.mjs`):

```javascript
import { readFileSync } from 'fs';

const resultsFile = process.argv[2];
if (!resultsFile) {
  console.error('Usage: node analyze-results.js <results-file.json>');
  process.exit(1);
}

const data = JSON.parse(readFileSync(resultsFile, 'utf8'));
const metrics = data.metrics;

console.log('\n=== Performance Analysis ===\n');

// Throughput
const throughput = metrics.http_reqs.values.rate;
console.log(`Throughput: ${throughput.toFixed(2)} req/s`);

// Latency
const latency = metrics.http_req_duration.values;
console.log(`\nLatency:`);
console.log(`  Average: ${latency.avg.toFixed(2)} ms`);
console.log(`  P50: ${latency['p(50)'].toFixed(2)} ms`);
console.log(`  P95: ${latency['p(95)'].toFixed(2)} ms`);
console.log(`  P99: ${latency['p(99)'].toFixed(2)} ms`);
console.log(`  Max: ${latency.max.toFixed(2)} ms`);

// Success rate
const successRate = metrics.checks.values.rate * 100;
console.log(`\nSuccess Rate: ${successRate.toFixed(2)}%`);

// Failures
const failureRate = metrics.http_req_failed.values.rate * 100;
console.log(`Failure Rate: ${failureRate.toFixed(2)}%`);

// Thresholds
console.log('\n=== Threshold Evaluation ===\n');
for (const [name, threshold] of Object.entries(data.thresholds || {})) {
  const passed = threshold.ok ? '✓ PASS' : '✗ FAIL';
  console.log(`${passed} ${name}`);
}

// Recommendations
console.log('\n=== Recommendations ===\n');

if (latency['p(95)'] > 500) {
  console.log('⚠️  P95 latency exceeds 500ms - consider optimization');
}

if (throughput < 10) {
  console.log('⚠️  Throughput below 10 req/s - check infrastructure capacity');
}

if (failureRate > 1) {
  console.log('⚠️  Failure rate above 1% - investigate errors');
}

if (successRate > 99 && latency['p(95)'] < 300 && throughput > 50) {
  console.log('✅ Excellent performance - system is well optimized');
}

console.log('\n');
```

### 8. Create Performance Documentation

Create `docs/PERFORMANCE-CHARACTERISTICS.md`:

# Performance Characteristics - StarGate

## Baseline Performance

### Test Environment
- **Infrastructure:** Docker Compose (local)
- **MongoDB:** 7.0
- **Redis:** 7.2
- **RabbitMQ:** 3.13
- **Hardware:** Development machine

### Process Creation

| Metric | Value | Target |
|--------|-------|--------|
| Throughput | 50 req/s | >10 req/s |
| P50 Latency | 120 ms | <200 ms |
| P95 Latency | 350 ms | <500 ms |
| P99 Latency | 800 ms | <1000 ms |
| Success Rate | 99.5% | >99% |

### Process Query

| Metric | Value | Target |
|--------|-------|--------|
| Throughput | 200 req/s | >50 req/s |
| P50 Latency | 45 ms | <100 ms |
| P95 Latency | 120 ms | <200 ms |
| P99 Latency | 250 ms | <500 ms |
| Success Rate | 99.9% | >99% |

## Load Test Scenarios

### Smoke Test
- **Duration:** 30 seconds
- **Virtual Users:** 1
- **Purpose:** Verify basic functionality

### Load Test
- **Duration:** 5 minutes
- **Virtual Users:** Ramp 0 → 10 → 10 → 0
- **Purpose:** Establish baseline performance

### Stress Test
- **Duration:** 8 minutes
- **Virtual Users:** Ramp 0 → 10 → 50 → 100 → 0
- **Purpose:** Find breaking point

### Spike Test
- **Duration:** 4 minutes
- **Virtual Users:** Spike to 100
- **Purpose:** Test recovery from sudden load

## Policy Impact on Performance

### Retry Policy
- **No Retry:** Baseline
- **3 Retries:** +2-5% latency
- **5 Retries:** +5-10% latency

### Circuit Breaker
- **Closed:** +1% latency
- **Open:** -95% latency (fail fast)

### Timeout Policy
- **30s:** Minimal impact
- **10s:** +0.5% latency
- **5s:** +1% latency

## Bottlenecks

### Identified
1. **MongoDB Write Latency:** ~50ms per insert
2. **RabbitMQ Publish:** ~10ms per message
3. **Handler Execution:** ~500ms average

### Mitigation
1. Batch database operations
2. Connection pooling
3. Async processing
4. Caching frequent queries
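
The batching mitigation can be sketched as a small write buffer that trades a little latency for far fewer database round trips. This is an illustrative pattern, not code from the repository; `BatchWriter` and `flushFn` are invented names:

```javascript
// Hypothetical batching sketch: collect writes and flush them in groups.
class BatchWriter {
  constructor(flushFn, batchSize = 100) {
    this.flushFn = flushFn;     // e.g. (docs) => collection.insertMany(docs)
    this.batchSize = batchSize;
    this.buffer = [];
  }

  add(doc) {
    this.buffer.push(doc);
    if (this.buffer.length >= this.batchSize) this.flush();
  }

  flush() {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.flushFn(batch);        // one round trip per batch instead of per doc
  }
}

// Usage with a stand-in flush function:
const batches = [];
const writer = new BatchWriter((docs) => batches.push(docs), 3);
[1, 2, 3, 4, 5].forEach((n) => writer.add({ n }));
writer.flush(); // drain the remainder
console.log(batches.length); // → 2 (one batch of 3, one of 2)
```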

## Scalability

### Horizontal Scaling
- **API Servers:** Linear scaling up to 10 instances
- **Workers:** Linear scaling up to 20 instances
- **Database:** Replica set for read scaling

### Vertical Scaling
- **CPU:** Moderate impact (10-20% improvement)
- **Memory:** Minimal impact (caching only)
- **Disk:** Significant for MongoDB (30-50% improvement with SSD)

## Monitoring

### Key Metrics
- Request rate (req/s)
- Response time (p50, p95, p99)
- Error rate (%)
- Circuit breaker states
- Queue depth
- Database connections

### Alerting Thresholds
- P95 latency >1000ms
- Error rate >5%
- Queue depth >1000
- Circuit breaker open >1 minute

## Optimization Recommendations

1. **Enable MongoDB indexes** on query fields
2. **Increase connection pool sizes** for high load
3. **Implement caching** for policy lookups
4. **Use batch operations** where possible
5. **Monitor and tune GC** settings

✅ Acceptance Criteria

  • k6 installed and configured
  • Smoke test implemented
  • Load test implemented
  • Stress test implemented
  • Spike test implemented
  • Policy configuration tests implemented
  • Performance baselines established
  • Throughput measured and documented
  • Latency (p50, p95, p99) measured
  • Bottlenecks identified
  • Performance analysis script created
  • Performance characteristics documented
  • Optimization recommendations documented
  • Code follows CODING-CONVENTIONS.md

📝 Testing Instructions

```bash
# Start infrastructure
docker-compose up -d

# Wait for services to be ready
sleep 10

# Run smoke test
k6 run tests/LoadTests/smoke-test.js

# Run load test with results
k6 run --out json=results/load-test.json tests/LoadTests/load-test.js

# Analyze results
node tests/LoadTests/analyze-results.js results/load-test.json

# Run stress test
k6 run tests/LoadTests/stress-test.js

# Run policy configuration test
k6 run tests/LoadTests/policy-configuration-test.js

# Run with custom parameters
k6 run --vus 20 --duration 3m tests/LoadTests/process-creation-test.js

# Generate HTML report (written by handleSummary in process-creation-test.js)
k6 run tests/LoadTests/process-creation-test.js
open results/process-creation-summary.html

# Monitor during test
watch -n 1 'docker stats --no-stream'

# Check MongoDB performance
docker exec stargate-mongodb mongosh --eval "db.serverStatus().metrics"

# Check RabbitMQ queue depth
curl -u guest:guest http://localhost:15672/api/queues
```

🏷️ Labels

phase-9 testing sprint-9.2 load-testing performance k6

⏱️ Estimated Effort

10-14 hours

🔗 Related Issues

Part of Phase 9: Testing & Quality - Sprint 9.2: Load Testing

📌 Important Notes

Load Test Types

Smoke Test:

  • Minimal load
  • Verify basic functionality
  • Quick feedback
  • Run before every deployment

Load Test:

  • Expected production load
  • Establish baseline
  • Measure typical performance
  • Identify normal behavior

Stress Test:

  • Beyond expected load
  • Find breaking point
  • Test resilience
  • Identify limits

Spike Test:

  • Sudden load increase
  • Test auto-scaling
  • Verify recovery
  • Real-world scenario

Performance Metrics

Throughput:

  • Requests per second
  • Higher is better
  • Indicates capacity

Latency:

  • Response time
  • Lower is better
  • Use percentiles (p50, p95, p99)
  • Don't rely on averages

Error Rate:

  • Failed requests %
  • Should be <1%
  • Monitor closely

k6 Virtual Users (VUs)

What is a VU:

  • Simulated user
  • Runs script repeatedly
  • Independent from others
  • Has own context

VU Scaling:

  • Start low (1-5 VUs)
  • Gradually increase
  • Find saturation point
  • Back off to stable load

Threshold Configuration

Conservative (Development):

```javascript
thresholds: {
  http_req_duration: ['p(95)<1000'],
  http_req_failed: ['rate<0.05'],
}
```

Aggressive (Production):

```javascript
thresholds: {
  http_req_duration: ['p(95)<300', 'p(99)<500'],
  http_req_failed: ['rate<0.01'],
}
```

Think Time

Why:

  • Simulates real user behavior
  • Prevents unrealistic load
  • Allows system to breathe

Typical Values:

  • API: 0.5-2 seconds
  • Web browsing: 2-5 seconds
  • Heavy computation: 5-10 seconds
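
Rather than a fixed `sleep(1)`, think time is often jittered within these ranges so virtual users don't fire in lockstep. A minimal helper (our naming; in a k6 script you would pass the result to k6's `sleep()`):

```javascript
// Hypothetical jitter helper: pick a think time uniformly in [min, max) seconds.
// In a k6 script: sleep(thinkTime(0.5, 2));
function thinkTime(min, max) {
  return min + Math.random() * (max - min);
}

const t = thinkTime(0.5, 2);
console.log(t >= 0.5 && t < 2); // → true
```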

Ramp-Up Strategy

Gradual:

```javascript
stages: [
  { duration: '1m', target: 10 },
  { duration: '3m', target: 10 },
  { duration: '1m', target: 0 },
]
```

  • Safe approach
  • Observe behavior at each level
  • Identify issues early

Aggressive:

```javascript
stages: [
  { duration: '10s', target: 100 },
  { duration: '1m', target: 100 },
  { duration: '10s', target: 0 },
]
```

  • Stress test
  • Find breaking point quickly
  • Test recovery

Resource Monitoring

During Load Tests:

```bash
# CPU and Memory
docker stats

# Disk I/O
iostat -x 1

# Network
iftop

# MongoDB
mongotop 1

# Process list
top -o %CPU
```

Bottleneck Identification

Database:

  • High latency
  • Connection pool exhaustion
  • Slow queries
  • Solution: Indexes, caching, read replicas

Message Broker:

  • Queue depth growing
  • Consumer lag
  • Connection timeouts
  • Solution: More consumers, larger prefetch

API:

  • High CPU usage
  • Thread pool exhaustion
  • GC pressure
  • Solution: Horizontal scaling, optimization

Performance Baseline

Purpose:

  • Reference point
  • Track improvements/regressions
  • Set realistic targets
  • Guide optimization

Update When:

  • Major code changes
  • Infrastructure changes
  • Policy configuration changes
  • After optimizations
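
Tracking regressions against the stored baseline can be automated; here is a sketch (the `findRegressions` name and metric keys are illustrative; the convention is latency-style metrics where higher means worse):

```javascript
// Hypothetical regression check: flag metrics that degraded beyond a tolerance.
function findRegressions(baseline, current, tolerancePct = 10) {
  const regressions = [];
  for (const [metric, base] of Object.entries(baseline)) {
    const cur = current[metric];
    if (cur === undefined) continue;
    const deltaPct = ((cur - base) / base) * 100;
    if (deltaPct > tolerancePct) {
      regressions.push({ metric, base, cur, deltaPct: Number(deltaPct.toFixed(1)) });
    }
  }
  return regressions;
}

// Latency metrics in ms: p95 degraded by 20%, the rest are within tolerance.
const baseline = { p50: 120, p95: 350, p99: 800 };
const current  = { p50: 125, p95: 420, p99: 810 };
console.log(findRegressions(baseline, current));
// → [ { metric: 'p95', base: 350, cur: 420, deltaPct: 20 } ]
```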

CI/CD Integration

Automated Load Tests:

```yaml
- name: Run Load Tests
  run: |
    docker-compose up -d
    sleep 10
    k6 run tests/LoadTests/smoke-test.js
    k6 run --duration 2m tests/LoadTests/load-test.js
```

Performance Gates:

  • P95 < 500ms
  • Success rate > 99%
  • Throughput > baseline
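
These gates can be enforced by evaluating the run's summary and failing the CI job. An illustrative sketch: `evaluateGates` is not an existing script, and the metric shape assumes the k6 summary structure used in `analyze-results.js` above:

```javascript
// Hypothetical CI gate: return the list of gates (from the bullets above) that failed.
function evaluateGates(metrics, baselineThroughput) {
  const failures = [];
  if (metrics.http_req_duration.values['p(95)'] >= 500) failures.push('p95 >= 500ms');
  if (metrics.checks.values.rate <= 0.99) failures.push('success rate <= 99%');
  if (metrics.http_reqs.values.rate <= baselineThroughput) failures.push('throughput <= baseline');
  return failures;
}

const metrics = {
  http_req_duration: { values: { 'p(95)': 350 } },
  checks: { values: { rate: 0.995 } },
  http_reqs: { values: { rate: 55 } },
};
const failures = evaluateGates(metrics, 50);
console.log(failures.length === 0 ? 'gates passed' : failures.join(', ')); // → "gates passed"
// In CI: if (failures.length > 0) process.exit(1);
```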

Results Interpretation

Good Performance:

  • Low latency (p95 < 500ms)
  • High throughput (>50 req/s)
  • Low error rate (<1%)
  • Consistent across test

Poor Performance:

  • High latency (p95 > 1000ms)
  • Low throughput (<10 req/s)
  • High error rate (>5%)
  • Degradation over time

Optimization Cycle

```
1. Measure baseline
   ↓
2. Identify bottleneck
   ↓
3. Implement fix
   ↓
4. Measure again
   ↓
5. Compare results
   ↓
6. Repeat
```
