Commit 2c97b21

asaadbalum and rootfs authored
feat(e2e): add Istio service mesh integration test profile (#728)
Implement comprehensive E2E testing profile for Istio service mesh integration with Semantic Router:

- Add Istio profile with 4 Istio-specific tests and 13 common tests (17 total)
- Deploy Semantic Router with Istio sidecar injection and service mesh features
- Integrate Envoy Gateway for ExtProc communication alongside Istio mesh capabilities
- Deploy vLLM backend via Gateway API resources with AIServiceBackend CRDs
- Add keyword routing support (urgent_request and sensitive_data decisions)
- Fix Istio test namespace resolution to use vllm-semantic-router-system
- All 17 tests passing with 100% success rate in local testing

Test coverage includes:

- Istio sidecar injection and health verification
- Traffic routing through Istio ingress gateway
- mTLS verification between services
- Distributed tracing and observability
- Chat completions, stress tests, and domain classification
- Plugin chain execution, PII/jailbreak detection, semantic caching

Signed-off-by: Asaad Balum <asaad.balum@gmail.com>
Co-authored-by: Huamin Chen <rootfs@users.noreply.github.com>
1 parent e0ef1c9 commit 2c97b21

12 files changed: +1921 -11 lines changed

.github/workflows/integration-test-k8s.yml

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ jobs:
     strategy:
       fail-fast: false # Continue testing other profiles even if one fails
       matrix:
-        profile: [ai-gateway, aibrix, routing-strategies, llm-d]
+        profile: [ai-gateway, aibrix, routing-strategies, llm-d, istio]
 
     steps:
       - name: Check out the repo

e2e/README.md

Lines changed: 184 additions & 1 deletion
@@ -14,7 +14,7 @@ The framework follows a **separation of concerns** design:
 
 - **ai-gateway**: Tests Semantic Router with Envoy AI Gateway integration
 - **aibrix**: Tests Semantic Router with vLLM AIBrix integration
-- **istio**: Tests Semantic Router with Istio Gateway (future)
+- **istio**: Tests Semantic Router with Istio service mesh integration
 - **production-stack**: Tests vLLM Production Stack configurations (future)
 - **llm-d**: Tests Semantic Router with LLM-D distributed inference
 - **dynamo**: Tests with Nvidia Dynamo (future)
@@ -517,3 +517,186 @@ func (p *Profile) GetServiceConfig() framework.ServiceConfig {
 ```
 
 See `profiles/ai-gateway/` for a complete example.
+
+## Profile Details
+
+### Istio Profile
+
+The Istio profile tests Semantic Router deployment and functionality in an Istio service mesh environment. It validates both Istio-specific features (sidecars, mTLS, tracing) and general Semantic Router functionality through Istio Gateway + VirtualService routing.
+
+**What it Tests:**
+
+- **Istio-Specific Features:**
+  - Istio sidecar injection and health
+  - Traffic routing through Istio ingress gateway
+  - Mutual TLS (mTLS) between services
+  - Distributed tracing and observability
+
+- **Semantic Router Features (through Istio):**
+  - Chat completions API and stress testing
+  - Domain classification and routing
+  - Semantic cache, PII detection, jailbreak detection
+  - Signal-Decision engine (priority, plugins, keywords, fallback)
+
+**Prerequisites:**
+
+- Docker and Kind (managed by E2E framework)
+- Helm (for installing Istio components)
+
+**Components Deployed:**
+
+1. **Istio Control Plane** (`istio-system` namespace):
+   - `istiod` - Istio control plane
+   - `istio-ingressgateway` - Ingress gateway for external traffic
+
+2. **Semantic Router** (`semantic-router` namespace):
+   - Deployed via Helm with Istio sidecar injection enabled
+   - Namespace labeled with `istio-injection=enabled`
+
+3. **Istio Resources**:
+   - `Gateway` - Configures ingress gateway on port 80
+   - `VirtualService` - Routes traffic to Semantic Router service
+   - `DestinationRule` - Enables mTLS with `ISTIO_MUTUAL` mode
+
+**Test Cases:**
+
+**Istio-Specific Tests (4):**
+
+| Test Case | Description | What it Validates |
+|-----------|-------------|-------------------|
+| `istio-sidecar-health-check` | Verify Envoy sidecar injection | - Istio-proxy container exists<br>- Sidecar is healthy and ready<br>- Namespace has `istio-injection=enabled` label |
+| `istio-traffic-routing` | Test routing through Istio gateway | - Gateway and VirtualService exist<br>- Requests route correctly to Semantic Router<br>- Istio/Envoy headers present in responses |
+| `istio-mtls-verification` | Verify mutual TLS configuration | - DestinationRule has `ISTIO_MUTUAL` mode<br>- mTLS certificates present in istio-proxy<br>- PeerAuthentication policy (if configured) |
+| `istio-tracing-observability` | Check distributed tracing and metrics | - Trace headers propagated<br>- Envoy metrics exposed<br>- Telemetry configuration<br>- Access logs enabled |
+
+**Common Functionality Tests (through Istio Gateway):**
+
+These tests validate that Semantic Router features work correctly when routed through Istio Gateway and VirtualService:
+
+- `chat-completions-request` - Basic API functionality
+- `chat-completions-stress-request` - Sequential stress (1000 requests)
+- `domain-classify` - Classification accuracy (65 cases)
+- `semantic-cache` - Cache hit rate (5 groups)
+- `pii-detection` - PII detection and blocking (10 types)
+- `jailbreak-detection` - Attack detection (10 types)
+- `decision-priority-selection` - Priority-based routing (4 cases)
+- `plugin-chain-execution` - Plugin ordering (4 cases)
+- `rule-condition-logic` - AND/OR operators (6 cases)
+- `decision-fallback-behavior` - Fallback handling (5 cases)
+- `keyword-routing` - Keyword matching (6 cases)
+- `plugin-config-variations` - Config variations (6 cases)
+- `chat-completions-progressive-stress` - Progressive QPS stress test
+
+**Total: 17 test cases** (4 Istio-specific + 13 common functionality)
+
+**Usage:**
+
+```bash
+# Run all Istio tests
+make e2e-test E2E_PROFILE=istio
+
+# Run specific Istio tests
+make e2e-test-specific E2E_PROFILE=istio E2E_TESTS="istio-sidecar-health-check,istio-mtls-verification"
+
+# Run with verbose output
+./bin/e2e -profile istio -verbose
+
+# Keep cluster for debugging
+make e2e-test E2E_PROFILE=istio E2E_KEEP_CLUSTER=true
+```
+
+**Architecture:**
+
+```
+┌─────────────────────────────────────────┐
+│  Istio Ingress Gateway                  │
+│  (istio-system namespace)               │
+│  Port 80 → semantic-router service      │
+└────────────┬────────────────────────────┘
+
+
+┌─────────────────────────────────────────┐
+│  Semantic Router Pod                    │
+│  (semantic-router namespace)            │
+│  ┌─────────────┐  ┌──────────────────┐  │
+│  │  Main       │  │  Istio-Proxy     │  │
+│  │  Container  │◄─┤  (Envoy Sidecar) │  │
+│  │             │  │                  │  │
+│  │  :8801      │  │  mTLS, Tracing   │  │
+│  └─────────────┘  └──────────────────┘  │
+└─────────────────────────────────────────┘
+
+
+┌─────────────────────────────────────────┐
+│  Istiod (Control Plane)                 │
+│  - Config distribution                  │
+│  - Certificate management (mTLS)        │
+│  - Sidecar injection                    │
+└─────────────────────────────────────────┘
+```
+
+**Key Features Tested:**
+
+**Istio Integration:**
+
+- **Automatic Sidecar Injection**: Istio automatically injects Envoy proxy sidecars into pods
+- **Traffic Management**: Requests route through Istio Gateway → VirtualService → Semantic Router
+- **Security (mTLS)**: Automatic mutual TLS encryption and authentication between services
+- **Observability**: Distributed tracing, metrics collection, and access logs
+- **Service Mesh Integration**: Semantic Router operates correctly within Istio mesh
+
+**Test Coverage:**
+
+Istio-Specific Tests (4):
+
+- **istio-sidecar-health-check**: Validates sidecar injection and health
+- **istio-traffic-routing**: Tests routing through Gateway and VirtualService
+- **istio-mtls-verification**: Confirms mTLS configuration and certificates
+- **istio-tracing-observability**: Validates distributed tracing and metrics
+
+Common Functionality Tests (13):
+
+- **Chat Completions**: API functionality and stress testing
+- **Classification**: Domain-based routing with 65 test cases
+- **Security Features**: PII detection, jailbreak detection, semantic cache
+- **Signal-Decision Engine**: Priority routing, plugin chains, keyword matching, fallback behavior
+- **Load Handling**: Progressive stress testing (10-100 QPS)
+
+**Total: 17 comprehensive test cases validating both Istio integration and Semantic Router functionality through the service mesh**
+
+**Setup Steps (Automated by Profile):**
+
+1. Install Istio control plane using Helm (base, istiod, ingress gateway)
+2. Create namespace with `istio-injection=enabled` label
+3. Deploy Semantic Router via Helm (sidecar auto-injected)
+4. Create Istio Gateway and VirtualService for traffic routing
+5. Create DestinationRule for mTLS configuration
+6. Verify all components are ready
+
+**Troubleshooting:**
+
+If tests fail, check:
+
+```bash
+# Check Istio installation
+kubectl get pods -n istio-system
+
+# Check sidecar injection
+kubectl get pods -n semantic-router -o jsonpath='{.items[*].spec.containers[*].name}'
+
+# Check Istio resources
+kubectl get gateway,virtualservice,destinationrule -n semantic-router
+
+# Check mTLS configuration
+kubectl get destinationrule semantic-router -n semantic-router -o yaml
+
+# View Istio proxy logs
+kubectl logs -n semantic-router <pod-name> -c istio-proxy
+```
+
+**Related Resources:**
+
+- [Istio Documentation](https://istio.io/latest/docs/)
+- [Istio Traffic Management](https://istio.io/latest/docs/concepts/traffic-management/)
+- [Istio Security (mTLS)](https://istio.io/latest/docs/concepts/security/)
+- [Istio Observability](https://istio.io/latest/docs/concepts/observability/)
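
The README's sidecar check and its troubleshooting `jsonpath` query both come down to asking whether a pod's container list includes `istio-proxy`. The following is a minimal sketch of that check, not the test code from this commit. It assumes the input is the space-separated container-name string that the `jsonpath` query prints, and the main container name `semantic-router` is only an illustrative guess.

```go
package main

import (
	"fmt"
	"strings"
)

// sidecarName is the container name the README looks for when it
// checks that the Istio sidecar was injected.
const sidecarName = "istio-proxy"

// hasSidecar reports whether a space-separated list of container names
// (the shape printed by the jsonpath query in the troubleshooting
// section) includes the Istio sidecar.
func hasSidecar(containerNames string) bool {
	for _, name := range strings.Fields(containerNames) {
		if name == sidecarName {
			return true
		}
	}
	return false
}

func main() {
	// "semantic-router" is an assumed main container name.
	fmt.Println(hasSidecar("semantic-router istio-proxy")) // true
	fmt.Println(hasSidecar("semantic-router"))             // false
}
```

A pod that reports only its main container usually means the namespace label was missing when the pod was created, so relabeling and restarting the pod is a reasonable first thing to try.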

e2e/cmd/e2e/main.go

Lines changed: 4 additions & 3 deletions
@@ -12,12 +12,14 @@ import (
 	aigateway "github.com/vllm-project/semantic-router/e2e/profiles/ai-gateway"
 	aibrix "github.com/vllm-project/semantic-router/e2e/profiles/aibrix"
 	dynamicconfig "github.com/vllm-project/semantic-router/e2e/profiles/dynamic-config"
+	istio "github.com/vllm-project/semantic-router/e2e/profiles/istio"
 	llmd "github.com/vllm-project/semantic-router/e2e/profiles/llm-d"
 	routingstrategies "github.com/vllm-project/semantic-router/e2e/profiles/routing-strategies"
 
 	// Import profiles to register test cases
 	_ "github.com/vllm-project/semantic-router/e2e/profiles/ai-gateway"
 	_ "github.com/vllm-project/semantic-router/e2e/profiles/aibrix"
+	_ "github.com/vllm-project/semantic-router/e2e/profiles/istio"
 	_ "github.com/vllm-project/semantic-router/e2e/profiles/llm-d"
 	_ "github.com/vllm-project/semantic-router/e2e/profiles/routing-strategies"
 )
@@ -107,13 +109,12 @@ func getProfile(name string) (framework.Profile, error) {
 		return dynamicconfig.NewProfile(), nil
 	case "aibrix":
 		return aibrix.NewProfile(), nil
+	case "istio":
+		return istio.NewProfile(), nil
 	case "llm-d":
 		return llmd.NewProfile(), nil
 	case "routing-strategies":
 		return routingstrategies.NewProfile(), nil
-	// Add more profiles here as they are implemented
-	// case "istio":
-	// return istio.NewProfile(), nil
 	default:
 		return nil, fmt.Errorf("unknown profile: %s", name)
 	}
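
The name-to-profile lookup that `getProfile` performs can be sketched in isolation. The example below swaps the commit's `switch` for a map-based registry, which is a different technique and not what the commit does. `Profile`, `namedProfile`, and `newNamed` are hypothetical stand-ins for `framework.Profile` and the real constructors; only the error format is taken from the diff.

```go
package main

import "fmt"

// Profile is a hypothetical, reduced stand-in for framework.Profile.
type Profile interface {
	Name() string
}

type namedProfile struct{ name string }

func (p namedProfile) Name() string { return p.name }

// newNamed returns a constructor for a placeholder profile.
func newNamed(name string) func() Profile {
	return func() Profile { return namedProfile{name: name} }
}

// registry maps profile names to constructors; the keys mirror the
// CI matrix in this commit.
var registry = map[string]func() Profile{
	"ai-gateway":         newNamed("ai-gateway"),
	"aibrix":             newNamed("aibrix"),
	"istio":              newNamed("istio"),
	"llm-d":              newNamed("llm-d"),
	"routing-strategies": newNamed("routing-strategies"),
}

// getProfile resolves a name, using the same error text as the diff.
func getProfile(name string) (Profile, error) {
	newProfile, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("unknown profile: %s", name)
	}
	return newProfile(), nil
}

func main() {
	p, err := getProfile("istio")
	fmt.Println(p.Name(), err)

	_, err = getProfile("dynamo")
	fmt.Println(err)
}
```

Either form works. With a map, adding a profile is a one-line data change, while the `switch` keeps each constructor call visible at the dispatch site.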

e2e/pkg/helpers/kubernetes.go

Lines changed: 9 additions & 3 deletions
@@ -38,22 +38,28 @@ func CheckDeployment(ctx context.Context, client *kubernetes.Clientset, namespac
 
 // GetEnvoyServiceName finds the Envoy service name in the envoy-gateway-system namespace
 // using label selectors to match the Gateway-owned service
+// Deprecated: Use GetServiceByLabelInNamespace for more flexibility
 func GetEnvoyServiceName(ctx context.Context, client *kubernetes.Clientset, labelSelector string, verbose bool) (string, error) {
-	services, err := client.CoreV1().Services("envoy-gateway-system").List(ctx, metav1.ListOptions{
+	return GetServiceByLabelInNamespace(ctx, client, "envoy-gateway-system", labelSelector, verbose)
+}
+
+// GetServiceByLabelInNamespace finds a service by label selector in a specific namespace
+func GetServiceByLabelInNamespace(ctx context.Context, client *kubernetes.Clientset, namespace string, labelSelector string, verbose bool) (string, error) {
+	services, err := client.CoreV1().Services(namespace).List(ctx, metav1.ListOptions{
 		LabelSelector: labelSelector,
 	})
 	if err != nil {
 		return "", fmt.Errorf("failed to list services with selector %s: %w", labelSelector, err)
 	}
 
 	if len(services.Items) == 0 {
-		return "", fmt.Errorf("no service found with selector %s in envoy-gateway-system namespace", labelSelector)
+		return "", fmt.Errorf("no service found with selector %s in %s namespace", labelSelector, namespace)
 	}
 
 	// Return the first matching service (should only be one)
 	serviceName := services.Items[0].Name
 	if verbose {
-		fmt.Printf("[Helper] Found Envoy service: %s (matched by labels: %s)\n", serviceName, labelSelector)
+		fmt.Printf("[Helper] Found service: %s (matched by labels: %s in namespace: %s)\n", serviceName, labelSelector, namespace)
 	}
 
 	return serviceName, nil
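
The behavior of the generalized helper (filter by namespace and labels, return the first match, error if nothing matches) can be tried out without a cluster. The sketch below is an in-memory approximation under stated assumptions. `Service`, `parseSelector`, and `findService` are hypothetical names, only equality selectors of the form `key=value,key2=value2` are handled, and the `istio=ingressgateway` label is an illustrative guess rather than something this diff confirms.

```go
package main

import (
	"fmt"
	"strings"
)

// Service is a hypothetical, simplified stand-in for a Kubernetes Service.
type Service struct {
	Name      string
	Namespace string
	Labels    map[string]string
}

// parseSelector handles equality-only selectors such as "a=b,c=d".
// Real Kubernetes selectors support more operators than this sketch.
func parseSelector(selector string) (map[string]string, error) {
	want := map[string]string{}
	for _, term := range strings.Split(selector, ",") {
		term = strings.TrimSpace(term)
		if term == "" {
			continue
		}
		key, value, ok := strings.Cut(term, "=")
		if !ok {
			return nil, fmt.Errorf("unsupported selector term %q", term)
		}
		want[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return want, nil
}

// findService returns the first service in the namespace whose labels
// satisfy the selector, mirroring the helper's first-match behavior.
func findService(services []Service, namespace, selector string) (string, error) {
	want, err := parseSelector(selector)
	if err != nil {
		return "", err
	}
	for _, svc := range services {
		if svc.Namespace != namespace {
			continue
		}
		matched := true
		for key, value := range want {
			if got, ok := svc.Labels[key]; !ok || got != value {
				matched = false
				break
			}
		}
		if matched {
			return svc.Name, nil
		}
	}
	return "", fmt.Errorf("no service found with selector %s in %s namespace", selector, namespace)
}

func main() {
	services := []Service{
		{Name: "istio-ingressgateway", Namespace: "istio-system",
			Labels: map[string]string{"istio": "ingressgateway"}},
	}
	fmt.Println(findService(services, "istio-system", "istio=ingressgateway"))
	fmt.Println(findService(services, "envoy-gateway-system", "istio=ingressgateway"))
}
```

Taking the namespace as a parameter lets one lookup serve both the Envoy Gateway and Istio profiles, which appears to be the reason the diff turns `GetEnvoyServiceName` into a thin wrapper.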

e2e/profiles/ai-gateway/values.yaml

Lines changed: 53 additions & 1 deletion
@@ -383,6 +383,48 @@ config:
             system_prompt: "You are a thinking expert, should think multiple steps before answering. Please answer the question step by step."
             mode: "replace"
 
+    - name: urgent_request
+      description: "Urgent requests requiring immediate attention"
+      priority: 30
+      rules:
+        operator: "OR"
+        conditions:
+          - type: "keyword"
+            name: "urgent_keywords"
+      modelRefs:
+        - model: base-model
+          lora_name: general-expert
+          use_reasoning: false
+      plugins:
+        - type: "system_prompt"
+          configuration:
+            enabled: true
+            system_prompt: "You are handling an urgent request. Prioritize quick and direct responses."
+            mode: "replace"
+
+    - name: sensitive_data
+      description: "Queries containing sensitive data keywords (SSN and credit card)"
+      priority: 40
+      rules:
+        operator: "AND"
+        conditions:
+          - type: "keyword"
+            name: "sensitive_keywords"
+      modelRefs:
+        - model: base-model
+          lora_name: general-expert
+          use_reasoning: false
+      plugins:
+        - type: "pii"
+          configuration:
+            enabled: true
+            pii_types_allowed: []
+        - type: "system_prompt"
+          configuration:
+            enabled: true
+            system_prompt: "You are handling a query with sensitive data. Be cautious and provide security-focused guidance."
+            mode: "replace"
+
     - name: other_decision
       description: "General knowledge and miscellaneous topics"
       priority: 1
@@ -478,7 +520,17 @@ config:
   keyword_rules:
     - name: "thinking"
       operator: "OR"
-      keywords: ["urgent", "immediate", "asap", "think", "careful"]
+      keywords: ["think", "careful"]
+      case_sensitive: false
+
+    - name: "urgent_keywords"
+      operator: "OR"
+      keywords: ["urgent", "immediate", "asap", "emergency"]
+      case_sensitive: false
+
+    - name: "sensitive_keywords"
+      operator: "AND"
+      keywords: ["SSN", "credit card"]
       case_sensitive: false
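
How the new keyword rules and decision priorities might interact can be sketched in a few lines. This is an illustration under stated assumptions, not the router's implementation. Matching is done by plain substring search, the highest `priority` value among matching decisions is assumed to win, and the Go types (`KeywordRule`, `Decision`) and `selectDecision` are hypothetical. Only the rule names, operators, keywords, and priorities come from the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// KeywordRule mirrors one keyword_rules entry (hypothetical Go shape).
type KeywordRule struct {
	Operator      string // "OR" or "AND"
	Keywords      []string
	CaseSensitive bool
}

// Matches uses plain substring search, which is an assumption of this sketch.
func (r KeywordRule) Matches(text string) bool {
	if len(r.Keywords) == 0 {
		return false
	}
	if !r.CaseSensitive {
		text = strings.ToLower(text)
	}
	for _, kw := range r.Keywords {
		if !r.CaseSensitive {
			kw = strings.ToLower(kw)
		}
		found := strings.Contains(text, kw)
		switch r.Operator {
		case "OR":
			if found {
				return true
			}
		case "AND":
			if !found {
				return false
			}
		}
	}
	// OR reaches this point only when nothing matched; AND only when everything did.
	return r.Operator == "AND"
}

// Decision pairs a decision name with its priority and keyword rule.
type Decision struct {
	Name     string
	Priority int
	RuleName string
}

var rules = map[string]KeywordRule{
	"urgent_keywords":    {Operator: "OR", Keywords: []string{"urgent", "immediate", "asap", "emergency"}},
	"sensitive_keywords": {Operator: "AND", Keywords: []string{"SSN", "credit card"}},
}

var decisions = []Decision{
	{Name: "urgent_request", Priority: 30, RuleName: "urgent_keywords"},
	{Name: "sensitive_data", Priority: 40, RuleName: "sensitive_keywords"},
}

// selectDecision returns the matching decision with the highest priority.
func selectDecision(text string) (string, bool) {
	best, found := Decision{}, false
	for _, d := range decisions {
		rule, ok := rules[d.RuleName]
		if !ok || !rule.Matches(text) {
			continue
		}
		if !found || d.Priority > best.Priority {
			best, found = d, true
		}
	}
	return best.Name, found
}

func main() {
	for _, q := range []string{
		"This is urgent, please help",
		"URGENT: my SSN and credit card were stolen",
		"My SSN leaked",
	} {
		name, ok := selectDecision(q)
		fmt.Printf("%q -> %q (matched=%t)\n", q, name, ok)
	}
}
```

Splitting `urgent` and friends out of the `thinking` rule, as the diff does, gives each keyword set one owning decision, so a query such as "urgent" no longer triggers the reasoning prompt.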