Add E2E Testing #117 #171

Merged
ofilson merged 8 commits into develop from e2e-testing-117
Apr 21, 2026

ofilson commented Apr 21, 2026

Implement Playwright E2E Testing for Frontend (#117)

Summary

Adds a comprehensive Playwright E2E test suite for the Angular frontend and integrates it into the nightly CI pipeline as a new e2e track.

What changed

E2E Test Suite (16 spec files, ~3,500 lines)

Organized by feature area under frontend/ai.client/e2e/:

  • auth/ — Login flow (Cognito managed UI), navigation guards, 404 handling, admin access checks
  • home-page/ — Chat send/receive, model selector, settings panel, file upload UI, error handling
  • assistants/ — Assistant CRUD lifecycle (create, edit, duplicate, delete), assistant selection in chat
  • settings/ — Profile page, appearance (theme/font toggles), chat preferences (all toggles + persistence), usage dashboard with date filters
  • manage-sessions.user.spec.ts — Conversation selection, bulk delete, share lifecycle (create → verify share link → manage shared instances → delete)

Playwright Configuration

  • playwright.config.ts — Local development config that starts backend APIs + Angular dev server automatically
  • playwright.ci.config.ts — CI config that runs against an already-deployed nightly stack via E2E_BASE_URL
  • Auth setup projects (auth-admin.setup.ts, auth-user.setup.ts) that log in via Cognito and persist storage state for authenticated test projects

Nightly Pipeline Integration

  • New e2e-<branch> track in nightly.yml (e.g. e2e-develop)
  • nightly-deploy-pipeline.yml gains a run-e2e input and an e2e-tests job that runs after smoke test
  • scripts/nightly/e2e-test.sh — Resolves the deployed stack URL, installs Playwright browsers, runs the suite
  • E2E failures are informational — they mark the nightly summary as "partial" rather than "failed"
  • Playwright HTML report + screenshots + traces uploaded as playwright-report-<label> artifact (30-day retention)
  • The all track now includes E2E by default
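
The "informational" rule for E2E failures can be pictured as a small status roll-up. This is only a sketch under assumptions: the function name, the result strings, and every job name other than e2e-tests are hypothetical, and the real logic lives in the workflow files rather than in Python.

```python
from typing import Dict, FrozenSet


def summarize_nightly(
    job_results: Dict[str, str],
    informational: FrozenSet[str] = frozenset({"e2e-tests"}),
) -> str:
    """Roll job results up into one nightly status.

    Assumption: each job reports "success" or something else. A failure in an
    informational job downgrades the summary to "partial"; any other failure
    makes it "failed".
    """
    failed = {name for name, result in job_results.items() if result != "success"}
    if not failed:
        return "success"
    if failed <= informational:
        return "partial"
    return "failed"
```

With this shape, a red E2E run on top of a green deploy and smoke test yields "partial", while a smoke test failure still yields "failed".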

Not included

  • Firefox/WebKit browser coverage (Chromium only for now)
  • Debug output toggle tests (commented out, pending UI stabilization)
  • File lifecycle tests in profile (commented out, pending upload flow stabilization)

How to test locally

cd frontend/ai.client

# Create e2e/.env with Cognito test credentials
cp e2e/.env.example e2e/.env
# Fill in ADMIN_USERNAME, ADMIN_PASSWORD, USER_USERNAME, USER_PASSWORD

# Run with local servers (starts backend + frontend automatically)
npx playwright test

# View the HTML report
npx playwright show-report

Oscar Filson and others added 7 commits April 9, 2026 11:30
* test(session): update compaction config defaults and fix integration test skipping

- Enable compaction by default in CompactionConfig
- Increase protected_turns default from 2 to 3
- Add pytest marker to skip integration tests when AGENTCORE_MEMORY_ID is not set
- Fix import path for get_metadata_storage in cache savings tests from metadata_storage.get_metadata_storage to storage.get_metadata_storage
- Ensures integration tests only run in appropriate environments with required AWS credentials

* test(session): enhance session manager fixture with initialize() mock and cleanup

- Mock AgentCoreMemorySessionManager.initialize() to simulate SDK behavior
- Add _mock_sdk_initialize shim that loads messages and validates agent uniqueness
- Track active patches in fixture scope for proper cleanup on teardown
- Update fixture docstring to document initialize() mocking and message control
- Convert fixture to generator with yield to enable patch cleanup
- Allow tests to control loaded messages via mgr.read_agent and mgr.list_messages
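
A minimal sketch of the generator-with-cleanup pattern described in this commit, using only unittest.mock. FakeSessionManager and the returned "mocked" value are hypothetical stand-ins, not the project's fixture; under pytest the generator would additionally carry the @pytest.fixture decorator.

```python
from unittest import mock


class FakeSessionManager:
    """Hypothetical stand-in for the real session manager class."""

    def initialize(self) -> str:
        raise RuntimeError("would contact the real service")


def session_manager():
    """Generator-style fixture: start patches, yield, then always stop them."""
    active_patches = [
        mock.patch.object(FakeSessionManager, "initialize", new=lambda self: "mocked"),
    ]
    for patcher in active_patches:
        patcher.start()
    try:
        yield FakeSessionManager()
    finally:
        # Runs on normal teardown and when the generator is closed early.
        for patcher in active_patches:
            patcher.stop()
```

Tracking the patchers in a list inside the fixture scope is what makes the teardown reliable: every patch that was started is stopped, so nothing leaks into later tests.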

* Release 1.0.0-beta.22: Cognito-native auth, CORS unification, RBAC consolidation, Trivy supply chain fix (#137)

⚠️ BREAKING CHANGE: Authentication replaced with AWS Cognito.
The legacy generic OIDC implementation has been removed with no
backward compatibility layer. Existing deployments must re-bootstrap.

Cognito First-Boot Authentication:
- Cognito User Pool, App Client, and Domain provisioned in Infrastructure stack
- CognitoJWTValidator replaces GenericOIDCJWTValidator
- New system/ module for first-boot setup, Cognito user/group management
- New cognito_idp_service for federated identity provider CRUD via Cognito IdP APIs
- First-boot page with admin account creation (race-condition-safe DynamoDB writes)
- Frontend auth flow rewritten for Cognito OAuth 2.0 + PKCE
- Runtime-provisioner and runtime-updater Lambda functions removed (2,800+ lines)
- Backend OIDC service, token exchange, and discovery endpoints removed (1,318 lines)
- 2,057 lines of new Cognito test coverage (IdP service, JWT validator, first-boot, system)

RBAC Consolidation:
- Single require_app_roles dependency replaces 6 role-checking functions/decorators
- User roles enriched from stored DynamoDB profile during token processing
- Profile cache invalidation on sync for immediate role updates
- JSON array parsing for custom:roles claim (Entra ID compatibility)
- jwt_role_mappings updates allowed on system_admin role
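
As an illustration of the custom:roles parsing item above, a claim that may arrive as a JSON array string could be normalized as follows. The function name and the comma-separated fallback are assumptions made for this sketch, not confirmed behavior of the backend.

```python
import json
from typing import List


def parse_roles_claim(raw: object) -> List[str]:
    """Normalize a roles claim into a list of role names.

    Accepts a real list, a JSON array encoded as a string, or (assumed
    fallback) a comma-separated string. Anything else yields an empty list.
    """
    if isinstance(raw, list):
        return [str(role) for role in raw]
    if not isinstance(raw, str) or not raw.strip():
        return []
    text = raw.strip()
    if text.startswith("["):
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, list):
            return [str(role) for role in parsed]
    # Not valid JSON: treat the value as a comma-separated list.
    return [part.strip() for part in text.split(",") if part.strip()]
```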

CORS Unification:
- buildCorsOrigins() shared helper across all 6 CDK stacks
- S3 CORS made conditional, ExposedHeaders→ExposeHeaders fix
- Python APIs read CORS_ORIGINS env var (replaces allow_origins=['*'])
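
Reading CORS_ORIGINS instead of hardcoding a wildcard might look like the sketch below. The comma-separated format, the helper name, and the trailing-slash trimming are assumptions; only the variable name comes from the release notes.

```python
import os
from typing import List, Mapping, Optional


def load_cors_origins(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the allowed origins listed in CORS_ORIGINS.

    Assumption: the value is a comma-separated list of origins. An unset or
    empty variable yields an empty list rather than falling back to "*".
    """
    source = os.environ if env is None else env
    raw = source.get("CORS_ORIGINS", "")
    return [origin.strip().rstrip("/") for origin in raw.split(",") if origin.strip()]
```

The resulting list is the kind of value a framework's CORS middleware takes for its allowed origins setting.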

Security:
- Trivy action upgraded v0.28.0→v0.35.0 — old SHA was compromised in
  March 2026 supply chain attack (GHSA-69fq-xp46-6x23)

CI/CD:
- CDK_DOMAIN_NAME and CDK_CORS_ORIGINS added to all workflow jobs
- App API synth-cdk actually skipped on PRs (guard was missing despite beta.20 docs)
- SSM StringParameter creation guarded against empty values

Bootstrap:
- seed_bootstrap_data.py sole owner of RBAC role seeding (removed from app startup)
- system_admin role seeded with jwt_role_mappings=['system_admin']
- Additive JWT mapping seeding for existing deployments

Documentation:
- 54,665 lines of outdated specs and AI artifacts purged (121 files)

Dependencies:
- Python: fastapi 0.135.3, uvicorn 0.44.0, boto3 1.42.83, strands-agents 1.34.1,
  bedrock-agentcore 1.6.0, google-genai 1.70.0, ruff 0.15.9, mypy 1.20.0
- Frontend: Angular 21.2.7, katex 0.16.45, mermaid 11.14.0, Analog.js alpha.26
- Infrastructure: aws-cdk-lib 2.248.0, aws-cdk 2.1117.0, ts-jest 29.4.9

* refactor: centralize env vars and magic strings into config/constants.py (#139)

Create agents/main_agent/config/constants.py with EnvVars, Defaults, and
Prefixes classes. Update all 13 modules to import from the centralized
constants instead of using inline os.getenv() with hardcoded strings.

This eliminates scattered magic strings and provides a single reference
for all configuration. Zero behavior change — all values are identical.

543/543 tests passing.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
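
For readers unfamiliar with the pattern, a constants module of this shape could be sketched as below. Only the three class names and AGENTCORE_MEMORY_ID appear in this PR; every other member, value, and the helper function are hypothetical.

```python
import os
from typing import Mapping, Optional


class EnvVars:
    """Names of environment variables, kept in one place."""
    MEMORY_ID = "AGENTCORE_MEMORY_ID"
    MODEL_ID = "EXAMPLE_MODEL_ID"  # hypothetical


class Defaults:
    """Fallback values used when a variable is unset."""
    MODEL_ID = "example-model"  # hypothetical


class Prefixes:
    """String prefixes that would otherwise be scattered as literals."""
    SESSION = "session-"  # hypothetical


def get_model_id(environ: Optional[Mapping[str, str]] = None) -> str:
    """Look up a setting through the constants instead of inline strings."""
    source = os.environ if environ is None else environ
    return source.get(EnvVars.MODEL_ID, Defaults.MODEL_ID)
```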

* refactor: extract BaseAgent ABC and ChatAgent from MainAgent (#140)

* refactor: centralize env vars and magic strings into config/constants.py

Create agents/main_agent/config/constants.py with EnvVars, Defaults, and
Prefixes classes. Update all 13 modules to import from the centralized
constants instead of using inline os.getenv() with hardcoded strings.

This eliminates scattered magic strings and provides a single reference
for all configuration. Zero behavior change — all values are identical.

543/543 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: extract BaseAgent ABC and ChatAgent from MainAgent

Split MainAgent into a three-tier hierarchy:
- BaseAgent (ABC): shared init for model config, tools, session, streaming
- ChatAgent(BaseAgent): Strands Agent creation and text streaming
- MainAgent(ChatAgent): backward-compatible alias (pass-through)

All existing callers continue to import and use MainAgent unchanged.
The _build_filtered_tools() helper is extracted from _create_agent() for
reuse by future agent types (SkillAgent, VoiceAgent).

543/543 tests passing — zero behavior change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
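
The three-tier hierarchy can be pictured with the skeleton below. It is a sketch under assumptions: the constructor arguments and the stream() method are illustrative, and the real classes wrap the Strands SDK rather than yielding a formatted string.

```python
from abc import ABC, abstractmethod
from typing import Iterator, List, Optional


class BaseAgent(ABC):
    """Shared setup; cannot be instantiated directly."""

    def __init__(self, model_id: str, tools: Optional[List[str]] = None) -> None:
        self.model_id = model_id
        self.tools = list(tools or [])

    @abstractmethod
    def stream(self, prompt: str) -> Iterator[str]:
        """Yield response chunks for a prompt."""


class ChatAgent(BaseAgent):
    def stream(self, prompt: str) -> Iterator[str]:
        # Placeholder: a real agent would stream model output here.
        yield f"[{self.model_id}] {prompt}"


class MainAgent(ChatAgent):
    """Backward-compatible alias: adds nothing, so existing imports keep working."""
```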

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add agent type registry and create_agent() factory (#141)

Introduce agent_types.py with a pluggable registry pattern:
- create_agent(agent_type, **kwargs) → BaseAgent subclass
- register_agent_type(name, cls) for dynamic registration
- ChatAgent registered as "chat" by default

Future agent types (skill, voice) will register themselves here.
Existing code is unchanged — MainAgent still works as before.

552/552 tests passing (9 new factory tests).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
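
A minimal sketch of the registry pattern, keeping the two function signatures named in the commit message. The ChatAgent stub, the module-level dictionary, and the ValueError on unknown types are assumptions for illustration.

```python
from typing import Any, Dict


class ChatAgent:
    """Hypothetical stub standing in for the real ChatAgent."""

    def __init__(self, **kwargs: Any) -> None:
        self.options = kwargs


_AGENT_TYPES: Dict[str, type] = {}


def register_agent_type(name: str, cls: type) -> None:
    """Make an agent class available under a string key."""
    _AGENT_TYPES[name] = cls


def create_agent(agent_type: str, **kwargs: Any) -> Any:
    """Instantiate the class registered under agent_type."""
    try:
        cls = _AGENT_TYPES[agent_type]
    except KeyError:
        known = ", ".join(sorted(_AGENT_TYPES)) or "none"
        raise ValueError(f"unknown agent type {agent_type!r} (known: {known})") from None
    return cls(**kwargs)


register_agent_type("chat", ChatAgent)
```

New agent types only need a register_agent_type() call at import time; callers that go through create_agent() do not change.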

* feat: add progressive skill disclosure system with SkillAgent (#142)

Implement three-level skill architecture adapted from sample-strands-agent:
- Level 1: Lightweight skill catalog injected into system prompt
- Level 2: SKILL.md instructions loaded on-demand via skill_dispatcher
- Level 3: Tool execution via skill_executor

New modules:
- skills/skill_registry.py: Discovers SKILL.md files, binds tools, serves catalog
- skills/skill_tools.py: skill_dispatcher + skill_executor Strands @tool functions
- skills/decorators.py: @Skill() decorator and register_skill() for tool tagging
- skill_agent.py: SkillAgent(ChatAgent) with progressive disclosure override
- skills/definitions/web-search/SKILL.md: Example skill definition

SkillAgent registered as "skill" in agent_types factory.
Existing behavior completely unchanged — SkillAgent is additive only.

590/590 tests passing (38 new skill tests).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
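
The three disclosure levels can be sketched with a small in-memory registry. This is not the project's skill_registry.py: the Skill dataclass, the method names, and the example skill contents are hypothetical, and the real system discovers SKILL.md files on disk and exposes Strands tools.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Skill:
    name: str
    description: str   # one line, shown in the catalog (level 1)
    instructions: str  # full SKILL.md text, loaded on demand (level 2)
    tools: Dict[str, Callable[..., object]] = field(default_factory=dict)  # level 3


class SkillRegistry:
    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def catalog(self) -> str:
        """Level 1: short listing suitable for a system prompt."""
        return "\n".join(f"- {s.name}: {s.description}" for s in self._skills.values())

    def dispatch(self, skill_name: str) -> str:
        """Level 2: return the full instructions for one skill."""
        return self._skills[skill_name].instructions

    def execute(self, skill_name: str, tool_name: str, **kwargs: object) -> object:
        """Level 3: run one of the tools bound to a skill."""
        return self._skills[skill_name].tools[tool_name](**kwargs)
```

The point of the split is prompt size: only the one-line catalog is always present, and the longer instructions are paid for only when a skill is actually chosen.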

* feat: add VoiceAgent with BidiAgent for speech-to-speech interaction (#143)

Implement VoiceAgent(BaseAgent) for bidirectional voice using Nova Sonic 2:
- BidiNovaSonicModel with configurable voice, sample rate, and model
- Voice-text continuity via _load_text_history() from text session
- Separate agent_id ("voice") to prevent session state conflicts
- Voice-optimized system prompt with conversational guidelines
- PyAudio mock for server-side (browser uses Web Audio API)
- Conditional registration — only available with strands-agents[bidi]

Add voice-related constants to config/constants.py (EnvVars + Defaults).
Register "voice" type in agent_types factory.

606/606 tests passing (16 new voice tests).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add approval hooks for gating dangerous tool operations (#144)

Implement three approval hook categories following the sample-strands-agent
pattern, all using Strands BeforeToolCallEvent:

- EmailApprovalHook: Gates send_email, delete_emails, forward_email, etc.
- ExternalWriteApprovalHook: Gates create_pull_request, deploy, push_code, etc.
- DangerousToolApprovalHook: Gates delete_file, drop_table, execute_sql, etc.

Hooks set _approval_required/_approval_message on the tool_use dict for
the streaming layer to surface to the client for user confirmation.

All hooks registered in BaseAgent._create_hooks() — inherited by all
agent types (ChatAgent, SkillAgent, VoiceAgent).

618/618 tests passing (12 new approval hook tests).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
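
The flagging behavior described above can be sketched without the SDK. FakeBeforeToolCallEvent is a hypothetical stand-in for the Strands BeforeToolCallEvent, and the method name, constructor, and message wording are assumptions; only the two _approval_* keys and the example tool names come from the commit message.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet


@dataclass
class FakeBeforeToolCallEvent:
    """Hypothetical stand-in carrying the tool_use dict."""
    tool_use: Dict[str, Any]


class ApprovalHook:
    def __init__(self, label: str, gated_tools: FrozenSet[str]) -> None:
        self.label = label
        self.gated_tools = gated_tools

    def on_before_tool_call(self, event: FakeBeforeToolCallEvent) -> None:
        """Flag gated tools so a later layer can ask the user to confirm."""
        tool_name = event.tool_use.get("name", "")
        if tool_name in self.gated_tools:
            event.tool_use["_approval_required"] = True
            event.tool_use["_approval_message"] = (
                f"{self.label}: '{tool_name}' needs user confirmation"
            )


email_hook = ApprovalHook("Email", frozenset({"send_email", "forward_email"}))
```

The hook itself never blocks anything; it only annotates the call, which keeps the confirmation prompt a concern of the streaming layer and the client.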

* feat: add WebSocket voice route and bidi dependency for VoiceAgent (#145)

* feat: add bidi dependency, WebSocket voice route, and test client

Wire up the VoiceAgent for end-to-end testing:

- Add strands-agents[bidi] optional dependency group to pyproject.toml
- Fix BidiAgent/BidiNovaSonicModel import paths (strands.experimental.bidi)
- Create voice_routes.py with WebSocket endpoint at /voice/stream
  - JWT auth from query params (trusted decode, same as invocations)
  - Bidirectional protocol: audio/text input, agent event streaming
  - Debug endpoints: GET /voice/sessions, DELETE /voice/sessions/{id}
- Register voice router in inference API main.py
- Add test_voice_client.py script for manual WebSocket testing

632/632 tests passing (14 new voice route tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: handle CancelledError in VoiceAgent.stop() during teardown

The BidiAgent's Nova Sonic stream teardown can raise CancelledError
when pending AWS SDK futures are cancelled during shutdown. This is
expected behavior, not an error.

- VoiceAgent.stop(): catch CancelledError and Exception from BidiAgent
- voice_routes.py finally block: catch BaseException (CancelledError
  has subclassed BaseException since Python 3.8, so it escapes except Exception)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
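
The reason a plain except Exception is not enough here can be reproduced in a few lines. The function names are hypothetical; the only claim is about asyncio.CancelledError itself.

```python
import asyncio


async def stop_agent() -> None:
    """Hypothetical teardown step that is cancelled part-way through."""
    raise asyncio.CancelledError()


async def teardown() -> str:
    try:
        await stop_agent()
    except Exception:
        # Never reached: CancelledError is not a subclass of Exception.
        return "caught by except Exception"
    except asyncio.CancelledError:
        return "caught by except CancelledError"
    return "no error"


result = asyncio.run(teardown())
```

Code that swallows cancellation should do so deliberately and only at the outer edge of shutdown, since cancellation is normally meant to propagate.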

* fix: pass session_id and agent_id to list_messages in voice history

AgentCoreMemorySessionManager.list_messages() requires session_id and
agent_id positional args. Pass session_id=self.session_id and
agent_id="default" to read the text chat agent's history for
voice-text continuity. Use the SDK's limit param instead of
post-slicing.

Update tests to verify the correct call signature.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use BidiAgent.receive() for voice event streaming

BidiAgent uses receive() as its event source, not stream_async().
Audio/text input is sent via send_audio()/send_text() separately,
and receive() yields typed events (BidiAudioStreamEvent,
BidiTranscriptStreamEvent, etc.) asynchronously.

- VoiceAgent.stream_async(): iterate BidiAgent.receive(), yield
  event.as_dict() for JSON-serializable dicts
- voice_routes._send_to_client(): simplified to handle dicts directly
  since stream_async now yields dicts, not strings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Angular voice components for Nova Sonic bidirectional audio

Frontend voice support with three-layer architecture:

New services (frontend/ai.client/src/app/session/services/voice/):
- pcm-utils.ts: Pure PCM encoding/decoding (Float32↔Int16↔base64)
- AudioRecorderService: Mic capture via Web Audio API → 16kHz PCM chunks
- AudioPlayerService: Gapless base64 PCM playback with interruption support
- VoiceChatService: WebSocket orchestration + state machine
  (idle → connecting → listening → speaking)

Modified components:
- chat-input: Voice toggle button with animated state indicators
  (pulsing red = listening, bouncing green = speaking, spinner = connecting)
- chat-input template: Live transcript overlay during voice mode
- session.page.ts: Wire voice response completions to message list
- MessageMapService: addVoiceMessage() for finalized voice transcripts

TypeScript compiles cleanly (tsc --noEmit). Angular build requires
Node 20.19+ (current machine has 20.18.1).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
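
The Float32, Int16, and base64 conversions handled by pcm-utils.ts can be illustrated with a Python analogue. This is not a port of the TypeScript file: the 32767 scale factor, rounding mode, little-endian byte order, and function names are assumptions.

```python
import base64
import struct
from typing import List, Sequence


def float32_to_int16(samples: Sequence[float]) -> List[int]:
    """Clamp samples to [-1.0, 1.0] and scale them to signed 16-bit integers."""
    return [round(max(-1.0, min(1.0, s)) * 32767) for s in samples]


def int16_to_base64(samples: Sequence[int]) -> str:
    """Pack samples as little-endian int16 and base64-encode the bytes."""
    raw = struct.pack(f"<{len(samples)}h", *samples)
    return base64.b64encode(raw).decode("ascii")


def base64_to_int16(data: str) -> List[int]:
    """Inverse of int16_to_base64."""
    raw = base64.b64decode(data)
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))
```

Clamping before scaling matters because microphone samples can briefly exceed the nominal range, and an unclamped value would overflow the 16-bit pack.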

* fix: convert SessionMessage to dict for BidiAgent and fix TS2774

Backend: _load_text_history() now calls .to_dict() on SessionMessage
objects before passing to BidiAgent. Nova Sonic expects plain dicts
with {"role": "...", "content": [...]}, not SessionMessage objects.

Frontend: Fix TS2774 in AudioRecorderService — use typeof check
instead of truthiness check for getUserMedia function detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use to_message() instead of to_dict() for BidiAgent history

SessionMessage.to_dict() wraps the message in metadata:
  {"message": {"role": ..., "content": [...]}, "message_id": 0, ...}

SessionMessage.to_message() returns the plain message dict:
  {"role": "user", "content": [...]}

Nova Sonic's _get_message_history_events expects the plain format.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use BidiAgent.send() and receive() APIs correctly

BidiAgent has send(dict) and receive() — not send_audio()/send_text()
or stream_async(). Align VoiceAgent methods with the actual SDK:

- send_audio(): calls self._bidi_agent.send({"type": "bidi_audio_input", ...})
- send_text(): calls self._bidi_agent.send({"type": "bidi_text_input", ...})
- receive_events(): wraps self._bidi_agent.receive() with as_dict() conversion
- stream_async(): now a no-op stub (voice uses receive_events() instead)

Update voice_routes._send_to_client to call receive_events() not stream_async().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Implement feature X to enhance user experience and optimize performance

* feat: add voice overlay component for voice interactions

- Implemented VoiceOverlayComponent with HTML, CSS, and TypeScript files.
- Added styles for visualizer orb and status badges using Tailwind CSS.
- Integrated voice status management and session handling in the component.
- Enhanced voice chat service to support transcript entries and reveal logic.
- Updated session page to handle voice overlay closure and persist transcripts as messages.
- Introduced configuration constants for voice processing parameters.

* feat: enhance voice agent with real-time cost calculation and metadata handling

* fix: refine token usage handling and improve message processing in voice components

* fix: sanitize user-provided values in log statements to prevent log injection

Addresses CodeQL alert #567 (py/log-injection). All user-provided values
(session_id, user_id, msg_type, enabled_tools) are now passed through
_sanitize_log() which strips newline and carriage return characters before
being interpolated into log messages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: WebSocket voice streaming with AgentCore auth support (#155)

* feat: update WebSocket voice streaming endpoint for AgentCore compatibility

* fix: ensure config message is required for WebSocket voice stream authentication

* feat: WebSocket voice streaming with AgentCore auth and protocol config (#156)

* feat: update WebSocket voice streaming endpoint for AgentCore compatibility

* fix: ensure config message is required for WebSocket voice stream authentication

* feat: add protocol configuration for HTTP support in InferenceApiStack

* fix: include bidi dependency in uv sync commands for Inference API Dockerfile (#157)

* fix: improve AgentCore connection detection in voice stream handling (#158)

* fix: align voice WebSocket with reference architecture accept-first pattern (#159)

* fix: align voice WebSocket with reference architecture accept-first pattern

Rewrites voice_stream to match the sample-strands-agent-with-agentcore
reference architecture:

- Accept WebSocket immediately (AgentCore validates auth at proxy layer)
- Extract params via helper functions: custom header → query param → config message
- Config message always read to supplement missing params in cloud mode
- /voice/stream as main route, /ws as alias for AgentCore Runtime
- Frontend uses /voice/stream for local dev, /ws for AgentCore

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
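
The lookup order of custom header, then query param, then config message could be expressed as a single helper. The helper name and the x-custom- header naming are hypothetical; only the precedence order is taken from the commit message.

```python
from typing import Mapping, Optional


def resolve_param(
    name: str,
    headers: Mapping[str, str],
    query: Mapping[str, str],
    config: Mapping[str, str],
) -> Optional[str]:
    """Return the first non-empty value for name, checking sources in order."""
    header_key = f"x-custom-{name.replace('_', '-')}"  # hypothetical header naming
    for source, key in ((headers, header_key), (query, name), (config, name)):
        value = source.get(key)
        if value:
            return value
    return None
```

Because the config message is consulted last, it only fills in values that the header and query string left empty, which matches its described role of supplementing missing params in cloud mode.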

* fix: add missing try block in voice_stream causing IndentationError

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add Voice Mode to Key Features in README (#160)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: colinmxs <colinmxs@users.noreply.github.com>
Co-authored-by: Colin Smith <7762103+colinmxs@users.noreply.github.com>
Co-authored-by: Phil Merrell <philmerrell@boisestate.edu>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread backend/src/agents/main_agent/__init__.py Dismissed
Comment thread backend/src/agents/main_agent/agent_types.py Dismissed
Comment thread backend/src/agents/main_agent/agent_types.py Dismissed
Comment thread backend/src/agents/main_agent/base_agent.py Dismissed
Comment thread backend/src/agents/main_agent/base_agent.py Dismissed
Comment thread backend/src/apis/inference_api/chat/voice_routes.py Dismissed
Comment thread backend/src/apis/inference_api/chat/voice_routes.py Dismissed
Comment thread backend/src/apis/inference_api/chat/voice_routes.py Dismissed
Comment thread backend/src/apis/inference_api/chat/voice_routes.py Dismissed
Comment thread backend/src/apis/inference_api/chat/voice_routes.py Dismissed
Comment thread frontend/ai.client/e2e/settings/chat-preferences.user.spec.ts Dismissed
Comment thread frontend/ai.client/e2e/settings/chat-preferences.user.spec.ts Dismissed
Comment thread frontend/ai.client/e2e/settings/profile.user.spec.ts Fixed
Comment thread frontend/ai.client/e2e/settings/profile.user.spec.ts Fixed
Comment thread frontend/ai.client/e2e/settings/profile.user.spec.ts Fixed
Comment thread frontend/ai.client/e2e/settings/profile.user.spec.ts Fixed
@ofilson ofilson merged commit 29b9623 into develop Apr 21, 2026
42 checks passed
@ofilson ofilson deleted the e2e-testing-117 branch April 21, 2026 18:13
2 participants