improve webhook collection testing #1085
base: rc-v0.5.17
Conversation
Pull request overview
This PR adds end-to-end verification testing for webhook collection in GCP. The test tool now polls the expected output bucket to verify collected content appears after triggering a webhook, and the GCP test-all.sh script now includes webhook collector tests.
Changes:
- Added `--verify-collection` and `--scheduler-job` options to the test CLI for end-to-end webhook verification
- Implemented bucket polling and content verification logic for GCP webhook collectors
- Updated GCP log retrieval to support both Gen 1 and Gen 2 Cloud Functions
- Integrated webhook collector tests into GCP's `test-all.sh` script
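The bucket-polling flow described above can be sketched as follows. This is a minimal illustration, not the PR's code: the function name `pollForCollectedContent` and the injected `fetchContent` callback are assumptions, while the `sleep` helper mirrors the utility the PR exports from `utils.js`.

```javascript
// Promise-based delay, as exported from the test tool's utils.js
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll until fetchContent() returns a string containing the expected payload,
// or give up after maxAttempts tries spaced intervalMs apart.
async function pollForCollectedContent(fetchContent, expected,
    { maxAttempts = 12, intervalMs = 5000 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const content = await fetchContent();
    if (typeof content === 'string' && content.includes(expected)) {
      return { verified: true, attempts: attempt };
    }
    await sleep(intervalMs);
  }
  return { verified: false, attempts: maxAttempts };
}
```

In the real tool, `fetchContent` would read the newest object from the sanitized output bucket (GCS or S3) after the webhook POST succeeds.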
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tools/psoxy-test/package.json | Added @google-cloud/scheduler dependency for triggering batch jobs |
| tools/psoxy-test/lib/utils.js | Exported new sleep utility function for polling delays |
| tools/psoxy-test/lib/gcp.js | Added webhook collection verification with scheduler triggering and bucket polling |
| tools/psoxy-test/cli-call.js | Integrated verification logic into CLI command flow |
| infra/modules/gcp-webhook-collector/test_script.tftpl | Added verification flags to generated test script |
| infra/modules/gcp-webhook-collector/main.tf | Added SERVICE_URL secret access and test script parameters |
| infra/modules/gcp-host/main.tf | Added webhook collector tests to test-all.sh |
| docs/guides/psoxy-test-tool.md | Documented new webhook verification options |
Files not reviewed (1)
- tools/psoxy-test/package-lock.json: Language not supported
davidfq left a comment:
Nothing wrong spotted, only that I need more context... (point me to docs if I'm missing something obvious). Why does the GCP use case need the Cloud Scheduler job when AWS doesn't? For AWS you add permissions for the test caller to read from the bucket... what about the GCP use case?
jlorper left a comment:
Looks good to me; agree with @davidfq. Check the missing `await` that Cursor highlighted.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@davidfq context regarding Cloud Scheduler: in AWS we use SQS, which has methods to batch-trigger endpoints after a given batch size OR timeout, so we can rely on that to collect batches of webhooks --> SQS buffers them; then the Lambda is called again to write the whole batch as a file --> S3. There's no GCP equivalent: Cloud Tasks / Pub/Sub don't have similar features. Their push queues are 1:1 and can't batch like that, so we have to use pull, with no way to "trigger" a pull after a given batch size/timeout. We replicate the behavior with a Cloud Scheduler job that pulls every 5 minutes or so, and keeps pulling while a full batch of 100 is found, until the queue is consumed. Hence, for test purposes, we force the Cloud Scheduler job to run immediately rather than on its schedule.
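Forcing an immediate run, as the test tool does when `--scheduler-job` is passed, comes down to one call. In the real tool `client` would be a `CloudSchedulerClient` from the `@google-cloud/scheduler` package the PR adds; it is injected here so the sketch stays self-contained, and `runJob()` is the actual client method that executes a job now regardless of its cron schedule.

```javascript
// Trigger a Cloud Scheduler job immediately, ignoring its schedule.
// jobName format: projects/PROJECT_ID/locations/REGION/jobs/JOB_ID
async function triggerSchedulerJob(client, jobName) {
  const [job] = await client.runJob({ name: jobName });
  return job;
}
```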
Cursor Bugbot has reviewed your changes and found 3 potential issues.
```js
const isGcp = options.force?.toLowerCase() === 'gcp' || gcp.isValidURL(url);
const isAws = options.force?.toLowerCase() === 'aws' || (!isGcp && (url.hostname.endsWith('amazonaws.com') || url.hostname.endsWith('on.aws'))); // rough check or rely on fallback
```
Unused isAws variable is dead code
Low Severity
The isAws variable is computed at line 101 but never actually used. The conditional logic only checks if (isGcp) and falls through to else for the AWS case, making the isAws computation entirely redundant. This dead code could confuse future maintainers who might expect isAws to influence the branching logic.
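One way to resolve this finding is to branch on `isAws` explicitly and fail fast when neither platform matches, rather than treating "not GCP" as AWS. The sketch below is illustrative, not the PR's code: the GCP hostname checks stand in for the tool's `gcp.isValidURL()` helper.

```javascript
// Determine target platform from a parsed URL and an optional --force value.
function detectPlatform(url, force) {
  const forced = force?.toLowerCase();
  const isGcp = forced === 'gcp'
    || url.hostname.endsWith('cloudfunctions.net')
    || url.hostname.endsWith('run.app');
  const isAws = forced === 'aws'
    || (!isGcp && (url.hostname.endsWith('amazonaws.com')
                   || url.hostname.endsWith('on.aws')));
  if (isGcp) return 'gcp';
  if (isAws) return 'aws'; // isAws now actually drives the branching
  throw new Error(`Unable to infer platform from URL host: ${url.hostname}`);
}
```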
```hcl
collection_path    = "/"
scheduler_job_name = google_cloud_scheduler_job.trigger_batch_processing.id
bucket_name        = module.sanitized_webhook_output.bucket_name
output_path_prefix = var.output_path_prefix
```
Template parameter output_path_prefix passed but unused
Low Severity
The output_path_prefix variable is passed to the test_script.tftpl template but is never used within the template. The template only uses bucket_name, scheduler_job_name, and other variables, making this a dead parameter that adds unnecessary clutter to the template invocation.
```hcl
example_payload    = coalesce(var.example_payload, "{\"test\": \"data\"}")
example_identity   = var.example_identity
collection_path    = "/"
scheduler_job_name = google_cloud_scheduler_job.trigger_batch_processing.id
```
Missing IAM grant for Cloud Scheduler run permission
High Severity
The test script passes --scheduler-job to trigger the Cloud Scheduler job via triggerScheduler(), which calls client.runJob(). This requires cloudscheduler.jobs.run permission. However, the Terraform grants test principals only KMS signing and bucket read permissions—there's no IAM binding for Cloud Scheduler. Test principals will receive a permission denied error when the test attempts to trigger the scheduler.
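Until the Terraform grant is added, the failure mode this finding predicts can at least be surfaced clearly on the client side. A sketch, with assumptions: gRPC `PERMISSION_DENIED` is status code 7, and `roles/cloudscheduler.jobRunner` is the standard predefined role containing `cloudscheduler.jobs.run`; whether the module should grant it is the actual fix.

```javascript
const GRPC_PERMISSION_DENIED = 7; // gRPC status code for PERMISSION_DENIED

// Run a scheduler job, rethrowing permission errors with an actionable hint.
async function runJobWithHint(client, jobName) {
  try {
    const [job] = await client.runJob({ name: jobName });
    return job;
  } catch (err) {
    if (err.code === GRPC_PERMISSION_DENIED) {
      throw new Error(`Cannot trigger ${jobName}: caller lacks `
        + 'cloudscheduler.jobs.run; grant roles/cloudscheduler.jobRunner '
        + 'to the test principal');
    }
    throw err;
  }
}
```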
Note

Adds E2E verification for webhook collection in the test tool and integrates it into infra test scripts.

Features:
- `--verify-collection` and `--scheduler-job` options; after a successful POST, the tool polls GCS/S3 and compares output content to the request body (GCP triggers Cloud Scheduler when provided)
- Verification implemented in `lib/gcp.js` and `lib/aws.js`; shared helpers `compareContent` and `sleep` added to `utils.js` with tests

`test-all.sh` change implications:
- Webhook collector tests added to `test-all.sh` (GCP/AWS); output bucket info is passed into test templates, and the caller is granted S3 read on sanitized buckets (AWS); GCP grants the function access to the `SERVICE_URL` secret

Written by Cursor Bugbot for commit 205bd2b.
--verify-collectionand--scheduler-joboptions; after a successful POST, tool polls GCS/S3 and compares output content to the request body (GCP triggers Cloud Scheduler when provided)lib/gcp.jsandlib/aws.js; shared helperscompareContentandsleepadded toutils.jswith teststest-all.sh(GCP/AWS), pass output bucket info into test templates, and grant caller S3 read on sanitized buckets (AWS); GCP grants function access toSERVICE_URLsecretWritten by Cursor Bugbot for commit 205bd2b. This will update automatically on new commits. Configure here.