chore(ods-etl): FTRS-35881 FTRS-3588 Change concurrency on etl lambdas#1013
Pull request overview
Adjusts the ETL ODS Lambda/SQS processing configuration to increase throughput/parallelism and reduce batching latency, aiming to improve performance of the ODS ETL pipeline.
Changes:
- Increased transformer_lambda reserved concurrency.
- Tuned SQS event source mappings for transformer and consumer (smaller batch size, higher max concurrency, shorter batching window).
  timeout = var.transformer_lambda_connection_timeout
  memory_size = var.lambda_memory_size
- reserved_concurrent_executions = 5
+ reserved_concurrent_executions = 21
These scaling/performance-critical values are hard-coded (reserved concurrency = 21). In other stacks (e.g., infrastructure/stacks/data_migration/lambda.tf) queue tuning is driven via variables, which makes it safer to adjust per environment and avoids redeploys for simple tuning. Consider introducing ETL ODS variables for reserved concurrency (and potentially defaulting per var.environment) and wiring them through instead of a literal.
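A minimal sketch of what this suggestion could look like, assuming a new variable is added to the ETL ODS stack. The variable name `transformer_lambda_reserved_concurrency` is hypothetical and does not exist in this PR; the default simply mirrors the value in the diff.

```hcl
# Hypothetical variable: the name is an assumption, not part of this PR.
# The default mirrors the literal currently in the diff.
variable "transformer_lambda_reserved_concurrency" {
  description = "Reserved concurrent executions for the ETL ODS transformer Lambda"
  type        = number
  default     = 21
}

# At the existing call site, the literal would then become:
#   reserved_concurrent_executions = var.transformer_lambda_reserved_concurrency
```

An environment that needs a different limit could then override the value in its own tfvars file rather than editing the call site.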
  batch_size = 3
  enabled = true
  function_response_types = ["ReportBatchItemFailures"]

  scaling_config {
-   maximum_concurrency = 5
+   maximum_concurrency = 21
  }

- maximum_batching_window_in_seconds = 5
+ maximum_batching_window_in_seconds = 2
batch_size, maximum_concurrency, and maximum_batching_window_in_seconds are now hard-coded to new tuning values. To make operational tuning safer and consistent with other stacks, consider promoting these to variables (with documented defaults) so they can be adjusted per environment without editing this module call site.
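One possible shape for these variables, as a sketch only. All three variable names are hypothetical, and the defaults mirror the values in this diff. The validation range reflects the 2 to 1000 limit Lambda places on maximum concurrency for an SQS event source.

```hcl
# Hypothetical variables for the transformer SQS event source mapping.
# Names are assumptions; defaults mirror the values in this diff.
variable "transformer_sqs_batch_size" {
  description = "Maximum number of SQS messages per transformer invocation"
  type        = number
  default     = 3
}

variable "transformer_sqs_maximum_concurrency" {
  description = "Maximum concurrent transformer invocations driven by the queue"
  type        = number
  default     = 21

  validation {
    # Lambda accepts 2 to 1000 for an SQS event source mapping.
    condition     = var.transformer_sqs_maximum_concurrency >= 2 && var.transformer_sqs_maximum_concurrency <= 1000
    error_message = "maximum_concurrency must be between 2 and 1000."
  }
}

variable "transformer_sqs_batching_window_seconds" {
  description = "Seconds to wait while gathering a batch before invoking"
  type        = number
  default     = 2
}

# Call site, replacing the literals:
#   batch_size                         = var.transformer_sqs_batch_size
#   scaling_config { maximum_concurrency = var.transformer_sqs_maximum_concurrency }
#   maximum_batching_window_in_seconds = var.transformer_sqs_batching_window_seconds
```

Keeping the reserved concurrency of the function at or above the queue's maximum concurrency avoids throttled invocations, which is presumably why both are set to 21 here.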
  batch_size = 3
  enabled = true
  function_response_types = ["ReportBatchItemFailures"]

  scaling_config {
-   maximum_concurrency = 5
+   maximum_concurrency = 21
  }

- maximum_batching_window_in_seconds = 5
+ maximum_batching_window_in_seconds = 2
Same as above: these SQS event-source tuning parameters are hard-coded (batch size / max concurrency / batching window). If these values need to vary by environment or be tuned over time, it would be more maintainable to define variables for them and reference those here.
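If per-environment tuning is the goal, a single object-typed variable is one alternative to three separate ones. The sketch below assumes this hunk is the consumer mapping; the variable name `consumer_sqs_tuning`, its attribute names, and the override values are all hypothetical.

```hcl
# Hypothetical grouped variable for the consumer queue.
# Name and shape are assumptions; defaults mirror the values in this diff.
variable "consumer_sqs_tuning" {
  description = "SQS event source tuning for the ETL ODS consumer Lambda"
  type = object({
    batch_size              = number
    maximum_concurrency     = number
    batching_window_seconds = number
  })
  default = {
    batch_size              = 3
    maximum_concurrency     = 21
    batching_window_seconds = 2
  }
}

# Example override in a hypothetical per-environment tfvars file,
# here restoring the previous, more conservative settings:
#   consumer_sqs_tuning = {
#     batch_size              = 3
#     maximum_concurrency     = 5
#     batching_window_seconds = 5
#   }
```

Grouping the values keeps the three settings together in one override, at the cost of having to restate all attributes whenever one of them changes.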
  timeout = var.transformer_lambda_connection_timeout
  memory_size = var.lambda_memory_size
- reserved_concurrent_executions = 5
+ reserved_concurrent_executions = 21
The PR description/template is currently empty (no Description/Context) and the sensitive-information confirmation checkbox is not ticked. Please complete the PR template so reviewers can validate the performance issue being addressed and confirm the no-PII declaration.
Description
Context
Sensitive Information Declaration
To ensure the utmost confidentiality and protect your and others' privacy, we kindly ask you NOT to include PII (Personal Identifiable Information) / PID (Personal Identifiable Data) or any other sensitive data in this PR (Pull Request) or the codebase changes. We will remove any PR that contains sensitive information. We really appreciate your cooperation in this matter.