Merge development (with our current Datawarehouse code) into the AWS branch. #232
jgrantr wants to merge 72 commits into feature/aws-sdk-v3-again from development
…onnectors into redshift-load-optimization
…onnectors into redshift-load-optimization
…. It now checks for null and undefined
S3 entity table loading/unloading
Redshift load optimization
ES-2516 - reset deleted flag on update or insert
…fixes ES-2516 - don't write deletes to the CSV file/staging table
⛔ Snyk checks have failed. 31 issues have been found so far.
 * @param error {string}
 */
-function handleFailedValidation (ID, source, eventObj, error) {
+function handleFailedValidation(ID, source, eventObj, error) {
Bug: Incorrect Case-Sensitive Property Check Breaks Error Stream Initialization
Same incorrect property check as above. The code checks !errorStream.Writable but should check !errorStream.writable (lowercase w). This will cause the error stream initialization logic to be executed every time handleFailedValidation is called, potentially creating multiple pipelines for the same error stream.
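The difference between the two property names can be shown with a minimal Node.js sketch. The `errorStream` here is a hypothetical stand-in (a plain `PassThrough`), not the stream the connector actually builds; the point is only that stream instances expose a lowercase `writable` boolean and have no `Writable` property.

```javascript
const { PassThrough } = require("stream");

// Hypothetical stand-in for the error stream discussed in the comment above.
const errorStream = new PassThrough();

// `Writable` (capital W) is not a property of a stream instance, so it is
// undefined and the negated check is true on every call.
const buggyCheck = !errorStream.Writable;

// `writable` (lowercase w) is the real boolean property; it is true for a
// freshly created stream, so the negated check is false.
const correctCheck = !errorStream.writable;

console.log(buggyCheck);   // true: initialization would run every time
console.log(correctCheck); // false: initialization is skipped for a live stream
```

With the capitalized name, any guard of the form `if (!errorStream.Writable) { ...initialize... }` runs its body on every call, which matches the repeated-pipeline behavior the comment describes.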
feat: add source id to dim error event
Cursor Bugbot has reviewed your changes and found 3 potential issues.
    useSurrogateDateKeys: true,
}, columnConfig || {});

console.log(`has delete_fix in-place`);
Debug console.log left in production code
Medium Severity
A console.log(`has delete_fix in-place`) statement is left in the module initialization path of dwconnect.js. It prints to stdout every time the module is loaded, which is noisy in production and is clearly debug code that was not meant to be committed.
if (err) {
    return done(err);
} else {
    if (results[0].sortKey != null) {
Case mismatch prevents sortKey from being read in importFact
High Severity
In importFact, the check results[0].sortKey != null uses camelCase sortKey, but the SQL query returns lowercase column names (sortkey). This means results[0].sortKey is always undefined, so sortKey and sortKeyType are never assigned. This silently disables the sort key optimization for Redshift staging tables and the natural key filter logic in the hashed surrogate keys path, potentially causing full table scans and incorrect SORTKEY clauses.
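A small sketch of the mismatch, assuming the row shape the comment describes. The `results` array below is a hand-built stand-in for a driver result, and the `sortkeytype` column name and the two helper functions are hypothetical; only the lowercase `sortkey` key comes from the review comment.

```javascript
// Hypothetical result row: unquoted column aliases come back lowercase.
const results = [{ sortkey: "order_id", sortkeytype: "bigint" }];

// Mirrors the reported check: reads a camelCase key that the row never has.
function readSortKeyBuggy(row) {
  const out = {};
  if (row.sortKey != null) {
    out.sortKey = row.sortKey;
    out.sortKeyType = row.sortKeyType;
  }
  return out;
}

// Reads the lowercase keys that are actually present on the row.
function readSortKeyFixed(row) {
  const out = {};
  if (row.sortkey != null) {
    out.sortKey = row.sortkey;
    out.sortKeyType = row.sortkeytype;
  }
  return out;
}

console.log(readSortKeyBuggy(results[0])); // {}
console.log(readSortKeyFixed(results[0])); // { sortKey: 'order_id', sortKeyType: 'bigint' }
```

An alternative fix, if camelCase keys are preferred in the JavaScript code, would be to quote the alias in the SQL (for example `AS "sortKey"`), since quoted identifiers keep their case in PostgreSQL; whether that holds for the Redshift cluster in use depends on its case-sensitivity settings and should be verified.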
if (err) {
    return done(err);
} else {
    if (results[0].sortKey != null) {
Case mismatch prevents sortKey from being read in linkDimensions
High Severity
In linkDimensions, the check results[0].sortKey != null has the same camelCase typo as importFact. The SQL query returns sortkey (lowercase), so sortKey and sortKeyType are never assigned. This disables the natural key filter optimization for the dimension linking queries on Redshift, causing unnecessary full table scans during UPDATE operations.


Note
High Risk
High risk because it changes core data-warehouse ingest semantics (delete/merge behavior, dimension/fact upsert strategy, and Redshift load paths) and introduces new configuration-dependent branches that affect data correctness and performance.
Overview
Data warehouse ingest behavior is updated. common/datawarehouse/combine.js now treats either the current or previous record being marked __leo_delete__ as overriding merges, changing how delete-vs-update sequences collapse.
Failed validation routing is changed. common/datawarehouse/load.js updates the error pipeline to emit to ${source}_error via ls.toLeo() and adds payload.source_id to preserve the original event id.
Postgres connector gains Redshift-oriented load paths and hashed key options. postgres/lib/connect.js adds streamToTableFromS3() to stage CSVs in S3 and then COPY into tables (optionally deleting the S3 object), and postgres/lib/dwconnect.js adds config-driven branches for Redshift staging (DISTSTYLE/SORTKEY), adjustable delete flushing, safer temp-table cleanup, optional bypass of SCD processing, and hashedSurrogateKeys support (including bigint surrogate/dimension columns and fingerprint-based dimension linking).
Packaging/release updates. Versions are bumped across common, entity-table, and postgres; Postgres now pins leo-connector-common@4.0.13-rc and adds the @dsco/layer-leo devDependency; and a new Publish-Postgres GitHub Actions workflow publishes postgres to GitHub Packages on release publish.
Written by Cursor Bugbot for commit 4f13d68.