Skip to content

exasol/dbt-exasol

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

dbt-exasol

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Please see the dbt documentation on Exasol setup for more information on how to start using the Exasol adapter.

Version Compatibility

dbt-exasol dbt-core Python Exasol
1.10.x 1.10.x 3.9-3.12 7.x, 8.x
1.8.x 1.8.x 3.9-3.12 7.x, 8.x
1.7.x 1.7.x 3.8-3.11 7.x, 8.x

Current profile.yml settings

dbt-exasol:
  target: dev
  outputs:
    dev:
      type: exasol
      threads: 1
      dsn: HOST:PORT
      user: USERNAME
      password: PASSWORD
      dbname: db
      schema: SCHEMA

Optional login credentials using OpenID for Exasol SaaS

OpenID login through access_token or refresh_token instead of user+password

Optional parameters

  • connection_timeout: defaults to pyexasol default
  • socket_timeout: defaults to pyexasol default
  • query_timeout: defaults to pyexasol default
  • compression: default: False
  • encryption: default: True
  • validate_server_certificate: default: True (requires valid SSL certificate when encryption=True)
  • protocol_version: default: v3
  • row_separator: default: CRLF for windows - LF otherwise
  • timestamp_format: default: YYYY-MM-DDTHH:MI:SS.FF6

Known isues

>=1.8.1 additional parameters

As of dbt-exasol 1.8.1 it is possible to add new model config parameters for models materialized as table or incremental.

  • partition_by_config
  • distribute_by_config
  • primary_key_config
  • Example table materialization config
{{
    config(
        materialized='table',
        primary_key_config=['<column>','<column2>'],
        partition_by_config='<column>',
        distribute_by_config='<column>'
    )
}}

NOTE In case more than one column is used, put them in a list.


>=1.8 license change

As of dbt-exasol version 1.8 we have decided to switch to Apache License from GPLv3 - to be equal to dbt-core licensing.

setuptools breaking change

Due to a breaking change in setuptools and a infected dependency from dbt-core, we need to use the following workaround for poetry install.

Using encryption in Exasol 7 vs. 8

Starting from Exasol 8, encryption is enforced by default. If you are still using Exasol 7 and have trouble connecting, you can disable encryption in profiles.yaml (see optional parameters).

SSL/TLS Certificate Validation

By default, dbt-exasol validates SSL/TLS certificates when encryption=True (which is the default). This provides secure connections and suppresses PyExasol warnings about certificate validation behavior.

Default behavior (recommended for production):

outputs:
  prod:
    type: exasol
    encryption: true  # default
    validate_server_certificate: true  # default
    # ... other settings

For development/testing with self-signed certificates:

outputs:
  dev:
    type: exasol
    encryption: true
    validate_server_certificate: false  # Skip cert validation (not recommended for production)
    # ... other settings

Alternative for self-signed certificates: Use the nocertcheck fingerprint in the DSN:

outputs:
  dev:
    type: exasol
    dsn: myhost/nocertcheck:8563
    # ... other settings

For more information about SSL configuration, see the PyExasol security documentation.

Materialized View & Clone operations

In Exasol materialized views and clone operations are not suported. Default behaviour from dbt-core will fail accordingly.

Null handling in test_utils null safe handling

In Exasol empty string are NULL. Due to this behaviour and as of this pull request 7776 published in dbt-core 1.6, seeds in tests that use EMPTY literal to simulate empty string have to be handled with special behaviour in exasol. See fixture for csv in exasolseedsdata_hash_csv for tests/functional/adapter/utils/test_utils.py::TestHashExasol.

Model contracts

The following database constraints are implemented for Exasol:

Constraint Type Status
check NOT supported
not null enforced
unique NOT supported
primary key enforced
foreign key enforced

>=1.5 Incremental model update

Fallback to dbt-core implementation and supporting strategies:

  • append - Insert new rows
  • merge - Update existing rows, insert new rows
  • delete+insert - Delete matching rows, insert all rows
  • microbatch (new in 1.10) - Process data in time-based batches

Microbatch Strategy

The microbatch strategy processes data in time-based batches, enabling:

  • Efficient processing of large datasets
  • Support for late-arriving data via lookback
  • Sample mode (--sample) for development

Example configuration:

{{ config(
    materialized='incremental',
    incremental_strategy='microbatch',
    event_time='created_at',
    begin='2024-01-01',
    batch_size='day',
    lookback=2
) }}
select * from {{ ref('source_table') }}

Configuration options:

Option Required Description
event_time Yes Column used for time-based filtering
begin Yes Start date for initial backfill (YYYY-MM-DD)
batch_size Yes Size of each batch: hour, day, month, year
lookback No Number of previous batches to reprocess

See dbt Microbatch Documentation for more details.

Sample Mode

Sample mode (--sample flag) runs dbt in "small-data" mode, building only the N most recent time-based slices of microbatch models. This is useful for:

  • Development and testing with representative data
  • Quick iteration without processing full history

Example usage:

# Process only 2 most recent days
dbt run --sample="2 days"

# Process most recent week
dbt run --sample="1 week"

Requirements:

  • Models using incremental_strategy='microbatch'
  • dbt-core 1.10 or later

See Sample Mode Documentation for more details.

Microbatch/Sample Mode Notes (Exasol-specific)

Timestamp Format: Exasol requires timestamps without timezone suffix in model definitions:

-- Correct (Exasol compatible)
TIMESTAMP '2024-01-01 10:00:00'

-- Incorrect (will cause parse errors)
TIMESTAMP '2024-01-01 10:00:00-0'

The dbt-exasol adapter automatically handles timestamp formatting for microbatch boundaries.

Batch Processing:

  • Microbatch uses DELETE + INSERT pattern for batch replacement
  • Each batch window is processed as a separate transaction
  • For large datasets, consider batch_size='day' over batch_size='hour'

>=1.3 Python model not yet supported - WIP

Breaking changes with release 1.2.2

  • Timestamp format defaults to YYYY-MM-DDTHH:MI:SS.FF6

SQL functions compatibility

split_part

There is no equivalent SQL function in Exasol for split_part.

listagg part_num

The SQL function listagg in Exasol does not support the num_part parameter.

Utilities shim package

In order to support packages like dbt-utils and dbt-audit-helper, we needed to create the shim package exasol-utils. In this shim package we need to adapt to parts of the SQL functionality that is not compatible with Exasol - e.g. when 'final' is being used which is a keyword in Exasol. Please visit Adaopter dispatch documentation of dbt-labs for more information.

Reporting bugs and contributing code

  • Please report bugs using the issues

Releases

GitHub Releases

About

Data Build Tool adapter for Exasol

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 12