From a2d054fabd6b0d87186f1bdd2d53fa8999664ca3 Mon Sep 17 00:00:00 2001
From: Andreas Motl
Date: Fri, 15 Aug 2025 00:15:32 +0200
Subject: [PATCH 1/2] Integrate/PostgreSQL: Add section with starter tutorial

The goal is to present concise walkthroughs without many bells and
whistles, which get to the point of getting you started quickly. Use
the canonical template to present another data nozzle based on the
CTK Ingestr I/O subsystem.
---
 docs/ingest/etl/index.md              |  5 ++
 docs/integrate/index.md               |  1 +
 docs/integrate/postgresql/index.md    | 45 +++++++++++++
 docs/integrate/postgresql/tutorial.md | 95 +++++++++++++++++++++++++++
 4 files changed, 146 insertions(+)
 create mode 100644 docs/integrate/postgresql/index.md
 create mode 100644 docs/integrate/postgresql/tutorial.md

diff --git a/docs/ingest/etl/index.md b/docs/ingest/etl/index.md
index 8d5b8a83..268544fc 100644
--- a/docs/ingest/etl/index.md
+++ b/docs/ingest/etl/index.md
@@ -121,6 +121,10 @@ Use visual data flow and integration frameworks and platforms.
   MySQL and MariaDB are well-known free and open-source relational database
   management systems (RDBMS), available as standalone and managed variants.
 
+- {ref}`postgresql`
+
+  PostgreSQL is the world's most advanced open source relational database.
+
 - {ref}`sql-server`
 
   Microsoft SQL Server Integration Services (SSIS) is a component of the Microsoft SQL
@@ -237,6 +241,7 @@ Load data from datasets and open table formats.
 - {ref}`n8n`
 - {ref}`nifi`
 - {ref}`node-red`
+- {ref}`postgresql`
 - {ref}`risingwave`
 - {ref}`sql-server`
 - {ref}`streamsets`
diff --git a/docs/integrate/index.md b/docs/integrate/index.md
index de852fc0..e15d441e 100644
--- a/docs/integrate/index.md
+++ b/docs/integrate/index.md
@@ -52,6 +52,7 @@ n8n/index
 nifi/index
 node-red/index
 plotly/index
+postgresql/index
 Power BI
 prometheus/index
 pyviz/index
diff --git a/docs/integrate/postgresql/index.md b/docs/integrate/postgresql/index.md
new file mode 100644
index 00000000..4496bbb5
--- /dev/null
+++ b/docs/integrate/postgresql/index.md
@@ -0,0 +1,45 @@
+(postgresql)=
+# PostgreSQL
+
+```{div} .float-right
+[![postgresql-logo](https://www.postgresql.org/media/img/about/press/elephant.png){height=60px loading=lazy}][PostgreSQL]
+```
+```{div} .clearfix
+```
+
+:::{rubric} About
+:::
+
+[PostgreSQL] is the world's most advanced open source relational database.
+
+:::{rubric} Synopsis
+:::
+
+```shell
+uvx 'cratedb-toolkit[io-ingestr]' load table \
+  "postgresql://postgres:postgres@localhost:5432/test?table=public.demo" \
+  --cluster-url="crate://crate:crate@localhost:4200/doc/postgresql_demo"
+```
+
+:::{rubric} Learn
+:::
+
+::::{grid}
+
+:::{grid-item-card} Tutorial: Use CrateDB Toolkit
+:link: postgresql-tutorial
+:link-type: ref
+Load data from PostgreSQL into CrateDB using CrateDB Toolkit.
+:::
+
+::::
+
+
+:::{toctree}
+:maxdepth: 1
+:hidden:
+Tutorial
+:::
+
+
+[PostgreSQL]: https://www.postgresql.org/
diff --git a/docs/integrate/postgresql/tutorial.md b/docs/integrate/postgresql/tutorial.md
new file mode 100644
index 00000000..c7ce2c64
--- /dev/null
+++ b/docs/integrate/postgresql/tutorial.md
@@ -0,0 +1,95 @@
+(postgresql-tutorial)=
+# Load data from PostgreSQL into CrateDB
+
+The tutorial will walk you through starting [PostgreSQL] and CrateDB,
+inserting a record into PostgreSQL, loading data into a CrateDB table,
+and validating that the data has been stored successfully.
+The data transfer is supported by the
+{ref}`CrateDB Toolkit Ingestr I/O ` data pipeline elements.
+
+## Prerequisites
+
+Docker is used for running all components. This approach works consistently
+across Linux, macOS, and Windows. Alternatively, you can use Podman.
+
+Create a shared network.
+```shell
+docker network create cratedb-demo
+```
+
+Start CrateDB.
+```shell
+docker run --rm --name=cratedb --network=cratedb-demo \
+  --publish=4200:4200 --publish=5432:5432 --env=CRATE_HEAP_SIZE=2g \
+  docker.io/crate -Cdiscovery.type=single-node
+```
+
+Start PostgreSQL.
+```shell
+docker run --rm --name=postgresql --network=cratedb-demo \
+  --publish=6432:5432 --env "POSTGRES_HOST_AUTH_METHOD=trust" \
+  docker.io/postgres postgres -c log_statement=all
+```
+:::{note}
+Because CrateDB is configured to listen on port `5432` with its PostgreSQL
+interface, let's use a different port for PostgreSQL itself.
+:::
+
+Prepare shortcuts for the CrateDB shell, CrateDB Toolkit, and the PostgreSQL client
+programs.
+
+::::{tab-set}
+
+:::{tab-item} Linux and macOS
+To make the settings persistent, add them to your shell profile (`~/.profile`).
+```shell
+alias crash="docker run --rm -it --network=cratedb-demo ghcr.io/crate/cratedb-toolkit crash"
+alias ctk-ingest="docker run --rm -i --network=cratedb-demo ghcr.io/crate/cratedb-toolkit-ingest ctk"
+alias psql="docker run --rm -i --network=cratedb-demo docker.io/postgres psql"
+```
+:::
+:::{tab-item} Windows PowerShell
+To make the settings persistent, add them to your PowerShell profile (`$PROFILE`).
+```powershell
+function crash { docker run --rm -it --network=cratedb-demo ghcr.io/crate/cratedb-toolkit crash @args }
+function ctk-ingest { docker run --rm -i --network=cratedb-demo ghcr.io/crate/cratedb-toolkit-ingest ctk @args }
+function psql { docker run --rm -i --network=cratedb-demo docker.io/postgres psql @args }
+```
+:::
+:::{tab-item} Windows Command
+```shell
+doskey crash=docker run --rm -it --network=cratedb-demo ghcr.io/crate/cratedb-toolkit crash $*
+doskey ctk-ingest=docker run --rm -i --network=cratedb-demo ghcr.io/crate/cratedb-toolkit-ingest ctk $*
+doskey psql=docker run --rm -i --network=cratedb-demo docker.io/postgres psql $*
+```
+:::
+
+::::
+
+## Usage
+
+Write a few sample records to PostgreSQL.
+```shell
+psql "postgresql://postgres:postgres@postgresql:5432/" <
Date: Sun, 17 Aug 2025 00:26:50 +0200
Subject: [PATCH 2/2] Integrate/PostgreSQL: Implement suggestions by CodeRabbit

---
 docs/integrate/postgresql/tutorial.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/docs/integrate/postgresql/tutorial.md b/docs/integrate/postgresql/tutorial.md
index c7ce2c64..501a0195 100644
--- a/docs/integrate/postgresql/tutorial.md
+++ b/docs/integrate/postgresql/tutorial.md
@@ -20,7 +20,7 @@ docker network create cratedb-demo
 Start CrateDB.
 ```shell
 docker run --rm --name=cratedb --network=cratedb-demo \
-  --publish=4200:4200 --publish=5432:5432 --env=CRATE_HEAP_SIZE=2g \
+  --publish=4200:4200 --publish=5432:5432 \
   docker.io/crate -Cdiscovery.type=single-node
 ```
 
@@ -34,6 +34,10 @@ docker run --rm --name=postgresql --network=cratedb-demo \
 Because CrateDB is configured to listen on port `5432` with its PostgreSQL
 interface, let's use a different port for PostgreSQL itself.
 :::
+:::{note}
+Using `POSTGRES_HOST_AUTH_METHOD=trust` disables password checks.
+Use it for local demos only.
+:::
 
 Prepare shortcuts for the CrateDB shell, CrateDB Toolkit, and the PostgreSQL client
 programs.