Commit 3466c5d

Integrate/dlt+ingestr: Implement suggestions by CodeRabbit
1 parent 318b4af commit 3466c5d

3 files changed

+36
-22
lines changed

docs/ingest/etl/index.md

Lines changed: 2 additions & 0 deletions
@@ -239,13 +239,15 @@ Load data from datasets and open table formats.
 - {ref}`aws-lambda`
 - {ref}`azure-functions`
 - {ref}`dbt`
+- {ref}`dlt`
 - {ref}`dms`
 - {ref}`dynamodb`
 - {ref}`estuary`
 - {ref}`flink`
 - {ref}`hop`
 - {ref}`iceberg`
 - {ref}`influxdb`
+- {ref}`ingestr`
 - {ref}`kafka`
 - {ref}`kestra`
 - {ref}`kinesis`

docs/integrate/dlt/index.md

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@
 ```{div} .clearfix
 ```

-[dlt] (data load tool)--think ELT as Python code--is the most popular
+[dlt] (data load tool)--think ELT as Python code--is a popular,
 production-ready Python library for moving data. It loads data from
 various and often messy data sources into well-structured, live datasets.
 dlt is used by {ref}`ingestr`.
@@ -21,7 +21,7 @@ dlt is used by {ref}`ingestr`.
 - **Just code**: no need to use any backends or containers.

 - **Platform agnostic**: Does not replace your data platform, deployments, or security
-models. Simply import dlt in your favorite AI code editor, or add it to your Jupyter
+models. Simply import dlt in your favorite code editor, or add it to your Jupyter
 Notebook.

 - **Versatile**: You can load data from any source that produces Python data structures,
@@ -33,6 +33,12 @@ dlt is used by {ref}`ingestr`.

 ## Synopsis

+Prerequisites:
+Install dlt and the CrateDB destination adapter:
+```shell
+pip install dlt dlt-cratedb
+```
+
 Load data from cloud storage or files into CrateDB.
 ```python
 import dlt

docs/integrate/ingestr/index.md

Lines changed: 26 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -8,9 +8,9 @@
 ```{div} .clearfix
 ```

-[ingestr] is a command-line application that allows copying data from any
-source into any destination database. It supports CrateDB on the source
-and the destination side. ingestr uses {ref}`dlt`.
+[ingestr] is a command-line application for copying data from any source
+to any destination database. It supports CrateDB on both the source and
+destination sides. ingestr builds on {ref}`dlt`.

 ::::{grid}

@@ -53,7 +53,7 @@ ingestr ingest \
 ```

 :::{note}
-Please note there a subtle differences in the CrateDB source vs. target URL.
+Please note there are subtle differences between the CrateDB source and target URLs.
 While `--source-uri=crate://...` addresses CrateDB's SQLAlchemy dialect,
 `--dest-uri=cratedb://...` is effectively a PostgreSQL connection URL
 with a protocol schema designating CrateDB. The source adapter uses
@@ -64,33 +64,40 @@ PostgreSQL interface.

 ## Coverage

-ingestr supports migration from 20-plus databases, data platforms, analytics
+ingestr supports migration from 20-plus databases, data platforms, and analytics
 engines, including all [databases supported by SQLAlchemy].

-:::{rubric} Databases
+:::{rubric} Traditional Databases
 :::
-Actian Data Platform, Vector, Actian X, Ingres, Amazon Athena, Amazon Redshift,
-Amazon S3, Apache Drill, Apache Druid, Apache Hive and Presto, Apache Solr,
-Clickhouse, CockroachDB, CrateDB, Databend, Databricks, Denodo, DuckDB, EXASOL DB,
-Elasticsearch, Firebird, Firebolt, Google BigQuery, Google Sheets, Greenplum,
-HyperSQL (hsqldb), IBM DB2 and Informix, IBM Netezza Performance Server, Impala, InfluxDB,
-Kinetica, Microsoft Access, Microsoft SQL Server, MonetDB, MongoDB, MySQL and MariaDB,
-OpenGauss, OpenSearch, Oracle, PostgreSQL, Rockset, SAP ASE, SAP HANA,
-SAP Sybase SQL Anywhere, Snowflake, SQLite, Teradata Vantage, TiDB, YDB, YugabyteDB.
-
-:::{rubric} Brokers
+CockroachDB, CrateDB, Firebird, HyperSQL (hsqldb), IBM DB2 and Informix,
+Microsoft Access, Microsoft SQL Server, MonetDB, MySQL and MariaDB,
+OpenGauss, Oracle, PostgreSQL, SAP ASE, SAP HANA, SAP Sybase SQL Anywhere,
+SQLite, TiDB, YDB, YugabyteDB
+
+:::{rubric} Cloud Data Warehouses & Analytics
+:::
+Amazon Athena, Amazon Redshift, Databend, Databricks, Denodo, DuckDB,
+EXASOL DB, Firebolt, Google BigQuery, Greenplum, IBM Netezza Performance Server,
+Impala, Kinetica, Rockset, Snowflake, Teradata Vantage
+
+:::{rubric} Specialized Data Stores
+:::
+Apache Drill, Apache Druid, Apache Hive and Presto, Clickhouse, Elasticsearch,
+InfluxDB, MongoDB, OpenSearch
+
+:::{rubric} Message Brokers
 :::
 Amazon Kinesis, Apache Kafka (Amazon MSK, Confluent Kafka, Redpanda, RobustMQ)

-:::{rubric} File formats
+:::{rubric} File Formats
 :::
 CSV, JSONL/NDJSON, Parquet

-:::{rubric} Object stores
+:::{rubric} Object Stores
 :::
 Amazon S3, Google Cloud Storage

-:::{rubric} Services
+:::{rubric} SaaS Platforms & Services
 :::
 Airtable, Asana, GitHub, Google Ads, Google Analytics, Google Sheets, HubSpot,
 Notion, Personio, Salesforce, Slack, Stripe, Zendesk, etc.
@@ -125,4 +132,3 @@ load data from Kafka to CrateDB.

 [databases supported by SQLAlchemy]: https://docs.sqlalchemy.org/en/20/dialects/
 [ingestr]: https://bruin-data.github.io/ingestr/
-[sources supported by ingestr]: https://bruin-data.github.io/ingestr/supported-sources/
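
Illustration for the `:::{note}` in docs/integrate/ingestr/index.md: the note distinguishes `crate://` source URIs (SQLAlchemy dialect) from `cratedb://` destination URIs (PostgreSQL connection URL). The Python sketch below is not part of the commit. It only parses such URIs and assembles an `ingestr ingest` argument list without running it. The helper names, the table names, and the default ports (4200 for CrateDB's HTTP endpoint, 5432 for its PostgreSQL interface) are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Assumption for illustration: the SQLAlchemy dialect (crate://) talks to
# CrateDB's HTTP endpoint (default port 4200), while cratedb:// is treated
# as a PostgreSQL connection URL (default port 5432).
INTERFACES = {
    "crate": ("SQLAlchemy dialect", 4200),
    "cratedb": ("PostgreSQL connection URL", 5432),
}


def describe_uri(uri):
    """Hypothetical helper: report which interface a CrateDB URI addresses."""
    parts = urlsplit(uri)
    if parts.scheme not in INTERFACES:
        raise ValueError(f"not a CrateDB URI scheme: {parts.scheme!r}")
    interface, default_port = INTERFACES[parts.scheme]
    return {
        "scheme": parts.scheme,
        "interface": interface,
        "host": parts.hostname,
        "port": parts.port or default_port,
    }


def build_ingest_command(source_uri, source_table, dest_uri, dest_table):
    """Hypothetical helper: build the argument list for `ingestr ingest`."""
    return [
        "ingestr",
        "ingest",
        f"--source-uri={source_uri}",
        f"--source-table={source_table}",
        f"--dest-uri={dest_uri}",
        f"--dest-table={dest_table}",
    ]


if __name__ == "__main__":
    # Table names and credentials are placeholders, not taken from the commit.
    source = "crate://crate@localhost:4200/"
    dest = "cratedb://crate@localhost/"
    print(describe_uri(source))
    print(describe_uri(dest))
    print(" ".join(build_ingest_command(source, "doc.demo_in", dest, "doc.demo_out")))
```

The argument list can be handed to `subprocess.run` once ingestr is installed; building it as a list avoids shell quoting issues with credentials embedded in the URIs.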
