Merged
63 changes: 2 additions & 61 deletions docs/connect/df/index.md
@@ -86,75 +86,17 @@ and operations for manipulating numerical tables and time series.
- [From data storage to data analysis: Tutorial on CrateDB and pandas]



(polars)=
## Polars

:::{rubric} About
:::

```{div}
:style: "float: right; margin-left: 0.5em"
[![](https://github.com/pola-rs/polars-static/raw/master/logos/polars-logo-dark.svg){w=180px}](https://pola.rs/)
```

[Polars] is a blazingly fast DataFrames library with language bindings for
Rust, Python, Node.js, R, and SQL. Polars is powered by a multithreaded,
vectorized query engine, it is open source, and written in Rust.

- **Fast:** Written from scratch in Rust and with performance in mind,
designed close to the machine, and without external dependencies.

- **I/O:** First class support for all common data storage layers: local,
cloud storage & databases.

- **Intuitive API:** Write your queries the way they were intended. Polars,
internally, will determine the most efficient way to execute using its query
optimizer. Polars' expressions are intuitive and empower you to write
readable and performant code at the same time.

- **Out of Core:** The streaming API allows you to process your results without
requiring all your data to be in memory at the same time.

- **Parallel:** Polars' multi-threaded query engine utilises the power of your
machine by dividing the workload among the available CPU cores without any
additional configuration.

- **Vectorized Query Engine:** Uses [Apache Arrow], a columnar data format, to
process your queries in a vectorized manner and SIMD to optimize CPU usage.
This enables cache-coherent algorithms and high performance on modern processors.

- **Open Source:** Polars is and always will be open source. Driven by an active
community of developers. Everyone is encouraged to add new features and contribute.
It is free to use under the MIT license.

:::{rubric} Data formats
:::

Polars supports reading and writing to many common data formats.
This allows you to easily integrate Polars into your existing data stack.

- Text: CSV & JSON
- Binary: Parquet, Delta Lake, AVRO & Excel
- IPC: Feather, Arrow
- Databases: MySQL, Postgres, SQL Server, Sqlite, Redshift & Oracle
- Cloud Storage: S3, Azure Blob & Azure File

```{div}
:style: "clear: both"
```

:::{rubric} Learn
:::{seealso}
Please navigate to the dedicated page about {ref}`polars`.
:::
- [Polars code examples]


[Apache Arrow]: https://arrow.apache.org/
[Dask]: https://www.dask.org/
[Dask DataFrames]: https://docs.dask.org/en/latest/dataframe.html
[Dask Futures]: https://docs.dask.org/en/latest/futures.html
[pandas]: https://pandas.pydata.org/
[Polars]: https://pola.rs/

[Dask code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/dask
[Efficient batch/bulk INSERT operations with pandas, Dask, and SQLAlchemy]: https://cratedb.com/docs/python/en/latest/by-example/sqlalchemy/dataframe.html
@@ -164,4 +106,3 @@ This allows you to easily integrate Polars into your existing data stack.
[Import weather data using Dask]: https://github.com/crate/cratedb-examples/blob/main/topic/timeseries/dask-weather-data-import.ipynb
[Importing Parquet files into CrateDB using Apache Arrow and SQLAlchemy]: https://community.cratedb.com/t/importing-parquet-files-into-cratedb-using-apache-arrow-and-sqlalchemy/1161
[pandas code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/pandas
[Polars code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/polars
1 change: 1 addition & 0 deletions docs/integrate/index.md
@@ -55,6 +55,7 @@ nifi/index
node-red/index
oracle/index
plotly/index
polars/index
postgresql/index
Power BI <powerbi/index>
prometheus/index
63 changes: 63 additions & 0 deletions docs/integrate/polars/index.md
@@ -0,0 +1,63 @@
(polars)=
# Polars

```{div}
:style: "float: right; margin-left: 0.5em"
[![Polars logo](https://github.com/pola-rs/polars-static/raw/master/logos/polars-logo-dark.svg){w=180px}][Polars]
```
```{div} .clearfix
```

:::{rubric} About
:::

[Polars] is a high‑performance DataFrames library with interfaces for
Rust, Python, Node.js, and R, plus a SQL context. It is powered by a
multithreaded, vectorized query engine and written in Rust.

- **Fast:** Written from scratch in Rust and with performance in mind,
designed close to the machine, and without external dependencies.

- **I/O:** First class support for all common data storage layers: local,
cloud storage & databases.

- **Intuitive API:** Write your queries the way they were intended. Polars,
internally, will determine the most efficient way to execute using its query
optimizer. Polars' expressions are intuitive and empower you to write
readable and performant code at the same time.

- **Out of Core:** The streaming API allows you to process your results without
requiring all your data to be in memory at the same time.

- **Parallel:** Polars' multi-threaded query engine utilizes the power of your
machine by dividing the workload among the available CPU cores without any
additional configuration.

- **Vectorized Query Engine:** Uses [Apache Arrow], a columnar data format, to
process your queries in a vectorized manner and SIMD to optimize CPU usage.
This enables cache-coherent algorithms and high performance on modern processors.

- **Open Source:** Polars is and always will be open source. Driven by an active
community of developers. Everyone is encouraged to add new features and contribute.
It is free to use under the MIT license.

:::{rubric} Data formats
:::

Polars supports reading and writing to many common data formats.
This allows you to easily integrate Polars into your existing data stack.

- Text: CSV, JSON
- Binary: Parquet, Delta Lake, Avro, Excel
- IPC: Feather, Arrow IPC
- Databases: MySQL, PostgreSQL, SQLite, Redshift, SQL Server, etc. (via ConnectorX)
- Cloud storage: Amazon S3, Azure Blob/ADLS (via fsspec‑compatible backends)

:::{rubric} Learn
:::
- [Polars code examples]


[Apache Arrow]: https://arrow.apache.org/
[Polars]: https://pola.rs/
[Polars code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/polars