diff --git a/docs/connect/df/index.md b/docs/connect/df/index.md
index 7313a199..50cc9e01 100644
--- a/docs/connect/df/index.md
+++ b/docs/connect/df/index.md
@@ -86,67 +86,10 @@ and operations for manipulating numerical tables and time series.
 - [From data storage to data analysis: Tutorial on CrateDB and pandas]
 
 
-
-(polars)=
 ## Polars
-
-:::{rubric} About
-:::
-
-```{div}
-:style: "float: right; margin-left: 0.5em"
-[![](https://github.com/pola-rs/polars-static/raw/master/logos/polars-logo-dark.svg){w=180px}](https://pola.rs/)
-```
-
-[Polars] is a blazingly fast DataFrames library with language bindings for
-Rust, Python, Node.js, R, and SQL. Polars is powered by a multithreaded,
-vectorized query engine, it is open source, and written in Rust.
-
-- **Fast:** Written from scratch in Rust and with performance in mind,
-  designed close to the machine, and without external dependencies.
-
-- **I/O:** First class support for all common data storage layers: local,
-  cloud storage & databases.
-
-- **Intuitive API:** Write your queries the way they were intended. Polars,
-  internally, will determine the most efficient way to execute using its query
-  optimizer. Polars' expressions are intuitive and empower you to write
-  readable and performant code at the same time.
-
-- **Out of Core:** The streaming API allows you to process your results without
-  requiring all your data to be in memory at the same time.
-
-- **Parallel:** Polars' multi-threaded query engine utilises the power of your
-  machine by dividing the workload among the available CPU cores without any
-  additional configuration.
-
-- **Vectorized Query Engine:** Uses [Apache Arrow], a columnar data format, to
-  process your queries in a vectorized manner and SIMD to optimize CPU usage.
-  This enables cache-coherent algorithms and high performance on modern processors.
-
-- **Open Source:** Polars is and always will be open source. Driven by an active
-  community of developers. Everyone is encouraged to add new features and contribute.
-  It is free to use under the MIT license.
-
-:::{rubric} Data formats
-:::
-
-Polars supports reading and writing to many common data formats.
-This allows you to easily integrate Polars into your existing data stack.
-
-- Text: CSV & JSON
-- Binary: Parquet, Delta Lake, AVRO & Excel
-- IPC: Feather, Arrow
-- Databases: MySQL, Postgres, SQL Server, Sqlite, Redshift & Oracle
-- Cloud Storage: S3, Azure Blob & Azure File
-
-```{div}
-:style: "clear: both"
-```
-
-:::{rubric} Learn
+:::{seealso}
+Please navigate to the dedicated page about {ref}`polars`.
 :::
-- [Polars code examples]
 
 
 [Apache Arrow]: https://arrow.apache.org/
@@ -154,7 +97,6 @@ This allows you to easily integrate Polars into your existing data stack.
 [Dask DataFrames]: https://docs.dask.org/en/latest/dataframe.html
 [Dask Futures]: https://docs.dask.org/en/latest/futures.html
 [pandas]: https://pandas.pydata.org/
-[Polars]: https://pola.rs/
 
 [Dask code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/dask
 [Efficient batch/bulk INSERT operations with pandas, Dask, and SQLAlchemy]: https://cratedb.com/docs/python/en/latest/by-example/sqlalchemy/dataframe.html
@@ -164,4 +106,3 @@ This allows you to easily integrate Polars into your existing data stack.
 [Import weather data using Dask]: https://github.com/crate/cratedb-examples/blob/main/topic/timeseries/dask-weather-data-import.ipynb
 [Importing Parquet files into CrateDB using Apache Arrow and SQLAlchemy]: https://community.cratedb.com/t/importing-parquet-files-into-cratedb-using-apache-arrow-and-sqlalchemy/1161
 [pandas code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/pandas
-[Polars code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/polars
diff --git a/docs/integrate/index.md b/docs/integrate/index.md
index 96152972..9da8a985 100644
--- a/docs/integrate/index.md
+++ b/docs/integrate/index.md
@@ -55,6 +55,7 @@ nifi/index
 node-red/index
 oracle/index
 plotly/index
+polars/index
 postgresql/index
 Power BI
 prometheus/index
diff --git a/docs/integrate/polars/index.md b/docs/integrate/polars/index.md
new file mode 100644
index 00000000..a992cd35
--- /dev/null
+++ b/docs/integrate/polars/index.md
@@ -0,0 +1,63 @@
+(polars)=
+# Polars
+
+```{div}
+:style: "float: right; margin-left: 0.5em"
+[![Polars logo](https://github.com/pola-rs/polars-static/raw/master/logos/polars-logo-dark.svg){w=180px}][Polars]
+```
+```{div} .clearfix
+```
+
+:::{rubric} About
+:::
+
+[Polars] is a high‑performance DataFrames library with interfaces for
+Rust, Python, Node.js, and R, plus a SQL context. It is powered by a
+multithreaded, vectorized query engine and written in Rust.
+
+- **Fast:** Written from scratch in Rust and with performance in mind,
+  designed close to the machine, and without external dependencies.
+
+- **I/O:** First class support for all common data storage layers: local,
+  cloud storage & databases.
+
+- **Intuitive API:** Write your queries the way they were intended. Polars,
+  internally, will determine the most efficient way to execute using its query
+  optimizer. Polars' expressions are intuitive and empower you to write
+  readable and performant code at the same time.
+
+- **Out of Core:** The streaming API allows you to process your results without
+  requiring all your data to be in memory at the same time.
+
+- **Parallel:** Polars' multi-threaded query engine utilizes the power of your
+  machine by dividing the workload among the available CPU cores without any
+  additional configuration.
+
+- **Vectorized Query Engine:** Uses [Apache Arrow], a columnar data format, to
+  process your queries in a vectorized manner and SIMD to optimize CPU usage.
+  This enables cache-coherent algorithms and high performance on modern processors.
+
+- **Open Source:** Polars is and always will be open source. Driven by an active
+  community of developers. Everyone is encouraged to add new features and contribute.
+  It is free to use under the MIT license.
+
+:::{rubric} Data formats
+:::
+
+Polars supports reading and writing to many common data formats.
+This allows you to easily integrate Polars into your existing data stack.
+
+- Text: CSV, JSON
+- Binary: Parquet, Delta Lake, Avro, Excel
+- IPC: Feather, Arrow IPC
+- Databases: MySQL, PostgreSQL, SQLite, Redshift, SQL Server, etc. (via ConnectorX)
+- Cloud storage: Amazon S3, Azure Blob/ADLS (via fsspec‑compatible backends)
+
+:::{rubric} Learn
+:::
+- [Polars code examples]
+
+
+[Apache Arrow]: https://arrow.apache.org/
+[Polars]: https://pola.rs/
+[Polars code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/polars