Commit 14c9331: Dask: s/tutorial/usage guide/
Parent: 9fd900a

2 files changed (+7, -6 lines)

docs/integrate/dask/index.md (2 additions, 2 deletions):

@@ -26,7 +26,7 @@ the Python libraries that you know and love, like NumPy, pandas, and scikit-lear

 :::{rubric} Learn
 :::
-- {ref}`dask-tutorial`
+- {ref}`dask-usage`
 - [Efficient batch/bulk INSERT operations with pandas, Dask, and SQLAlchemy]
 - [Import weather data using Dask]
 - [Dask code examples]
@@ -35,7 +35,7 @@ the Python libraries that you know and love, like NumPy, pandas, and scikit-lear
 :::{toctree}
 :maxdepth: 1
 :hidden:
-Tutorial <tutorial>
+Usage <usage>
 :::


docs/integrate/dask/tutorial.md renamed to docs/integrate/dask/usage.md (5 additions, 4 deletions):

@@ -1,8 +1,9 @@
-(dask-tutorial)=
+(dask-usage)=
 # Efficient data ingestion with Dask and CrateDB

 ## Introduction
-Dask is a parallel computing library that enables distributed computing for tasks such as data processing and machine learning. In this tutorial, we'll explore how to leverage the power of CrateDB, a distributed SQL database, in conjunction with Dask, to perform efficient data processing and analysis tasks.
+Dask is a parallel computing library that enables distributed computing for tasks such as data processing and machine learning.
+In this usage guide, we'll explore how to leverage the power of CrateDB, a distributed SQL database, in conjunction with Dask, to perform efficient data processing and analysis tasks.

 Prerequisites:

@@ -20,7 +21,7 @@ python -m pip install "dask[complete]"

 ## Inserting data

-For this tutorial, we chose to use the California housing prices dataset, also available on [Kaggle](https://www.kaggle.com/datasets/camnugent/california-housing-prices?resource=download). This dataset is a popular dataset for regression tasks, consisting of median house values in census tracts in California, making it an excellent starting point for implementing basic machine learning algorithms.
+For this usage guide, we chose to use the California housing prices dataset, also available on [Kaggle](https://www.kaggle.com/datasets/camnugent/california-housing-prices?resource=download). This dataset is a popular dataset for regression tasks, consisting of median house values in census tracts in California, making it an excellent starting point for implementing basic machine learning algorithms.

 Before importing data, create a california_housing table in CrateDB:

@@ -190,6 +191,6 @@ On an M1 machine with 16 GB of RAM, the entire process of loading the 1.5 millio

 ## Conclusions

-In this tutorial, we've covered the essentials of using CrateDB with Dask for efficient data processing and analysis. By combining the distributed capabilities of CrateDB with the parallel computing power of Dask, you can unlock the potential to handle large-scale datasets, perform complex queries, and leverage advanced analytics techniques.
+In this usage guide, we've covered the essentials of using CrateDB with Dask for efficient data processing and analysis. By combining the distributed capabilities of CrateDB with the parallel computing power of Dask, you can unlock the potential to handle large-scale datasets, perform complex queries, and leverage advanced analytics techniques.

 To learn more about updates, features, and other questions you might have, join our [CrateDB community](https://community.cratedb.com/).
