Commit eeb50f9

Spark: s/tutorial/usage/
1 parent 12f0cc7 commit eeb50f9

2 files changed: +21 -15 lines changed

docs/integrate/spark/index.md

Lines changed: 3 additions & 3 deletions
@@ -39,8 +39,8 @@ processing, real-time streaming, machine learning, and graph analytics.
 
 ::::{grid}
 
-:::{grid-item-card} Tutorial: Getting started with Apache Spark and CrateDB
-:link: spark-getting-started
+:::{grid-item-card} Getting started with Apache Spark and CrateDB
+:link: spark-usage
 :link-type: ref
 Using Apache Spark with CrateDB is a powerful combination for processing and
 analyzing large datasets.
@@ -58,7 +58,7 @@ Demonstrate how to load a Spark data frame into CrateDB.
 :::{toctree}
 :maxdepth: 1
 :hidden:
-Tutorial <tutorial>
+Usage <usage>
 :::
 
 
docs/integrate/spark/tutorial.md renamed to docs/integrate/spark/usage.md

Lines changed: 18 additions & 12 deletions
@@ -1,9 +1,9 @@
-(spark-getting-started)=
+(spark-usage)=
 # Getting started with Apache Spark and CrateDB
 
 **Apache Spark** is an open-source distributed computing framework designed for high-speed, versatile big-data processing. It offers support for various data processing tasks, such as batch processing, real-time streaming, machine learning, and graph analytics. It is a popular choice for organizations looking to analyze large datasets efficiently.
 
-Using Apache Spark with CrateDB is a powerful combination for processing and analyzing large datasets. In this tutorial, we'll walk through the process of setting up PySpark (Python API for Spark) to work with CrateDB, including data loading, processing, and writing results back to CrateDB.
+Using Apache Spark with CrateDB is a powerful combination for processing and analyzing large datasets. In this usage guide, we'll walk through the process of setting up PySpark (Python API for Spark) to work with CrateDB, including data loading, processing, and writing results back to CrateDB.
 
 Prerequisites:
 
@@ -39,27 +39,33 @@ FROM
 
 ## Set up Apache Spark
 
-This tutorial will work with a single-node Apache Spark installation running on a Mac M1 machine. To set up Apache Spark on your machine use the following steps:
+This usage guide will work with a single-node Apache Spark installation running on a Mac M1 machine. To set up Apache Spark on your machine use the following steps:
 
 1. Install Java and Scala, as the Apache Spark requires both to run:
 
-`brew install openjdk@11
-brew install scala`
+```shell
+brew install openjdk@11
+brew install scala
+```
 
-Before verifying your Java installation, set the `JAVA_HOME` environment variable by adding the following line to your shell profile:
+Before verifying your Java installation, set the `JAVA_HOME` environment variable by adding the following line to your shell profile:
 
-`export JAVA_HOME="/usr/local/opt/openjdk@11"`
+`export JAVA_HOME="/usr/local/opt/openjdk@11"`
 
-2. To install the latest version of Apache Spark (which includes PySpark) run:
+2. Install the latest version of Apache Spark (which includes PySpark):
 
-`brew install apache-spark`
+```shell
+brew install apache-spark
+```
 
 3. Verify the installation of apache-spark and pyspark:
 
-`spark-shell --version `
-`pyspark --version `
+```shell
+spark-shell --version
+pyspark --version
+```
 
-4. Finally, as CrateDB communicates with Spark via JDBC, download the [Postgres JDBC driver](https://jdbc.postgresql.org/download/) in your working directory. In this tutorial, we use the `postgresql-42.6.0.jar` driver.
+4. Finally, as CrateDB communicates with Spark via JDBC, download the [Postgres JDBC driver](https://jdbc.postgresql.org/download/) in your working directory. In this usage guide, we use the `postgresql-42.6.0.jar` driver.
 
 
 ## Data analysis
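
Note on step 4 of the renamed guide: the diff stops before the guide shows how the downloaded JDBC driver is used. The following is a minimal sketch, not the guide's own code, of how the `postgresql-42.6.0.jar` driver might be wired into a PySpark session. Only the jar file name comes from the diff. The helper names `cratedb_jdbc_options` and `read_table`, the host, port `5432`, user `crate`, and the table name are assumptions for illustration.

```python
from pathlib import Path

# Driver file name as mentioned in the guide; assumed to sit in the working directory.
JDBC_JAR = Path("postgresql-42.6.0.jar")


def cratedb_jdbc_options(table, host="localhost", port=5432, user="crate", password=""):
    """Build the option map for Spark's generic JDBC data source.

    Host, port, and user are assumed defaults for a local CrateDB instance
    reachable over the PostgreSQL wire protocol; adjust them to your setup.
    """
    return {
        "url": f"jdbc:postgresql://{host}:{port}/",
        "driver": "org.postgresql.Driver",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def read_table(table):
    """Load a table as a Spark DataFrame (hypothetical helper; needs pyspark)."""
    # Imported lazily so that the option builder above works without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("cratedb-usage-sketch")
        .config("spark.jars", str(JDBC_JAR))  # put the JDBC driver on the classpath
        .getOrCreate()
    )
    return spark.read.format("jdbc").options(**cratedb_jdbc_options(table)).load()
```

With Spark installed and a database running, `read_table("doc.my_table")` would return a DataFrame for a hypothetical table `doc.my_table`. Keeping the option map in its own function makes the connection settings easy to inspect without starting a Spark session.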
