R: Refactor ML tutorial to dedicated section

amotl · amotl · commit 9320fa94db7d · 2025-09-16T11:27:15.000+02:00
diff --git a/docs/integrate/index.md b/docs/integrate/index.md
@@ -63,6 +63,7 @@ prometheus/index
 pycaret/index
 pyviz/index
 queryzen/index
+r/index
 rill/index
 risingwave/index
 scikit-learn/index
diff --git a/docs/integrate/r/index.md b/docs/integrate/r/index.md
@@ -0,0 +1,35 @@
+(r)=
+# R
+
+```{div} .float-right
+[![R logo](https://www.r-project.org/Rlogo.png){height=60px loading=lazy}][R]
+```
+```{div} .clearfix
+```
+
+:::{rubric} About
+:::
+
+[R] is a free software environment for statistical computing and graphics.
+It compiles and runs on a wide variety of UNIX platforms, Windows, and macOS.
+
+:::{rubric} Learn
+:::
+
+::::{grid} 2
+
+:::{grid-item-card} Statistical analysis and visualization on huge datasets
+:link: r-tutorial
+:link-type: ref
+Learn how to create a machine learning pipeline using R and CrateDB.
+:::
+
+::::
+
+:::{toctree}
+:maxdepth: 1
+:hidden:
+Tutorial <tutorial>
+:::
+
+[R]: https://www.r-project.org/
diff --git a/docs/integrate/r/tutorial.rst b/docs/integrate/r/tutorial.rst
@@ -1,4 +1,5 @@
 .. _cratedb-r:
+.. _r-tutorial:
 
 ==============
 CrateDB with R
@@ -7,8 +8,7 @@ CrateDB with R
 This integration document details how to create a Machine Learning pipeline
 using R and CrateDB.
 
-Abstract
-========
+.. rubric:: Introduction
 
 Statistical analysis and visualization on huge datasets is a common task many
 data scientists face in their day-to-day life. One common tool for doing this
@@ -22,12 +22,7 @@ statistical computations.
 
 This can be accomplished with the `RPostgreSQL`_ library.
 
-
-Implementation
-==============
-
-Set Up
-------
+.. rubric:: About
 
 For this implementation, we will be using the classic `iris classification
 problem`_.
@@ -51,6 +46,8 @@ Using R, we want to:
 4. Retrieve our unclassified iris data, enrich the data with a prediction from
    our model, and insert the result into our iris table.
 
+Setup
+=====
 
 Prerequisites
 -------------
@@ -68,8 +65,8 @@ To install these libraries within R or RStudio, we can run:
     > install.packages("caret")
 
 
-CrateDB
--------
+Provision data
+--------------
 
 First, we need to create a table to hold our training data, as well as our
 unclassified irises:
@@ -112,9 +109,11 @@ We can verify that the data has been successfully imported like so:
     +----------+
     SELECT 1 row in set (0.130 sec)
 
+Usage
+=====
 
-Examining The Data
-------------------
+Explore data
+------------
 
 With our data in CrateDB, we can now load it into R or RStudio. Within
 R, we should first import our data. We do this by loading the ``RPostgreSQL``
@@ -186,8 +185,8 @@ As we can see, the lengths and widths of sepals and petals are very good
 indicators of iris species, with little overlap between them.
 
 
-Training A Model
-----------------
+Train model
+-----------
 
 Now that we have loaded our data and can visualize it to get a better idea of
 what it contains, we can create a machine learning model to predict a species
@@ -287,8 +286,8 @@ misclassified a *versicolor* as a *virginica* and vice versa. We could improve
 this by trying out other models, by tweaking our model, or by training on a
 larger dataset.
 
-Enriching Data
-..............
+Enrich data
+-----------
 
 Now that we have a model we are happy with, we can use this model to enrich
 unclassified iris flowers data.
diff --git a/docs/topic/ml/index.md b/docs/topic/ml/index.md
@@ -71,31 +71,11 @@ See the dedicated page: {ref}`pycaret`.
 :::
 
 
-(iris-r)=
 ### R
-
-Use R with CrateDB.
-
-:::::{info-card}
-::::{grid-item}
-:columns: 9
-**Statistical analysis and visualization on huge datasets**
-
-Details about how to create a machine learning pipeline
-using R and CrateDB.
-
-:::{toctree}
-:maxdepth: 1
-
-r
+:::{seealso}
+Please navigate to the dedicated page about {ref}`r`.
 :::
 
-::::
-::::{grid-item}
-:columns: 3
-{tags-primary}`Fundamentals`
-::::
-:::::
 
 
 ### scikit-learn