Commit 9320fa9

R: Refactor ML tutorial to dedicated section
1 parent ea3a8e7 commit 9320fa9

4 files changed: 53 additions and 38 deletions

docs/integrate/index.md

Lines changed: 1 addition & 0 deletions
@@ -63,6 +63,7 @@ prometheus/index
 pycaret/index
 pyviz/index
 queryzen/index
+r/index
 rill/index
 risingwave/index
 scikit-learn/index

docs/integrate/r/index.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+(r)=
+# R
+
+```{div} .float-right
+[![R logo](https://www.r-project.org/Rlogo.png){height=60px loading=lazy}][R]
+```
+```{div} .clearfix
+```
+
+:::{rubric} About
+:::
+
+[R] is a free software environment for statistical computing and graphics.
+It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.
+
+:::{rubric} Learn
+:::
+
+::::{grid} 2
+
+:::{grid-item-card} Statistical analysis and visualization on huge datasets
+:link: r-tutorial
+:link-type: ref
+Learn how to create a machine learning pipeline using R and CrateDB.
+:::
+
+::::
+
+:::{toctree}
+:maxdepth: 1
+:hidden:
+Tutorial <tutorial>
+:::
+
+[R]: https://www.r-project.org/

docs/topic/ml/r.rst renamed to docs/integrate/r/tutorial.rst

Lines changed: 15 additions & 16 deletions
@@ -1,4 +1,5 @@
 .. _cratedb-r:
+.. _r-tutorial:

 ==============
 CrateDB with R
@@ -7,8 +8,7 @@ CrateDB with R
 This integration document details how to create a Machine Learning pipeline
 using R and CrateDB.

-Abstract
-========
+.. rubric:: Introduction

 Statistical analysis and visualization on huge datasets is a common task many
 data scientists face in their day-to-day life. One common tool for doing this
@@ -22,12 +22,7 @@ statistical computations.

 This can be accomplished with the `RPostgreSQL`_ library.

-
-Implementation
-==============
-
-Set Up
-------
+.. rubric:: About

 For this implementation, we will be using the classic `iris classification
 problem`_.
@@ -51,6 +46,8 @@ Using R, we want to:
 4. Retrieve our unclassified iris data, enrich the data with a prediction from
 our model, and insert the result into our iris table.

+Setup
+=====

 Prerequisites
 -------------
@@ -68,8 +65,8 @@ To install these libraries within R or RStudio, we can run:
 > install.packages("caret")


-CrateDB
--------
+Provision data
+--------------

 First, we need to create a table to hold our training data, as well as our
 unclassified irises:
@@ -112,9 +109,11 @@ We can verify that the data has been successfully imported like so:
 +----------+
 SELECT 1 row in set (0.130 sec)

+Usage
+=====

-Examining The Data
-------------------
+Explore data
+------------

 With our data in CrateDB, we can now load it into R or RStudio. Within
 R, we should first import our data. We do this by loading the ``RPostgreSQL``
@@ -186,8 +185,8 @@ As we can see, the lengths and widths of sepals and petals are very good
 indicators of iris species, with little overlap between them.


-Training A Model
-----------------
+Train model
+-----------

 Now that we have loaded our data and can visualize it to get a better idea of
 what it contains, we can create a machine learning model to predict a species
@@ -287,8 +286,8 @@ misclassified a *versicolor* as a *virginica* and vice versa. We could improve
 this by trying out other models, by tweaking our model, or by training on a
 larger dataset.

-Enriching Data
-..............
+Enrich data
+-----------

 Now that we have a model we are happy with, we can use this model to enrich
 unclassified iris flowers data.

docs/topic/ml/index.md

Lines changed: 2 additions & 22 deletions
@@ -71,31 +71,11 @@ See the dedicated page: {ref}`pycaret`.
 :::


-(iris-r)=
 ### R
-
-Use R with CrateDB.
-
-:::::{info-card}
-::::{grid-item}
-:columns: 9
-**Statistical analysis and visualization on huge datasets**
-
-Details about how to create a machine learning pipeline
-using R and CrateDB.
-
-:::{toctree}
-:maxdepth: 1
-
-r
+:::{seealso}
+Please navigate to the dedicated page about {ref}`r`.
 :::

-::::
-::::{grid-item}
-:columns: 3
-{tags-primary}`Fundamentals`
-::::
-:::::


 ### scikit-learn