Merge pull request #5289 from FlorentinD/fix-training-docs

FlorentinD · web-flow · commit 3033764b0f5d · 2022-05-06T10:27:47.000+02:00
Fix smaller issues in training docs
diff --git a/doc/asciidoc/machine-learning/linkprediction-pipeline/training.adoc b/doc/asciidoc/machine-learning/linkprediction-pipeline/training.adoc
@@ -15,7 +15,7 @@ These graphs are internally managed and exist only for the duration of the train
 . Split the training instances using stratified k-fold cross-validation.
 The number of folds `k` can be configured using `validationFolds` in `gds.beta.pipeline.linkPrediction.configureSplit`.
 . Train each model candidate given by the <<linkprediction-adding-model-candidates,parameter space>> for each of the folds and evaluate the model on the respective validation set.
-The training process uses a logistic regression or random forest algorithm, and the evaluation uses the specified <<linkprediction-pipelines-metrics,metric>>.
+The evaluation uses the specified <<linkprediction-pipelines-metrics,metric>>.
 . Declare as winner the model with the highest average metric across the folds.
 . Re-train the winning model on the whole training set and evaluate it on both the `train` and `test` sets.
 In order to evaluate on the `test` set, the feature pipeline is first applied again as for the `train` set.
diff --git a/doc/asciidoc/machine-learning/node-property-prediction/nodeclassification-pipeline/training.adoc b/doc/asciidoc/machine-learning/node-property-prediction/nodeclassification-pipeline/training.adoc
@@ -14,7 +14,7 @@ More precisely, the training proceeds as follows:
 These graphs are internally managed and exist only for the duration of the training.
 . Split the nodes in the train graph using stratified k-fold cross-validation.
 The number of folds `k` can be configured as described in <<nodeclassification-pipelines-configure-splits, Configuring the node splits>>.
-. Each model candidate defined in the <<nodeclassification-pipelines-adding-model-candidates,parameter space>> is trained on each train set and evaluated on the respective validation set for every fold. The training process uses a logistic regression algorithm, and the evaluation uses the specified <<nodeclassification-pipeline-metrics,metric>>.
+. Each model candidate defined in the <<nodeclassification-pipelines-adding-model-candidates,parameter space>> is trained on each train set and evaluated on the respective validation set for every fold. The evaluation uses the specified primary <<nodeclassification-pipeline-metrics,metric>>.
 . Choose the best performing model according to the highest average score for the primary metric.
 . Retrain the winning model on the entire train graph.
 . Evaluate the performance of the winning model on the whole train graph as well as the test graph.
@@ -142,15 +142,7 @@ The structure of `modelInfo` is:
                 max: Float,
                 min: Float,
                 params: Map
-            },
-            {
-                avg: Float,
-                max: Float,
-                min: Float,
-                params: Map
-            },
-            ...
-            ]
+            }
         }
     }
 }
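
Review note: the selection step described in both training docs (score each model candidate on every validation fold, then pick the candidate with the highest average metric) can be illustrated with a small sketch. This is not the GDS implementation. The function names `summarize` and `select_winner`, the `penalty` parameter, and the fold scores are hypothetical and only mirror the `avg`/`max`/`min`/`params` entries shown in `modelInfo`.

```python
from statistics import mean


def summarize(candidates):
    """Build one stats entry per candidate from its per-fold validation scores.

    `candidates` is a list of (params, fold_scores) pairs; the shape is an
    assumption made for this sketch, not the library's internal format.
    """
    return [
        {
            "avg": mean(scores),
            "max": max(scores),
            "min": min(scores),
            "params": params,
        }
        for params, scores in candidates
    ]


def select_winner(stats):
    """Return the entry with the highest average score across the folds."""
    return max(stats, key=lambda entry: entry["avg"])


# Hypothetical scores for two candidates over three validation folds.
candidates = [
    ({"penalty": 0.0}, [0.70, 0.80, 0.75]),
    ({"penalty": 1.0}, [0.80, 0.85, 0.90]),
]

stats = summarize(candidates)
winner = select_winner(stats)
print(winner["params"])
```

In this sketch the second candidate wins because its fold average is higher. The winning model would then be retrained on the whole training set, as the documented steps describe.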