Commit 7094bf4

FlorentinD and Mats-SX committed
Document node regression pipeline create proc
Co-authored-by: Mats Rydberg <mats@neotechnology.com>
1 parent 3010f96 commit 7094bf4

6 files changed: 182 additions, 1 deletion
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+/*
+ * Copyright (c) "Neo4j"
+ * Neo4j Sweden AB [http://neo4j.com]
+ *
+ * This file is part of Neo4j.
+ *
+ * Neo4j is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+package org.neo4j.gds.doc;
+
+import org.junit.jupiter.api.AfterAll;
+import org.neo4j.gds.ml.pipeline.PipelineCatalog;
+import org.neo4j.gds.ml.pipeline.node.regression.configure.NodeRegressionPipelineCreateProc;
+
+import java.util.List;
+
+class NodeRegressionPipelineDocTest extends DocTestBase {
+
+    @AfterAll
+    static void tearDown() {
+        PipelineCatalog.removeAll();
+    }
+
+    @Override
+    protected List<Class<?>> procedures() {
+        return List.of(
+            NodeRegressionPipelineCreateProc.class
+        );
+    }
+
+    @Override
+    protected String adocFile() {
+        return "machine-learning/node-property-prediction/noderegression-pipeline/noderegression.adoc";
+    }
+}
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+/*
+ * Copyright (c) "Neo4j"
+ * Neo4j Sweden AB [http://neo4j.com]
+ *
+ * This file is part of Neo4j.
+ *
+ * Neo4j is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+package org.neo4j.gds.doc.syntax;
+
+import java.util.List;
+
+class NodeRegressionPipelineSyntaxTest extends SyntaxTestBase {
+
+    @Override
+    protected Iterable<SyntaxModeMeta> syntaxModes() {
+        return List.of(
+            SyntaxModeMeta.of(SyntaxMode.PIPELINE_CREATE)
+        );
+    }
+
+    @Override
+    protected String adocFile() {
+        return "machine-learning/node-property-prediction/noderegression-pipeline/noderegression.adoc";
+    }
+}

doc/antora/content-nav.adoc

Lines changed: 1 addition & 0 deletions
@@ -105,6 +105,7 @@
 **** xref:machine-learning/node-property-prediction/nodeclassification-pipelines/training/index.adoc[]
 **** xref:machine-learning/node-property-prediction/nodeclassification-pipelines/predict/index.adoc[]
 *** xref:machine-learning/node-property-prediction/noderegression-pipelines/index.adoc[]
+**** xref:machine-learning/node-property-prediction/noderegression-pipelines/config/index.adoc[]
 ** xref:machine-learning/linkprediction-pipelines/index.adoc[]
 *** xref:machine-learning/linkprediction-pipelines/config/index.adoc[]
 *** xref:machine-learning/linkprediction-pipelines/training/index.adoc[]
Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+[[noderegression-pipelines-config]]
+= Configuring the pipeline
+:max-trials: 10
+
+This page explains how to create and configure a node regression pipeline.
+It consists of the following sections:
+
+* <<nodregression-creating-a-pipeline, Creating the pipeline>>
+
+
+[[nodregression-creating-a-pipeline]]
+== Creating a pipeline
+
+The first step of building a new pipeline is to create one using `gds.alpha.pipeline.nodeRegression.create`.
+This stores a trainable pipeline object of type `Node regression training pipeline` in the pipeline catalog.
+This represents a configurable pipeline that can later be invoked for training, which in turn creates a regression model.
+The resulting model is stored in the model catalog with type `NodeRegression`.
+
+
+=== Syntax
+
+[.pipeline-create-syntax]
+--
+.Create pipeline syntax
+[source, cypher, role=noplay]
+----
+CALL gds.alpha.pipeline.nodeRegression.create(
+  pipelineName: String
+)
+YIELD
+  name: String,
+  nodePropertySteps: List of Map,
+  featureProperties: List of String,
+  splitConfig: Map,
+  autoTuningConfig: Map,
+  parameterSpace: List of Map
+----
+
+.Parameters
+[opts="header",cols="1,1,4"]
+|===
+| Name | Type | Description
+| pipelineName | String | The name of the created pipeline.
+|===
+
+include::../pipelineInfoResult.adoc[]
+--
+
+
+=== Example
+
+[role=query-example,group=nr]
+--
+.The following will create a pipeline:
+[source, cypher, role=noplay]
+----
+CALL gds.alpha.pipeline.nodeRegression.create('pipe')
+----
+
+.Results
+[opts="header",cols="1,1,1,1,1,1"]
+|===
+| name | nodePropertySteps | featureProperties | splitConfig | autoTuningConfig | parameterSpace
+| "pipe" | [] | []
+| {testFraction=0.3, validationFolds=3}
+| {maxTrials={max-trials}}
+| {RandomForest=[], LinearRegression=[]}
+|===
+--
+
+This shows that the newly created pipeline does not contain any steps yet, and has defaults for the split and train parameters.
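
Reviewer note: to make the defaults in the documented result row concrete, here is a minimal Java sketch that models a freshly created pipeline as a plain value object. It is illustrative only. `PipelineInfo` and `emptyPipeline` are hypothetical names, not GDS classes, and the values are copied from the example row with `{max-trials}` resolved to 10.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the row yielded by the create procedure; not a GDS class.
// Field names mirror the YIELD columns in the documented syntax block.
final class PipelineInfo {
    final String name;
    final List<Map<String, Object>> nodePropertySteps;
    final List<String> featureProperties;
    final Map<String, Object> splitConfig;
    final Map<String, Object> autoTuningConfig;
    final Map<String, List<Map<String, Object>>> parameterSpace;

    private PipelineInfo(String name) {
        this.name = name;
        // A new pipeline has no node property steps and no selected features.
        this.nodePropertySteps = List.of();
        this.featureProperties = List.of();

        // Default split configuration as shown in the example row.
        Map<String, Object> split = new LinkedHashMap<>();
        split.put("testFraction", 0.3);
        split.put("validationFolds", 3);
        this.splitConfig = split;

        // Default auto-tuning configuration (max-trials attribute is 10).
        this.autoTuningConfig = Map.of("maxTrials", 10);

        // One empty candidate list per model type shown in the example row.
        Map<String, List<Map<String, Object>>> space = new LinkedHashMap<>();
        space.put("RandomForest", List.of());
        space.put("LinearRegression", List.of());
        this.parameterSpace = space;
    }

    // Builds the state documented for a pipeline right after creation.
    static PipelineInfo emptyPipeline(String name) {
        return new PipelineInfo(name);
    }

    public static void main(String[] args) {
        PipelineInfo info = PipelineInfo.emptyPipeline("pipe");
        System.out.println(info.name + " " + info.splitConfig + " " + info.autoTuningConfig);
    }
}
```

The sketch only mirrors the shape of the result; in GDS the row comes from the pipeline catalog, and later configuration calls would fill the empty lists.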
Lines changed: 25 additions & 1 deletion
@@ -1,4 +1,28 @@
 [[noderegression-pipelines]]
 = Node regression pipelines
+:max-trials: 10
 
-// TODO add config content
+[abstract]
+--
+This section describes Node regression pipelines in the Neo4j Graph Data Science library.
+--
+
+
Node Regression is a common machine learning task applied to graphs: training models to predict node property values.
12+
Concretely, Node Regression models are used to predict the value of node property based on other node properties.
13+
During training, the property to predict is referred to as the target property.
14+
15+
In GDS, we have Node Regression pipelines which offer an end-to-end workflow, from feature extraction to predicting node property values.
16+
The training pipelines reside in the <<pipeline-catalog-ops,pipeline catalog>>.
17+
When a training pipeline is <<nodeclassification-pipelines-train,executed>>, a regression model is created and stored in the <<model-catalog-ops,model catalog>>.
18+
19+
A training pipeline is a sequence of two phases:
20+
[upperroman]
21+
. The graph is augmented with new node properties in a series of steps.
22+
. The augmented graph is used for training a node regression model.
23+
24+
This segment is divided into the following pages:
25+
26+
* <<noderegression-pipelines-config, Configuring the pipeline>>
27+
28+
include::config.adoc[leveloffset =+ 1]
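
Reviewer note: the two phases described in the overview can be sketched in plain Java as follows. This is a toy under stated assumptions, not how GDS implements pipelines. `TwoPhaseSketch`, `augment` and `train` are hypothetical names, nodes are reduced to property maps, each step works on one node at a time (real node property steps can run graph algorithms), and a one-feature least-squares fit stands in for the actual regression training.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration of the two phases; none of these types exist in GDS.
class TwoPhaseSketch {

    // Phase I: apply every step in order, so each node gains new properties.
    static List<Map<String, Double>> augment(
        List<Map<String, Double>> nodes,
        List<UnaryOperator<Map<String, Double>>> steps
    ) {
        List<Map<String, Double>> result = new ArrayList<>();
        for (Map<String, Double> node : nodes) {
            Map<String, Double> current = new HashMap<>(node);
            for (UnaryOperator<Map<String, Double>> step : steps) {
                current = step.apply(current);
            }
            result.add(current);
        }
        return result;
    }

    // Phase II: ordinary least squares with a single feature.
    // Returns {intercept, slope} for predicting the target property.
    static double[] train(List<Map<String, Double>> nodes, String feature, String target) {
        double meanX = nodes.stream().mapToDouble(n -> n.get(feature)).average().orElseThrow();
        double meanY = nodes.stream().mapToDouble(n -> n.get(target)).average().orElseThrow();
        double covariance = 0.0;
        double variance = 0.0;
        for (Map<String, Double> node : nodes) {
            double dx = node.get(feature) - meanX;
            covariance += dx * (node.get(target) - meanY);
            variance += dx * dx;
        }
        double slope = covariance / variance;
        return new double[] {meanY - slope * meanX, slope};
    }

    public static void main(String[] args) {
        List<Map<String, Double>> nodes = List.of(
            Map.of("size", 1.0, "price", 3.0),
            Map.of("size", 2.0, "price", 5.0),
            Map.of("size", 3.0, "price", 7.0)
        );
        // A single step that derives a new property from an existing one.
        UnaryOperator<Map<String, Double>> doubleSize = node -> {
            Map<String, Double> copy = new HashMap<>(node);
            copy.put("sizeDoubled", 2 * node.get("size"));
            return copy;
        };
        List<Map<String, Double>> augmented = augment(nodes, List.of(doubleSize));
        double[] model = train(augmented, "sizeDoubled", "price");
        System.out.println("intercept=" + model[0] + " slope=" + model[1]);
    }
}
```

Keeping the augmentation separate from the fit mirrors the ordering in the documentation: every step must have run before training reads the derived properties.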

doc/docbook/content-map.xml

Lines changed: 2 additions & 0 deletions
@@ -305,6 +305,8 @@
 </d:tocentry>
 </d:tocentry>
 <d:tocentry linkend="noderegression-pipelines"><?dbhtml filename="machine-learning/node-property-prediction/noderegression-pipelines/index.html"?>
+<d:tocentry linkend="noderegression-pipelines-config"><?dbhtml filename="machine-learning/node-property-prediction/noderegression-pipelines/config/index.html"?>
+</d:tocentry>
 </d:tocentry>
 </d:tocentry>
 