Commit d2d64cf

[HWORKS-2190][4.3][APPEND] Updating job configuration to include files, pyFiles, archives and jars (#478) (#492)
* updating docs for jobs configs to include files, pyFiles, jars and archives
* updating based on review comments
* updating documentation for notebooks and python Jobs
1 parent 0940949 commit d2d64cf

File tree

4 files changed (+13, -2 lines)


docs/user_guides/projects/jobs/notebook_job.md

Lines changed: 1 addition & 0 deletions
@@ -185,6 +185,7 @@ The following table describes the JSON payload returned by `jobs_api.get_configu
 | `resourceConfig.gpus` | number (int) | Number of GPUs to be allocated | `0` |
 | `logRedirection` | boolean | Whether logs are redirected | `true` |
 | `jobType` | string | Type of job | `"PYTHON"` |
+| `files` | string | HDFS path(s) to files to be provided to the Notebook Job. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/file1.py,hdfs:///Project/<project_name>/Resources/file2.txt"` | `null` |

## Accessing project data
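The `files` field added above is a single comma-separated string rather than a list. A minimal sketch of assembling it, assuming the payload shape from the table; the helper name, project path, and surrounding dict are illustrative, not part of the Hopsworks API:

```python
# Sketch: building the comma-separated `files` string for a notebook job
# configuration. Field names follow the table above; `to_files_string` and
# the example paths are hypothetical.

def to_files_string(paths):
    """Join HDFS paths into the single comma-separated string `files` expects."""
    return ",".join(paths)

config = {
    "jobType": "PYTHON",
    "logRedirection": True,
    "files": to_files_string([
        "hdfs:///Project/my_project/Resources/file1.py",
        "hdfs:///Project/my_project/Resources/file2.txt",
    ]),
}

print(config["files"])
# hdfs:///Project/my_project/Resources/file1.py,hdfs:///Project/my_project/Resources/file2.txt
```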

docs/user_guides/projects/jobs/pyspark_job.md

Lines changed: 5 additions & 1 deletion
@@ -221,7 +221,7 @@ The following table describes the JSON payload returned by `jobs_api.get_configu
 | Field | Type | Description | Default |
 | ------------------------------------------ | -------------- |-----------------------------------------------------| -------------------------- |
 | `type` | string | Type of the job configuration | `"sparkJobConfiguration"` |
-| `appPath` | string | Project path to script (e.g `Resources/foo.py`) | `null` |
+| `appPath` | string | Project path to script (e.g `Resources/foo.py`) | `null` |
 | `environmentName` | string | Name of the project spark environment | `"spark-feature-pipeline"` |
 | `spark.driver.cores` | number (float) | Number of CPU cores allocated for the driver | `1.0` |
 | `spark.driver.memory` | number (int) | Memory allocated for the driver (in MB) | `2048` |
@@ -233,6 +233,10 @@ The following table describes the JSON payload returned by `jobs_api.get_configu
 | `spark.dynamicAllocation.maxExecutors` | number (int) | Maximum number of executors with dynamic allocation | `2` |
 | `spark.dynamicAllocation.initialExecutors` | number (int) | Initial number of executors with dynamic allocation | `1` |
 | `spark.blacklist.enabled` | boolean | Whether executor/node blacklisting is enabled | `false` |
+| `files` | string | HDFS path(s) to files to be provided to the Spark application. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/file1.py,hdfs:///Project/<project_name>/Resources/file2.txt"` | `null` |
+| `pyFiles` | string | HDFS path(s) to Python files to be provided to the Spark application. These will be added to the `PYTHONPATH` so they can be imported as modules. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/module1.py,hdfs:///Project/<project_name>/Resources/module2.py"` | `null` |
+| `jars` | string | HDFS path(s) to JAR files to be provided to the Spark application. These will be added to the classpath. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/lib1.jar,hdfs:///Project/<project_name>/Resources/lib2.jar"` | `null` |
+| `archives` | string | HDFS path(s) to archive files to be provided to the Spark application. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/archive1.zip,hdfs:///Project/<project_name>/Resources/archive2.tar.gz"` | `null` |

## Accessing project data
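The four new attachment fields are all flat string entries alongside the existing `spark.*` keys. A minimal sketch of extending a PySpark job configuration dict with them, assuming the payload shape from the table above; in practice `jobs_api.get_configuration("PYSPARK")` would supply the defaults, and the project path here is hypothetical:

```python
# Sketch: adding the new attachment fields to a sparkJobConfiguration
# payload. Dict shape follows the table above; paths are illustrative.

config = {
    "type": "sparkJobConfiguration",
    "appPath": "Resources/foo.py",
    "spark.dynamicAllocation.maxExecutors": 2,
}

res = "hdfs:///Project/my_project/Resources"  # hypothetical project path
config.update({
    "files": f"{res}/lookup.csv",
    "pyFiles": f"{res}/module1.py,{res}/module2.py",  # importable via PYTHONPATH
    "jars": f"{res}/lib1.jar",                        # added to the classpath
    "archives": f"{res}/archive1.zip",
})

# Each field is one string; split on commas to recover individual paths.
assert config["pyFiles"].split(",") == [f"{res}/module1.py", f"{res}/module2.py"]
```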

docs/user_guides/projects/jobs/python_job.md

Lines changed: 1 addition & 0 deletions
@@ -183,6 +183,7 @@ The following table describes the JSON payload returned by `jobs_api.get_configu
 | `resourceConfig.gpus` | number (int) | Number of GPUs to be allocated | `0` |
 | `logRedirection` | boolean | Whether logs are redirected | `true` |
 | `jobType` | string | Type of job | `"PYTHON"` |
+| `files` | string | HDFS path(s) to files to be provided to the Python Job. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/file1.py,hdfs:///Project/<project_name>/Resources/file2.txt"` | `null` |

## Accessing project data
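Reading the field back out of a returned payload means splitting the string again, remembering that `files` defaults to `null`. A minimal sketch, with the payload literal mirroring the table above and example values assumed:

```python
# Sketch: recovering individual paths from the `files` field of a Python
# job configuration. Values are illustrative.

import posixpath

config = {
    "jobType": "PYTHON",
    "files": "hdfs:///Project/my_project/Resources/file1.py,"
             "hdfs:///Project/my_project/Resources/file2.txt",
}

# `files` may be null (None) when no extra files are attached.
raw = config.get("files") or ""
paths = [p for p in raw.split(",") if p]
names = [posixpath.basename(p) for p in paths]
print(names)  # ['file1.py', 'file2.txt']
```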

docs/user_guides/projects/jobs/spark_job.md

Lines changed: 6 additions & 1 deletion
@@ -234,7 +234,12 @@ The following table describes the JSON payload returned by `jobs_api.get_configu
 | `spark.dynamicAllocation.minExecutors` | number (int) | Minimum number of executors with dynamic allocation | `1` |
 | `spark.dynamicAllocation.maxExecutors` | number (int) | Maximum number of executors with dynamic allocation | `2` |
 | `spark.dynamicAllocation.initialExecutors` | number (int) | Initial number of executors with dynamic allocation | `1` |
-| `spark.blacklist.enabled` | boolean | Whether executor/node blacklisting is enabled | `false` |
+| `spark.blacklist.enabled` | boolean | Whether executor/node blacklisting is enabled | `false`
+| `files` | string | HDFS path(s) to files to be provided to the Spark application. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/file1.py,hdfs:///Project/<project_name>/Resources/file2.txt"` | `null` |
+| `pyFiles` | string | HDFS path(s) to Python files to be provided to the Spark application. These will be added to the `PYTHONPATH` so they can be imported as modules. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/module1.py,hdfs:///Project/<project_name>/Resources/module2.py"` | `null` |
+| `jars` | string | HDFS path(s) to JAR files to be provided to the Spark application. These will be added to the classpath. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/lib1.jar,hdfs:///Project/<project_name>/Resources/lib2.jar"` | `null` |
+| `archives` | string | HDFS path(s) to archive files to be provided to the Spark application. Multiple files can be included in a single string, separated by commas. <br>Example: `"hdfs:///Project/<project_name>/Resources/archive1.zip,hdfs:///Project/<project_name>/Resources/archive2.tar.gz"` | `null` |

## Accessing project data

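Since each attachment field carries several paths in one string, a simple sanity check can catch a path landing in the wrong field. A sketch assuming the field names from the tables above; the extension rules and helper name are illustrative, not enforced by Hopsworks:

```python
# Sketch: sanity-checking Spark attachment fields by file suffix.
# EXPECTED and check_attachments are hypothetical helpers.

EXPECTED = {
    "pyFiles": (".py",),
    "jars": (".jar",),
    "archives": (".zip", ".tar.gz", ".tgz"),
}

def check_attachments(config):
    """Return field names whose comma-separated entries have an unexpected suffix."""
    bad = []
    for field, suffixes in EXPECTED.items():
        value = config.get(field) or ""   # fields default to null
        for path in filter(None, value.split(",")):
            if not path.endswith(suffixes):
                bad.append(field)
                break
    return bad

config = {
    "jars": "hdfs:///Project/my_project/Resources/lib1.jar",
    "archives": "hdfs:///Project/my_project/Resources/archive1.zip",
    "pyFiles": "hdfs:///Project/my_project/Resources/module1.txt",  # wrong suffix
}
print(check_attachments(config))  # ['pyFiles']
```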