@@ -31,21 +31,21 @@ The ML workspace is an all-in-one web-based IDE specialized for machine learning
## Highlights
-- 💫 Jupyter, JupyterLab, and Visual Studio Code web-based IDEs.
-- 🗃 Pre-installed with many popular data science libraries & tools.
-- 🖥 Full Linux desktop GUI accessible via web browser.
-- 🔀 Seamless Git integration optimized for notebooks.
-- 📈 Integrated hardware & training monitoring via Tensorboard & Netdata.
-- 🚪 Access from anywhere via Web, SSH, or VNC under a single port.
-- 🎛 Usable as remote kernel (Jupyter) or remote machine (VS Code) via SSH.
-- 🐳 Easy to deploy on Mac, Linux, and Windows via Docker.
+- 💫 Jupyter, JupyterLab, and Visual Studio Code web-based IDEs.
+- 🗃 Pre-installed with many popular data science libraries & tools.
+- 🖥 Full Linux desktop GUI accessible via web browser.
+- 🔀 Seamless Git integration optimized for notebooks.
+- 📈 Integrated hardware & training monitoring via Tensorboard & Netdata.
+- 🚪 Access from anywhere via Web, SSH, or VNC under a single port.
+- 🎛 Usable as remote kernel (Jupyter) or remote machine (VS Code) via SSH.
+- 🐳 Easy to deploy on Mac, Linux, and Windows via Docker.
## Getting Started
-
+
### Prerequisites
@@ -57,7 +57,7 @@ The workspace requires **Docker** to be installed on your machine ([📖 Install
Deploying a single workspace instance is as simple as:
```bash
-docker run -p 8080:8080 mltooling/ml-workspace:latest
+docker run -p 8080:8080 mltooling/ml-workspace:0.13.2
```
Voilà, that was easy! Docker will now pull the workspace image to your machine. This may take a few minutes, depending on your internet speed. Once the workspace is started, you can access it via http://localhost:8080.
@@ -69,11 +69,12 @@ To deploy a single instance for productive usage, we recommend to apply at least
```bash
docker run -d \
-p 8080:8080 \
- --name "ml-workspace" -v "${PWD}:/workspace" \
+ --name "ml-workspace" \
+ -v "${PWD}:/workspace" \
--env AUTHENTICATE_VIA_JUPYTER="mytoken" \
--shm-size 512m \
--restart always \
- mltooling/ml-workspace:latest
+ mltooling/ml-workspace:0.13.2
```
This command runs the container in the background (`-d`), mounts your current working directory into the `/workspace` folder (`-v`), secures the workspace via a provided token (`--env AUTHENTICATE_VIA_JUPYTER`), provides 512MB of shared memory (`--shm-size`) to prevent unexpected crashes (see [known issues section](#known-issues)), and keeps the container running even on system restarts (`--restart always`). You can find additional options for docker run [here](https://docs.docker.com/engine/reference/commandline/run/) and workspace configuration options in [the section below](#Configuration).
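As a quick sanity check after deployment (the container name `ml-workspace` comes from the `--name` flag above), you can verify that the container is up and inspect its startup output:

```bash
# List the running container and follow its startup logs:
docker ps --filter name=ml-workspace
docker logs ml-workspace
```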
@@ -128,7 +129,7 @@ The workspace provides a variety of configuration options that can be used by se
INCLUDE_TUTORIALS
-
If true, a selection of tutorial and introduction notebooks are added to the /workspace folder at container startup, but only in if the folder is empty.
+
If true, a selection of tutorial and introduction notebooks are added to the /workspace folder at container startup, but only if the folder is empty.
true
@@ -172,6 +173,8 @@ The default work directory within the container is `/workspace`, which is also t
We strongly recommend enabling authentication via one of the following two options. For both options, the user will be required to authenticate before accessing any of the pre-installed tools.
+> _Authentication applies only to tools accessed through the main workspace port (default: `8080`). This covers all preinstalled tools and the [Access Ports](#access-ports) feature. If you expose any other port of the container, please make sure to secure it with authentication as well!_
+
Details (click to expand...)
@@ -180,7 +183,7 @@ We strongly recommend enabling authentication via one of the following two optio
Activate the token-based authentication based on the authentication implementation of Jupyter via the `AUTHENTICATE_VIA_JUPYTER` variable:
```bash
-docker run -p 8080:8080 --env AUTHENTICATE_VIA_JUPYTER="mytoken" mltooling/ml-workspace:latest
+docker run -p 8080:8080 --env AUTHENTICATE_VIA_JUPYTER="mytoken" mltooling/ml-workspace:0.13.2
```
You can also use `` to let Jupyter generate a random token that is printed out in the container logs. A value of `true` will not set any token; instead, every request to any tool in the workspace is checked against the Jupyter instance to verify that the user is authenticated. This is used for tools like JupyterHub, which configures its own way of authentication.
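For instance, the `true` setting (which delegates the authentication check to the Jupyter instance, e.g., under JupyterHub) would be activated like this — a sketch based on the description above, to be run on a machine with Docker installed:

```bash
# No fixed token is set; every request is validated against the Jupyter instance:
docker run -p 8080:8080 --env AUTHENTICATE_VIA_JUPYTER="true" mltooling/ml-workspace:0.13.2
```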
@@ -190,7 +193,7 @@ You can also use `` to let Jupyter generate a random token that is pr
Activate the basic authentication via the `WORKSPACE_AUTH_USER` and `WORKSPACE_AUTH_PASSWORD` variable:
```bash
-docker run -p 8080:8080 --env WORKSPACE_AUTH_USER="user" --env WORKSPACE_AUTH_PASSWORD="pwd" mltooling/ml-workspace:latest
+docker run -p 8080:8080 --env WORKSPACE_AUTH_USER="user" --env WORKSPACE_AUTH_PASSWORD="pwd" mltooling/ml-workspace:0.13.2
```
The basic authentication is configured via the nginx proxy and might be more performant than the other option, since with `AUTHENTICATE_VIA_JUPYTER` every request to any tool in the workspace is checked against the Jupyter instance (based on the request cookies) to verify that the user is authenticated.
@@ -211,7 +214,7 @@ docker run \
-p 8080:8080 \
--env WORKSPACE_SSL_ENABLED="true" \
-v /path/with/certificate/files:/resources/ssl:ro \
- mltooling/ml-workspace:latest
+ mltooling/ml-workspace:0.13.2
```
If you want to host the workspace on a public domain, we recommend using [Let's Encrypt](https://letsencrypt.org/getting-started/) to get a trusted certificate for your domain. To use a certificate generated for your domain (e.g., via the [certbot](https://certbot.eff.org/) tool) with the workspace, the `privkey.pem` file corresponds to the `cert.key` file and the `fullchain.pem` file to the `cert.crt` file.
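As a sketch of that file mapping (the certbot live path and the `example.com` domain below are assumptions, not taken from this project):

```bash
# Copy the certbot output into the directory that gets mounted to /resources/ssl.
CERT_SRC="${CERT_SRC:-/etc/letsencrypt/live/example.com}"  # hypothetical certbot path
SSL_DIR="${SSL_DIR:-./ssl}"                                # directory to mount to /resources/ssl
mkdir -p "$SSL_DIR"
if [ -f "$CERT_SRC/privkey.pem" ]; then
  cp "$CERT_SRC/privkey.pem"   "$SSL_DIR/cert.key"   # private key  -> cert.key
  cp "$CERT_SRC/fullchain.pem" "$SSL_DIR/cert.crt"   # full chain   -> cert.crt
fi
```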
@@ -232,7 +235,7 @@ By default, the workspace container has no resource constraints and can use as m
For example, the following command restricts the workspace to only use a maximum of 8 CPUs, 16 GB of memory, and 1 GB of shared memory (see [Known Issues](#known-issues)):
```bash
-docker run -p 8080:8080 --cpus=8 --memory=16g --shm-size=1G mltooling/ml-workspace:latest
+docker run -p 8080:8080 --cpus=8 --memory=16g --shm-size=1G mltooling/ml-workspace:0.13.2
```
> 📖 _For more options and documentation on resource constraints, please refer to the [official docker guide](https://docs.docker.com/config/containers/resource_constraints/)._
@@ -250,10 +253,9 @@ In addition to the main workspace image (`mltooling/ml-workspace`), we provide o
#### Minimal Flavor
-
-
+
+
-
@@ -262,17 +264,16 @@ In addition to the main workspace image (`mltooling/ml-workspace`), we provide o
The minimal flavor (`mltooling/ml-workspace-minimal`) is our smallest image that contains most of the tools and features described in the [features section](#features) without most of the python libraries that are pre-installed in our main image. Any Python library or excluded tool can be installed manually during runtime by the user.
```bash
-docker run -p 8080:8080 mltooling/ml-workspace-minimal:latest
+docker run -p 8080:8080 mltooling/ml-workspace-minimal:0.13.2
```
#### R Flavor
-
-
+
+
-
@@ -281,26 +282,25 @@ docker run -p 8080:8080 mltooling/ml-workspace-minimal:latest
The R flavor (`mltooling/ml-workspace-r`) is based on our default workspace image and extends it with the R-interpreter, R-Jupyter kernel, RStudio server (access via `Open Tool -> RStudio`), and a variety of popular packages from the R ecosystem.
```bash
-docker run -p 8080:8080 mltooling/ml-workspace-r:latest
+docker run -p 8080:8080 mltooling/ml-workspace-r:0.12.1
```
#### Spark Flavor
-
-
+
+
-
Details (click to expand...)
-The Spark flavor (`mltooling/ml-workspace-spark`) is based on our R-flavor workspace image and extends it with the Spark-interpreter, Spark-Jupyter kernel (Apache Toree), Zeppelin Notebook (access via `Open Tool -> Zeppelin`), and a few additional python libraries & Jupyter extensions.
+The Spark flavor (`mltooling/ml-workspace-spark`) is based on our R-flavor workspace image and extends it with the Spark runtime, Spark-Jupyter kernel, Zeppelin Notebook (access via `Open Tool -> Zeppelin`), PySpark, Hadoop, Java Kernel, and a few additional libraries & Jupyter extensions.
```bash
-docker run -p 8080:8080 mltooling/ml-workspace-spark:latest
+docker run -p 8080:8080 mltooling/ml-workspace-spark:0.12.1
```
@@ -308,30 +308,29 @@ docker run -p 8080:8080 mltooling/ml-workspace-spark:latest
#### GPU Flavor
-
-
+
+
-
Details (click to expand...)
-> _Currently, the GPU-flavor only supports CUDA 10.1. Support for other CUDA versions might be added in the future._
+> _Currently, the GPU-flavor only supports CUDA 11.2. Support for other CUDA versions might be added in the future._
The GPU flavor (`mltooling/ml-workspace-gpu`) is based on our default workspace image and extends it with CUDA 11.2 and GPU-ready versions of various machine learning libraries (e.g., tensorflow, pytorch, cntk, jax). This GPU image has the following additional requirements for the system:
-- Nvidia Drivers for the GPUs. Drivers need to be CUDA 10.1 compatible, version `>= 418.39` ([📖 Instructions](https://github.com/NVIDIA/nvidia-docker/wiki/Frequently-Asked-Questions#how-do-i-install-the-nvidia-driver)).
+- Nvidia Drivers for the GPUs. Drivers need to be CUDA 11.2 compatible, version `>=460.32.03` ([📖 Instructions](https://github.com/NVIDIA/nvidia-docker/wiki/Frequently-Asked-Questions#how-do-i-install-the-nvidia-driver)).
- (Docker >= 19.03) Nvidia Container Toolkit ([📖 Instructions](https://github.com/NVIDIA/nvidia-docker/wiki/Installation-(Native-GPU-Support))).
```bash
-docker run -p 8080:8080 --gpus all mltooling/ml-workspace-gpu:latest
+docker run -p 8080:8080 --gpus all mltooling/ml-workspace-gpu:0.13.2
```
- (Docker < 19.03) Nvidia Docker 2.0 ([📖 Instructions](https://github.com/NVIDIA/nvidia-docker/wiki/Installation-(version-2.0))).
```bash
-docker run -p 8080:8080 --runtime nvidia --env NVIDIA_VISIBLE_DEVICES="all" mltooling/ml-workspace-gpu:latest
+docker run -p 8080:8080 --runtime nvidia --env NVIDIA_VISIBLE_DEVICES="all" mltooling/ml-workspace-gpu:0.13.2
```
The GPU flavor also comes with a few additional configuration options, as explained below:
@@ -383,17 +382,15 @@ For more information and documentation about ML Hub, please take a look at the [
## Support
-The ML Workspace project is maintained by [Lukas Masuch](https://twitter.com/LukasMasuch)
-and [Benjamin Räthlein](https://twitter.com/raethlein). Please understand that we won't be able
-to provide individual support via email. We also believe that help is much more
-valuable if it's shared publicly so that more people can benefit from it.
+This project is maintained by [Benjamin Räthlein](https://twitter.com/raethlein), [Lukas Masuch](https://twitter.com/LukasMasuch), and [Jan Kalkan](https://www.linkedin.com/in/jan-kalkan-b5390284/). Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly so that more people can benefit from it.
| Type | Channel |
| ------------------------ | ------------------------------------------------------ |
-| 🚨 **Bug Reports** | |
-| 🎁 **Feature Requests** | |
-| 👩💻 **Usage Questions** | |
-| 🗯 **General Discussion** | |
+| 🚨 **Bug Reports** | |
+| 🎁 **Feature Requests** | |
+| 👩💻 **Usage Questions** | |
+| 📢 **Announcements** | |
+| ❓ **Other Requests** | |
---
@@ -419,7 +416,7 @@ valuable if it's shared publicly so that more people can benefit from it.
The workspace is equipped with a selection of best-in-class open-source development tools to help with the machine learning workflow. Many of these tools can be started from the `Open Tool` menu from Jupyter (the main application of the workspace):
-
+
> _Within your workspace you have **full root & sudo privileges** to install any library or tool you need via terminal (e.g., `pip`, `apt-get`, `conda`, or `npm`). You can find more ways to extend the workspace within the [Extensibility](#extensibility) section._
@@ -427,13 +424,13 @@ The workspace is equipped with a selection of best-in-class open-source developm
[Jupyter Notebook](https://jupyter.org/) is a web-based interactive environment for writing and running code. The main building blocks of Jupyter are the file-browser, the notebook editor, and kernels. The file-browser provides an interactive file manager for all notebooks, files, and folders in the `/workspace` directory.
-
+
A new notebook can be created by clicking on the `New` drop-down button at the top of the list and selecting the desired language kernel.
> _You can spawn interactive **terminal** instances as well by selecting `New -> Terminal` in the file-browser._
-
+
The notebook editor enables users to author documents that include live code, markdown text, shell commands, LaTeX equations, interactive widgets, plots, and images. These notebook documents provide a complete and self-contained record of a computation that can be converted to various formats and shared with others.
@@ -447,13 +444,13 @@ The Notebook allows code to be run in a range of different programming languages
This workspace provides HTTP-based VNC access via [noVNC](https://github.com/novnc/noVNC). Thereby, you can access and work within the workspace with a fully-featured desktop GUI. To access this desktop GUI, go to `Open Tool`, select `VNC`, and click the `Connect` button. If you are asked for a password, use `vncpassword`.
-
+
Once you are connected, you will see a desktop GUI that allows you to install and use full-fledged web-browsers or any other tool that is available for Ubuntu. Within the `Tools` folder on the desktop, you will find a collection of install scripts that makes it straightforward to install some of the most commonly used development tools, such as Atom, PyCharm, R-Runtime, R-Studio, or Postman (just double-click on the script).
**Clipboard:** If you want to share the clipboard between your machine and the workspace, you can use the copy-paste functionality as described below:
-
+
> 💡 _**Long-running tasks:** Use the desktop GUI for long-running Jupyter executions. By running notebooks from the browser of your workspace desktop GUI, all output will be synchronized to the notebook even if you have disconnected your browser from the notebook._
@@ -461,17 +458,17 @@ Once you are connected, you will see a desktop GUI that allows you to install an
[Visual Studio Code](https://github.com/microsoft/vscode) (`Open Tool -> VS Code`) is an open-source lightweight but powerful code editor with built-in support for a variety of languages and a rich ecosystem of extensions. It combines the simplicity of a source code editor with powerful developer tooling, like IntelliSense code completion and debugging. The workspace integrates VS Code as a web-based application accessible through the browser, based on the awesome [code-server](https://github.com/cdr/code-server) project. It allows you to customize every feature to your liking and install any number of third-party extensions.
-
+
The workspace also provides a VS Code integration into Jupyter allowing you to open a VS Code instance for any selected folder, as shown below:
-
+
### JupyterLab
[JupyterLab](https://github.com/jupyterlab/jupyterlab) (`Open Tool -> JupyterLab`) is the next-generation user interface for Project Jupyter. It offers all the familiar building blocks of the classic Jupyter Notebook (notebook, terminal, text editor, file browser, rich outputs, etc.) in a flexible and powerful user interface. This JupyterLab instance comes pre-installed with a few helpful extensions, such as [jupyterlab-toc](https://github.com/jupyterlab/jupyterlab-toc), [jupyterlab-git](https://github.com/jupyterlab/jupyterlab-git), and [jupyterlab-tensorboard](https://github.com/chaoleili/jupyterlab_tensorboard).
-
+
### Git Integration
@@ -481,17 +478,17 @@ Version control is a crucial aspect of productive collaboration. To make this pr
For cloning repositories via `https`, we recommend to navigate to the desired root folder and to click on the `git` button as shown below:
-
+
This might ask for some required settings and, subsequently, opens [ungit](https://github.com/FredrikNoren/ungit), a web-based Git client with a clean and intuitive UI that makes it convenient to sync your code artifacts. Within ungit, you can clone any repository. If authentication is required, you will get asked for your credentials.
-
+
#### Push, Pull, Merge, and Other Git Actions
To commit and push a single notebook to a remote Git repository, we recommend to use the Git plugin integrated into Jupyter, as shown below:
-
+
For more advanced Git operations, we recommend to use [ungit](https://github.com/FredrikNoren/ungit). With ungit, you can do most of the common git actions such as push, pull, merge, branch, tag, checkout, and many more.
@@ -499,11 +496,11 @@ For more advanced Git operations, we recommend to use [ungit](https://github.com
Jupyter notebooks are great, but they are often huge files with a very specific JSON format. To enable seamless diffing and merging via Git, this workspace is pre-installed with [nbdime](https://github.com/jupyter/nbdime). Nbdime understands the structure of notebook documents and, therefore, automatically makes intelligent decisions when diffing and merging notebooks. If you have merge conflicts, nbdime will make sure that the notebook remains readable by Jupyter, as shown below:
-
+
Furthermore, the workspace comes pre-installed with [jupytext](https://github.com/mwouts/jupytext), a Jupyter plugin that reads and writes notebooks as plain text files. This allows you to open, edit, and run scripts or markdown files (e.g., `.py`, `.md`) as notebooks within Jupyter. In the following screenshot, we have opened a markdown file via Jupyter:
-
+
In combination with Git, jupytext enables a clear diff history and easy merging of version conflicts. With both of those tools, collaborating on Jupyter notebooks with Git becomes straightforward.
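Both tools can also be used from a workspace terminal. A small sketch (the file name `demo.py` is made up; the conversion and diff steps are guarded so they only run if the tools are on the `PATH`, as they are inside the workspace):

```bash
# Create a small script in jupytext's "percent" cell format:
printf '# %%%%\nprint("hello")\n' > demo.py
# Convert the script to a notebook with jupytext:
if command -v jupytext >/dev/null 2>&1; then
  jupytext --to notebook demo.py        # writes demo.ipynb
fi
# Produce a structural, notebook-aware diff with nbdime:
if command -v nbdiff >/dev/null 2>&1 && [ -f demo.ipynb ]; then
  nbdiff demo.ipynb demo.ipynb
fi
```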
@@ -511,11 +508,11 @@ In combination with Git, jupytext enables a clear diff history and easy merging
The workspace has a feature to share any file or folder with anyone via a token-protected link. To share data via a link, select any file or folder from the Jupyter directory tree and click on the share button as shown in the following screenshot:
-
+
This will generate a unique link protected via a token that gives anyone with the link access to view and download the selected data via the [Filebrowser](https://github.com/filebrowser/filebrowser) UI:
-
+
To deactivate or manage (e.g., provide edit permissions) shared links, open the Filebrowser via `Open Tool -> Filebrowser` and select `Settings->User Management`.
@@ -523,11 +520,11 @@ To deactivate or manage (e.g., provide edit permissions) shared links, open the
It is possible to securely access any workspace internal port by selecting `Open Tool -> Access Port`. With this feature, you are able to access a REST API or web application running inside the workspace directly with your browser. The feature enables developers to build, run, test, and debug REST APIs or web applications directly from the workspace.
-
+
If you want to use an HTTP client or share access to a given port, you can select the `Get shareable link` option. This generates a token-secured link that anyone with access to the link can use to access the specified port.
-> _The HTTP app requires to be resolved from a relative URL path or configure a base path (`/tools/PORT/`)._
+> _The HTTP app needs to resolve its resources from relative URL paths or be configured with a base path (`/tools/PORT/`). Tools made accessible this way are secured by the workspace's authentication system! If you decide to publish any other port of the container yourself instead of using this feature to make a tool accessible, please make sure to secure it via an authentication mechanism!_
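For example, a quick way to try this feature out (the port number `8000` is arbitrary) is to start a simple HTTP server from a workspace terminal and then open that port via `Open Tool -> Access Port`:

```bash
# Serves the current directory on workspace-internal port 8000;
# it then becomes reachable under the /tools/8000/ base path.
python -m http.server 8000
```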
@@ -544,7 +541,7 @@ If you want to use an HTTP client or share access to a given port, you can selec
SSH provides a powerful set of features that enables you to be more productive with your development tasks. You can easily set up a passwordless and secure SSH connection to a workspace by selecting `Open Tool -> SSH`. This generates a setup command that can be run on any Linux or Mac machine to configure the connection to the workspace. Alternatively, you can also download the setup script and run it (instead of using the command).
-
+
> _The setup script only runs on Mac and Linux. Windows is currently not supported._
@@ -575,10 +572,9 @@ Port tunneling is quite useful when you have started any server-based tool withi
- `8090`: Jupyter server.
- `8054`: VS Code server.
- `5901`: VNC server.
-- `3389`: RDP server.
- `22`: SSH server.
-You can find port information on all the tools in the [supervisor configuration](https://github.com/ml-tooling/ml-workspace/blob/master/resources/config/supervisord.conf).
+You can find port information on all the tools in the [supervisor configuration](https://github.com/ml-tooling/ml-workspace/blob/main/resources/supervisor/supervisord.conf).
> 📖 _For more information about port tunneling/forwarding, we recommend [this guide](https://www.everythingcli.org/ssh-tunnelling-for-fun-and-profit-local-vs-remote/)._
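Assuming the SSH setup from the [SSH Access](#ssh-access) section created a host alias (hypothetically named `my-workspace` here), one of the ports above could be tunneled like this:

```bash
# Forward the workspace-internal VNC port 5901 to localhost:5901
# (-N: do not open a remote shell, just keep the tunnel alive):
ssh -N -L 5901:localhost:5901 my-workspace
```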
@@ -639,7 +635,7 @@ Once the remote directory is mounted, you can interact with the remote file syst
### Remote Development
-The workspace can be integrated and used as a remote runtime (also known as remote kernel/machine/interpreter) for a variety of popular development tools and IDEs, such as Jupyter, VS Code, PyCharm, Colab, or Atom Hydrogen. Thereby, you can connect your favorite development tool running on your local machine to a remote machine for code execution. This enables a **local-quality development experience with remote-hosted compute resources**.
+The workspace can be integrated and used as a remote runtime (also known as remote kernel/machine/interpreter) for a variety of popular development tools and IDEs, such as Jupyter, VS Code, PyCharm, Colab, or Atom Hydrogen. Thereby, you can connect your favorite development tool running on your local machine to a remote machine for code execution. This enables a local-quality development experience with remote-hosted compute resources.
These integrations usually require a passwordless SSH connection from the local machine to the workspace. To set up an SSH connection, please follow the steps explained in the [SSH Access](#ssh-access) section.
@@ -657,13 +653,13 @@ In case you want to manually setup and manage remote kernels, use the [remote_ik
remote_ikernel manage --add \
--interface=ssh \
--kernel_cmd="ipython kernel -f {connection_file}" \
- --name="ml-server Py 3.6" \
+ --name="ml-server (Python)" \
--host="my-workspace"
```
You can use the remote_ikernel command line functionality to list (`remote_ikernel manage --show`) or delete (`remote_ikernel manage --delete `) remote kernel connections.
-
+
@@ -676,7 +672,7 @@ The Visual Studio Code [Remote - SSH](https://marketplace.visualstudio.com/item
2. Run the SSH setup script of a selected workspace as explained in the [SSH Access](#ssh-access) section.
3. Open the Remote-SSH panel in your local VS Code. All configured SSH connections should be automatically discovered. Just select any configured workspace connection you like to connect to as shown below:
-
+
> 📖 _You can find additional features and information about the Remote SSH extension in [this guide](https://code.visualstudio.com/docs/remote/ssh)._
@@ -686,18 +682,18 @@ The Visual Studio Code [Remote - SSH](https://marketplace.visualstudio.com/item
[Tensorboard](https://www.tensorflow.org/tensorboard) provides a suite of visualization tools to make it easier to understand, debug, and optimize your experiment runs. It includes logging features for scalar, histogram, model structure, embeddings, and text & image visualization. The workspace comes pre-installed with [jupyter_tensorboard extension](https://github.com/lspvic/jupyter_tensorboard) that integrates Tensorboard into the Jupyter interface with functionalities to start, manage, and stop instances. You can open a new instance for a valid logs directory, as shown below:
-
+
If you have opened a Tensorboard instance in a valid log directory, you will see the visualizations of your logged data:
-
+
> _Tensorboard can be used in combination with many other ML frameworks besides Tensorflow. By using the [tensorboardX](https://github.com/lanpa/tensorboardX) library, you can log from basically any Python-based library. Also, PyTorch has a direct Tensorboard integration as described [here](https://pytorch.org/docs/stable/tensorboard.html)._
If you prefer to see Tensorboard directly within your notebook, you can make use of the following **Jupyter magic**:
```
-%load_ext tensorboard.notebook
+%load_ext tensorboard
%tensorboard --logdir /workspace/path/to/logs
```
@@ -707,11 +703,11 @@ The workspace provides two pre-installed web-based tools to help developers duri
[Netdata](https://github.com/netdata/netdata) (`Open Tool -> Netdata`) is a real-time hardware and performance monitoring dashboard that visualizes the processes and services on your Linux system. It monitors metrics about CPU, GPU, memory, disks, networks, processes, and more.
-
+
[Glances](https://github.com/nicolargo/glances) (`Open Tool -> Glances`) is a web-based hardware monitoring dashboard as well and can be used as an alternative to Netdata.
-
+
> _Netdata and Glances will show you the hardware statistics for the entire machine on which the workspace container is running._
@@ -728,10 +724,10 @@ To run Python code as a job, you need to provide a path or URL to a code directo
#### Run code from version control system
-You can execute code directly from Git, Mercurial, Subversion, or Bazaar by using the pip-vcs format as described in [this guide](https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support). For example, to execute code from a [subdirectory](https://github.com/ml-tooling/ml-workspace/tree/master/resources/tests/ml-job) of a git repository, just run:
+You can execute code directly from Git, Mercurial, Subversion, or Bazaar by using the pip-vcs format as described in [this guide](https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support). For example, to execute code from a [subdirectory](https://github.com/ml-tooling/ml-workspace/tree/main/resources/tests/ml-job) of a git repository, just run:
```bash
-docker run --env EXECUTE_CODE="git+https://github.com/ml-tooling/ml-workspace.git#subdirectory=resources/tests/ml-job" mltooling/ml-workspace:latest
+docker run --env EXECUTE_CODE="git+https://github.com/ml-tooling/ml-workspace.git#subdirectory=resources/tests/ml-job" mltooling/ml-workspace:0.13.2
```
> 📖 _For additional information on how to specify branches, commits, or tags please refer to [this guide](https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support)._
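For instance, pinning the executed code to a branch uses the `@<ref>` part of the pip VCS format; the branch name `develop` below is hypothetical:

```bash
docker run --env EXECUTE_CODE="git+https://github.com/ml-tooling/ml-workspace.git@develop#subdirectory=resources/tests/ml-job" mltooling/ml-workspace:0.13.2
```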
@@ -741,7 +737,7 @@ docker run --env EXECUTE_CODE="git+https://github.com/ml-tooling/ml-workspace.gi
In the following example, we mount and execute the current working directory (expected to contain our code) into the `/workspace/ml-job/` directory of the workspace:
```bash
-docker run -v "${PWD}:/workspace/ml-job/" --env EXECUTE_CODE="/workspace/ml-job/" mltooling/ml-workspace:latest
+docker run -v "${PWD}:/workspace/ml-job/" --env EXECUTE_CODE="/workspace/ml-job/" mltooling/ml-workspace:0.13.2
```
#### Install Dependencies
@@ -767,7 +763,7 @@ python /resources/scripts/execute_code.py /path/to/your/job
It is also possible to embed your code directly into a custom job image, as shown below:
```dockerfile
-FROM mltooling/ml-workspace:latest
+FROM mltooling/ml-workspace:0.13.2
# Add job code to image
COPY ml-job /workspace/ml-job
@@ -786,13 +782,13 @@ CMD ["python", "/resources/docker-entrypoint.py", "--code-only"]
The workspace is pre-installed with many popular interpreters, data science libraries, and ubuntu packages:
-- **Interpreter:** Python 3.7 (Miniconda 3), Java 11 (OpenJDK), NodeJS 13, Scala, Perl 5
-- **Python libraries:** Tensorflow, Keras, Pytorch, Sklearn, XGBoost, MXNet, Theano, and [many more](https://github.com/ml-tooling/ml-workspace/tree/master/resources/libraries)
-- **Package Manager:** `conda`, `pip`, `npm`, `apt-get`, `yarn`, `sdk`, `gdebi`, `mvn` ...
+- **Interpreter:** Python 3.8 (Miniconda 3), NodeJS 14, Scala, Perl 5
+- **Python libraries:** Tensorflow, Keras, Pytorch, Sklearn, XGBoost, MXNet, Theano, and [many more](https://github.com/ml-tooling/ml-workspace/tree/main/resources/libraries)
+- **Package Manager:** `conda`, `pip`, `apt-get`, `npm`, `yarn`, `sdk`, `poetry`, `gdebi`...
-The full list of installed tools can be found within the [Dockerfile](https://github.com/ml-tooling/ml-workspace/blob/master/Dockerfile).
+The full list of installed tools can be found within the [Dockerfile](https://github.com/ml-tooling/ml-workspace/blob/main/Dockerfile).
-> _For every minor version release, we run vulnerability, virus, and security checks within the workspace using [vuls](https://vuls.io/), [safety](https://pyup.io/safety/), and [clamav](https://www.clamav.net/) to make sure that the workspace environment is as secure as possible._
+> _For every minor version release, we run vulnerability, virus, and security checks within the workspace using [safety](https://pyup.io/safety/), [clamav](https://www.clamav.net/), [trivy](https://github.com/aquasecurity/trivy), and [snyk via docker scan](https://docs.docker.com/engine/scan/) to make sure that the workspace environment is as secure as possible. We are committed to fix and prevent all high- or critical-severity vulnerabilities. You can find some up-to-date reports [here](https://github.com/ml-tooling/ml-workspace/tree/main/resources/reports)._
### Extensibility
@@ -803,7 +799,7 @@ The workspace provides a high degree of extensibility. Within the workspace, you
- **JupyterLab:** `File -> New -> Terminal`
- **VS Code:** `Terminal -> New Terminal`
-Additionally, pre-installed tools such as Jupyter, JupyterLab, and Visual Studio Code each provide their own rich ecosystem of extensions. The workspace also contains a [collection of installer scripts](https://github.com/ml-tooling/ml-workspace/tree/master/resources/tools) for many commonly used development tools or libraries (e.g., `PyCharm`, `Zeppelin`, `RStudio`, `Starspace`). You can find and execute all tool installers via `Open Tool -> Install Tool`. Those scripts can be also executed from the Desktop VNC (double-click on the script within the `Tools` folder on the Desktop VNC).
+Additionally, pre-installed tools such as Jupyter, JupyterLab, and Visual Studio Code each provide their own rich ecosystem of extensions. The workspace also contains a [collection of installer scripts](https://github.com/ml-tooling/ml-workspace/tree/main/resources/tools) for many commonly used development tools or libraries (e.g., `PyCharm`, `Zeppelin`, `RStudio`, `Starspace`). You can find and execute all tool installers via `Open Tool -> Install Tool`. Those scripts can also be executed from the Desktop VNC (double-click on the script within the `Tools` folder on the Desktop VNC).
Example (click to expand...)
@@ -832,8 +828,7 @@ The workspace can be extended in many ways at runtime, as explained [here](#exte
```dockerfile
# Extend from any of the workspace versions/flavors
-# Using latest as version is not recommended, please specify a specific version
-FROM mltooling/ml-workspace:latest
+FROM mltooling/ml-workspace:0.13.2
# Run your customizations, e.g.
RUN \
@@ -846,12 +841,12 @@ RUN \
Finally, use [docker build](https://docs.docker.com/engine/reference/commandline/build/) to build your customized Docker image.
-> 📖 _For a more comprehensive Dockerfile example, take a look at the [Dockerfile of the R-flavor](https://github.com/ml-tooling/ml-workspace/blob/master/r-flavor/Dockerfile)._
+> 📖 _For a more comprehensive Dockerfile example, take a look at the [Dockerfile of the R-flavor](https://github.com/ml-tooling/ml-workspace/blob/main/r-flavor/Dockerfile)._
-How to update a workspace container? (click to expand...)
+How to update a running workspace container? (click to expand...)
To update a running workspace instance to a more recent version, the running Docker container needs to be replaced with a new container based on the updated workspace image.
@@ -861,10 +856,20 @@ All data within the workspace that is not persisted to a mounted volume will be
Update Example (click to expand...)
-If the workspace is deployed via Docker (Kubernetes will have a different update process), you need to remove the existing container (via `docker rm`) and start a new one (via `docker run`) with the newer workspace image. Make sure to use the same configuration, volume, name, and port. For example, a workspace (image version `0.8.3`) was started with this command: `docker run -d -p 8080:8080 --name "ml-workspace" -v "/path/on/host:/workspace" --env AUTHENTICATE_VIA_JUPYTER="mytoken" --restart always mltooling/ml-workspace:0.8.3`) and needs to be updated to version `0.8.4`, you need to:
+If the workspace is deployed via Docker (Kubernetes will have a different update process), you need to remove the existing container (via `docker rm`) and start a new one (via `docker run`) with the newer workspace image. Make sure to use the same configuration, volume, name, and port. For example, a workspace (image version `0.8.7`) was started with this command:
+```bash
+docker run -d \
+ -p 8080:8080 \
+ --name "ml-workspace" \
+ -v "/path/on/host:/workspace" \
+ --env AUTHENTICATE_VIA_JUPYTER="mytoken" \
+ --restart always \
+ mltooling/ml-workspace:0.8.7
+```
+and needs to be updated to version `0.9.1`, you need to:
1. Stop and remove the running workspace container: `docker stop "ml-workspace" && docker rm "ml-workspace"`
-2. Start a new workspace container with the newer image and same configuration: `docker run -d -p 8080:8080 --name "ml-workspace" -v "/path/on/host:/workspace" --env AUTHENTICATE_VIA_JUPYTER="mytoken" --restart always mltooling/ml-workspace:latest`
+2. Start a new workspace container with the newer image and same configuration: `docker run -d -p 8080:8080 --name "ml-workspace" -v "/path/on/host:/workspace" --env AUTHENTICATE_VIA_JUPYTER="mytoken" --restart always mltooling/ml-workspace:0.9.1`
@@ -909,6 +914,165 @@ Using root-user (or users with sudo permission) within containers is generally n
+
+How to create and use a virtual environment? (click to expand...)
+
+The workspace comes preinstalled with various common tools to create isolated Python environments (virtual environments). The following sections provide a quick intro on how to use these tools within the workspace. You can find information on when to use which tool [here](https://stackoverflow.com/a/41573588). Please refer to the documentation of the given tool for additional usage information.
+
+**venv** (recommended):
+
+To create a virtual environment via [venv](https://docs.python.org/3/tutorial/venv.html), execute the following commands:
+
+```bash
+# Create environment in the working directory
+python -m venv my-venv
+# Activate environment in shell
+source ./my-venv/bin/activate
+# Optional: Create Jupyter kernel for this environment
+pip install ipykernel
+python -m ipykernel install --user --name=my-venv --display-name="my-venv ($(python --version))"
+# Optional: Close environment session
+deactivate
+```
+
+**pipenv** (recommended):
+
+To create a virtual environment via [pipenv](https://pipenv.pypa.io/en/latest/), execute the following commands:
+
+```bash
+# Create environment in the working directory
+pipenv install
+# Activate environment session in shell
+pipenv shell
+# Optional: Create Jupyter kernel for this environment
+pipenv install ipykernel
+python -m ipykernel install --user --name=my-pipenv --display-name="my-pipenv ($(python --version))"
+# Optional: Close environment session
+exit
+```
+
+**virtualenv**:
+
+To create a virtual environment via [virtualenv](https://virtualenv.pypa.io/en/latest/), execute the following commands:
+
+```bash
+# Create environment in the working directory
+virtualenv my-virtualenv
+# Activate environment session in shell
+source ./my-virtualenv/bin/activate
+# Optional: Create Jupyter kernel for this environment
+pip install ipykernel
+python -m ipykernel install --user --name=my-virtualenv --display-name="my-virtualenv ($(python --version))"
+# Optional: Close environment session
+deactivate
+```
+
+**conda**:
+
+To create a virtual environment via [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html), execute the following commands:
+
+```bash
+# Create environment (globally)
+conda create -n my-conda-env
+# Activate environment session in shell
+conda activate my-conda-env
+# Optional: Create Jupyter kernel for this environment
+python -m ipykernel install --user --name=my-conda-env --display-name="my-conda-env ($(python --version))"
+# Optional: Close environment session
+conda deactivate
+```
+
+**Tip: Shell Commands in Jupyter Notebooks:**
+
+If you install and use a virtual environment via a dedicated Jupyter Kernel and use shell commands within Jupyter (e.g. `!pip install matplotlib`), the wrong python/pip version will be used. To use the python/pip version of the selected kernel, do the following instead:
+
+```python
+import sys
+!{sys.executable} -m pip install matplotlib
+```
+
+
+
+
+How to install a different Python version? (click to expand...)
+
+The workspace provides three easy options to install different Python versions alongside the main Python instance: [pyenv](https://github.com/pyenv/pyenv), [pipenv](https://pipenv.pypa.io/en/latest/cli/) (recommended), and [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html).
+
+**pipenv** (recommended):
+
+To install a different python version (e.g. `3.7.8`) within the workspace via [pipenv](https://pipenv.pypa.io/en/latest/cli/), execute the following commands:
+
+```bash
+# Install python version
+pipenv install --python=3.7.8
+# Activate environment session in shell
+pipenv shell
+# Check python installation
+python --version
+# Optional: Create Jupyter kernel for this environment
+pipenv install ipykernel
+python -m ipykernel install --user --name=my-pipenv --display-name="my-pipenv ($(python --version))"
+# Optional: Close environment session
+exit
+```
+
+**pyenv**:
+
+To install a different python version (e.g. `3.7.8`) within the workspace via [pyenv](https://github.com/pyenv/pyenv), execute the following commands:
+
+```bash
+# Install python version
+pyenv install 3.7.8
+# Make globally accessible
+pyenv global 3.7.8
+# Activate python version in shell
+pyenv shell 3.7.8
+# Check python installation
+python3.7 --version
+# Optional: Create Jupyter kernel for this python version
+python3.7 -m pip install ipykernel
+python3.7 -m ipykernel install --user --name=my-pyenv-3.7.8 --display-name="my-pyenv (Python 3.7.8)"
+```
+
+**conda**:
+
+To install a different python version (e.g. `3.7.8`) within the workspace via [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html), execute the following commands:
+
+```bash
+# Create environment with python version
+conda create -n my-conda-3.7 python=3.7.8
+# Activate environment session in shell
+conda activate my-conda-3.7
+# Check python installation
+python --version
+# Optional: Create Jupyter kernel for this python version
+pip install ipykernel
+python -m ipykernel install --user --name=my-conda-3.7 --display-name="my-conda ($(python --version))"
+# Optional: Close environment session
+conda deactivate
+```
+
+**Tip: Shell Commands in Jupyter Notebooks:**
+
+If you install and use another Python version via a dedicated Jupyter Kernel and use shell commands within Jupyter (e.g. `!pip install matplotlib`), the wrong python/pip version will be used. To use the python/pip version of the selected kernel, do the following instead:
+
+```python
+import sys
+!{sys.executable} -m pip install matplotlib
+```
+
+
+
+
+Can I publish any other than the default port to access a tool inside the container? (click to expand...)
+You can do this, but please be aware that such a port is then not protected by the workspace's authentication mechanism! For security reasons, we therefore highly recommend using the Access Ports functionality of the workspace instead.
+
+
+
+System and Tool Translations (click to expand...)
+If you want to configure a language other than English in your workspace and some tools are not translated properly, have a look at this issue. Try commenting out the 'exclude translations' line in `/etc/dpkg/dpkg.cfg.d/excludes` and re-install / reconfigure the affected package.
+
+
---
@@ -922,63 +1086,102 @@ Using root-user (or users with sudo permission) within containers is generally n
Certain desktop tools (e.g., recent versions of [Firefox](https://github.com/jlesage/docker-firefox#increasing-shared-memory-size)) or libraries (e.g., Pytorch - see Issues: [1](https://github.com/pytorch/pytorch/issues/2244), [2](https://github.com/pytorch/pytorch/issues/1355)) might crash if the shared memory size (`/dev/shm`) is too small. The default shared memory size of Docker is 64MB, which might not be enough for a few tools. You can provide a higher shared memory size via the `shm-size` docker run option:
```bash
-docker run --shm-size=2G mltooling/ml-workspace:latest
+docker run --shm-size=2G mltooling/ml-workspace:0.13.2
```
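To verify the effective shared-memory size from inside a running container, you can inspect the filesystem mounted at `/dev/shm`; a small sketch using only the Python standard library (the function name is ours):

```python
import os

def mount_size_bytes(path: str = "/dev/shm") -> int:
    """Return the total size of the filesystem mounted at the given path, in bytes."""
    stats = os.statvfs(path)
    return stats.f_frsize * stats.f_blocks
```

Running `mount_size_bytes()` inside a container started with `--shm-size=2G` should report roughly `2 * 1024**3` bytes.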
----
+
-
+Multiprocessing code is unexpectedly slow (click to expand...)
-## Contributors
+In general, the performance of running code within Docker is [nearly identical](https://stackoverflow.com/questions/21889053/what-is-the-runtime-performance-cost-of-a-docker-container) compared to running it directly on the machine. However, in case you have limited the container's CPU quota (as explained in [this section](#limit-memory--cpu)), the container can still see the full count of CPU cores available on the machine and there is no technical way to prevent this. Many libraries and tools will use the full CPU count (e.g., via `os.cpu_count()`) to set the number of threads used for multiprocessing/-threading. This might cause the program to start more threads/processes than it can efficiently handle with the available CPU quota, which can tremendously slow down the overall performance. Therefore, it is important to set the available CPU count or the maximum number of threads explicitly to the configured CPU quota. The workspace provides capabilities to detect the number of available CPUs automatically, which are used to configure a variety of common libraries via environment variables such as `OMP_NUM_THREADS` or `MKL_NUM_THREADS`. It is also possible to explicitly set the number of available CPUs at container startup via the `MAX_NUM_THREADS` environment variable (see [configuration section](https://github.com/ml-tooling/ml-workspace#configuration-options)). The same environment variable can also be used to get the number of available CPUs at runtime.
-[](https://sourcerer.io/fame/LukasMasuch/ml-tooling/ml-workspace/links/0)[](https://sourcerer.io/fame/LukasMasuch/ml-tooling/ml-workspace/links/1)[](https://sourcerer.io/fame/LukasMasuch/ml-tooling/ml-workspace/links/2)[](https://sourcerer.io/fame/LukasMasuch/ml-tooling/ml-workspace/links/3)[](https://sourcerer.io/fame/LukasMasuch/ml-tooling/ml-workspace/links/4)[](https://sourcerer.io/fame/LukasMasuch/ml-tooling/ml-workspace/links/5)[](https://sourcerer.io/fame/LukasMasuch/ml-tooling/ml-workspace/links/6)[](https://sourcerer.io/fame/LukasMasuch/ml-tooling/ml-workspace/links/7)
+Even though the automatic configuration capabilities of the workspace will fix a variety of inefficiencies, we still recommend configuring the number of available CPUs with all libraries explicitly. For example:
----
+```python
+import os
+# Fall back to the machine's core count if the variable is not set
+MAX_NUM_THREADS = int(os.getenv("MAX_NUM_THREADS", os.cpu_count()))
-
+# Set in pytorch
+import torch
+torch.set_num_threads(MAX_NUM_THREADS)
-## Contribution
+# Set in tensorflow
+import tensorflow as tf
+config = tf.ConfigProto(
+ device_count={"CPU": MAX_NUM_THREADS},
+ inter_op_parallelism_threads=MAX_NUM_THREADS,
+ intra_op_parallelism_threads=MAX_NUM_THREADS,
+)
+tf_session = tf.Session(config=config)
-- Pull requests are encouraged and always welcome. Read [`CONTRIBUTING.md`](https://github.com/ml-tooling/ml-workspace/tree/master/CONTRIBUTING.md) and check out [help-wanted](https://github.com/ml-tooling/ml-workspace/issues?utf8=%E2%9C%93&q=is%3Aopen+is%3Aissue+label%3A"help+wanted"+sort%3Areactions-%2B1-desc+) issues.
-- Submit Github issues for any [feature enhancements](https://github.com/ml-tooling/ml-workspace/issues/new?assignees=&labels=feature-request&template=02_feature-request.md&title=), [bugs](https://github.com/ml-tooling/ml-workspace/issues/new?assignees=&labels=bug&template=01_bug-report.md&title=), or [documentation](https://github.com/ml-tooling/ml-workspace/issues/new?assignees=&labels=enhancement%2C+docs&template=03_documentation.md&title=) problems.
-- By participating in this project, you agree to abide by its [Code of Conduct](https://github.com/ml-tooling/ml-workspace/tree/master/CODE_OF_CONDUCT.md).
+# Set session for keras
+import keras.backend as K
+K.set_session(tf_session)
-
+# Set in sklearn estimator
+from sklearn.linear_model import LogisticRegression
+LogisticRegression(n_jobs=MAX_NUM_THREADS).fit(X, y)
+
+# Set for multiprocessing pool
+from multiprocessing import Pool
-Development instructions for contributors (click to expand...)
+with Pool(MAX_NUM_THREADS) as pool:
+ results = pool.map(my_func, lst)  # my_func: the function applied to each item
+```
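Since `MAX_NUM_THREADS` might not be set when the same code runs outside the workspace, a small helper with a fallback keeps such snippets portable (a sketch; the helper name is ours):

```python
import os

def get_max_num_threads() -> int:
    """Return the configured CPU quota, falling back to the visible core count."""
    value = os.getenv("MAX_NUM_THREADS")
    return int(value) if value else (os.cpu_count() or 1)
```

Inside the workspace, this returns the configured CPU quota; on a plain machine, it simply returns the number of cores.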
-### Build
+
-Execute this command in the project root folder to build the docker container:
+
-```bash
-python build.py --version={MAJOR.MINOR.PATCH-TAG}
+Nginx terminates with SIGILL core dumped error (click to expand...)
+
+If you encounter the following error within the container logs when starting the workspace, it will most likely not be possible to run the workspace on your hardware:
+
+```
+exited: nginx (terminated by SIGILL (core dumped); not expected)
```
-The version is optional and should follow the [Semantic Versioning](https://semver.org/) standard (MAJOR.MINOR.PATCH). For additional script options:
+The OpenResty/Nginx binary package used within the workspace requires a CPU with `SSE4.2` support (see [this issue](https://github.com/openresty/openresty/issues/267#issuecomment-309296900)). Unfortunately, some older CPUs do not support `SSE4.2` and, therefore, will not be able to run the workspace container. On Linux, you can check whether your CPU supports `SSE4.2` by looking for the flag in the `flags` section of `cat /proc/cpuinfo`. If you encounter this problem, feel free to notify us by commenting on the following issue: [#30](https://github.com/ml-tooling/ml-workspace/issues/30).
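On Linux, this check can be scripted; a minimal sketch (the function name `has_sse42` is ours) that parses the flags section of `/proc/cpuinfo`:

```python
def has_sse42(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo output lists sse4_2."""
    return any(
        "sse4_2" in line.split(":", 1)[1].split()
        for line in cpuinfo_text.splitlines()
        if line.startswith("flags") and ":" in line
    )
```

For example, `has_sse42(open("/proc/cpuinfo").read())` tells you whether the host CPU advertises the flag.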
+
+
+
+---
+
+
+
+## Contribution
+
+- Pull requests are encouraged and always welcome. Read our [contribution guidelines](https://github.com/ml-tooling/ml-workspace/tree/main/CONTRIBUTING.md) and check out [help-wanted](https://github.com/ml-tooling/ml-workspace/issues?utf8=%E2%9C%93&q=is%3Aopen+is%3Aissue+label%3A"help+wanted"+sort%3Areactions-%2B1-desc+) issues.
+- Submit Github issues for any [feature request and enhancement](https://github.com/ml-tooling/ml-workspace/issues/new?assignees=&labels=feature&template=02_feature-request.md&title=), [bugs](https://github.com/ml-tooling/ml-workspace/issues/new?assignees=&labels=bug&template=01_bug-report.md&title=), or [documentation](https://github.com/ml-tooling/ml-workspace/issues/new?assignees=&labels=documentation&template=03_documentation.md&title=) problems.
+- By participating in this project, you agree to abide by its [Code of Conduct](https://github.com/ml-tooling/ml-workspace/blob/main/.github/CODE_OF_CONDUCT.md).
+- The [development section](#development) below contains information on how to build and test the project after you have implemented some changes.
+
+## Development
+
+> _**Requirements**: [Docker](https://docs.docker.com/get-docker/) and [Act](https://github.com/nektos/act#installation) are required to be installed on your machine to execute the build process._
+
+To simplify the process of building this project from scratch, we provide build scripts, based on [universal-build](https://github.com/ml-tooling/universal-build), that run all necessary steps (build, test, and release) within a containerized environment. To build and test your changes, execute the following command in the project root folder:
```bash
-python build.py --help
+act -b -j build
```
-### Deploy
-
-Execute this command in the project root folder to push the container to the configured docker registry:
+Under the hood, it uses the `build.py` files in this repo, which are based on the [universal-build library](https://github.com/ml-tooling/universal-build). If you want to build locally without Act, you can also execute this command in the project root folder to build the Docker container:
```bash
-python build.py --deploy --version={MAJOR.MINOR.PATCH-TAG}
+python build.py --make
```
-The version has to be provided. The version format should follow the [Semantic Versioning](https://semver.org/) standard (MAJOR.MINOR.PATCH). For additional script options:
+For additional script options:
```bash
python build.py --help
```
-
+Refer to our [contribution guides](https://github.com/ml-tooling/ml-workspace/blob/main/CONTRIBUTING.md#development-instructions) for more detailed information on our build scripts and development process.
---
-Licensed **Apache 2.0**. Created and maintained with ❤️ by developers from SAP in Berlin.
+Licensed **Apache 2.0**. Created and maintained with ❤️ by developers from Berlin.
diff --git a/SECURITY.md b/SECURITY.md
new file mode 100644
index 00000000..716f2430
--- /dev/null
+++ b/SECURITY.md
@@ -0,0 +1,18 @@
+# Security Policy
+
+## Supported Versions
+
+We release security patches for the following versions:
+
+| Version | Supported |
+| ------- | ------------------ |
+| >= 0.13.x | :white_check_mark: |
+| < 0.13 | :x: |
+
+## Reporting a Vulnerability
+
+If you wish to report a security vulnerability -- thank you! -- we ask that you follow this process:
+
+**Please do not report security vulnerabilities through public GitHub issues or disclose the vulnerability publicly until a fix is released!**
+
+Please report (suspected) security vulnerabilities to **[security@mltooling.org](mailto:security@mltooling.org)**. Please add a description of the issue, the steps you took to create the issue, affected versions, and if known, mitigations or fixes for the issue. We will evaluate the vulnerability and, if necessary, release a fix or mitigating steps to address it. We will contact you to let you know the outcome, and will credit you in the report. Once we have either a) published a fix, or b) declined to address the vulnerability for whatever reason, you are free to publicly disclose it.
diff --git a/build.py b/build.py
index 42347562..ce839adf 100644
--- a/build.py
+++ b/build.py
@@ -1,121 +1,145 @@
-import os, sys
-import subprocess
import argparse
import datetime
+import subprocess
-parser = argparse.ArgumentParser()
-parser.add_argument('--name', help='name of docker container', default="ml-workspace")
-parser.add_argument('--version', help='version tag of docker container', default="latest")
-parser.add_argument('--deploy', help='deploy docker container to remote', action='store_true')
-parser.add_argument('--flavor', help='flavor (full, light, minimal) used for docker container', default='full')
+import docker
+from universal_build import build_utils
+from universal_build.helpers import build_docker
REMOTE_IMAGE_PREFIX = "mltooling/"
+COMPONENT_NAME = "ml-workspace"
+FLAG_FLAVOR = "flavor"
+
+parser = argparse.ArgumentParser(add_help=False)
+parser.add_argument(
+ "--" + FLAG_FLAVOR,
+ help="Flavor (full, light, minimal, gpu) used for docker container",
+ default="all",
+)
+
+args = build_utils.parse_arguments(argument_parser=parser)
+
+VERSION = str(args.get(build_utils.FLAG_VERSION))
+docker_image_prefix = args.get(build_docker.FLAG_DOCKER_IMAGE_PREFIX)
+
+if not docker_image_prefix:
+ docker_image_prefix = REMOTE_IMAGE_PREFIX
+
+if not args.get(FLAG_FLAVOR):
+ args[FLAG_FLAVOR] = "all"
+
+flavor = str(args[FLAG_FLAVOR]).lower().strip()
+
+if flavor == "all":
+ args[FLAG_FLAVOR] = "minimal"
+ build_utils.build(".", args)
+
+ args[FLAG_FLAVOR] = "light"
+ build_utils.build(".", args)
-args, unknown = parser.parse_known_args()
-if unknown:
- print("Unknown arguments "+str(unknown))
-
-# Wrapper to print out command
-def call(command):
- print("Executing: "+command)
- return subprocess.call(command, shell=True)
-
-# calls build scripts in every module with same flags
-def build(module="."):
-
- if not os.path.isdir(module):
- print("Could not find directory for " + module)
- sys.exit(1)
-
- build_command = "python build.py"
-
- if args.version:
- build_command += " --version=" + str(args.version)
-
- if args.deploy:
- build_command += " --deploy"
-
- if args.flavor:
- build_command += " --flavor=" + str(args.flavor)
-
- working_dir = os.path.dirname(os.path.realpath(__file__))
- full_command = "cd " + module + " && " + build_command + " && cd " + working_dir
- print("Building " + module + " with: " + full_command)
- failed = call(full_command)
- if failed:
- print("Failed to build module " + module)
- sys.exit(1)
-
-if not args.flavor:
- args.flavor = "full"
-
-args.flavor = str(args.flavor).lower()
-
-if args.flavor == "all":
- args.flavor = "full"
- build()
- args.flavor = "light"
- build()
- args.flavor = "minimal"
- build()
- args.flavor = "r"
- build()
- args.flavor = "spark"
- build()
- args.flavor = "gpu"
- build()
- sys.exit(0)
+ args[FLAG_FLAVOR] = "full"
+ build_utils.build(".", args)
+
+ args[FLAG_FLAVOR] = "gpu"
+ build_utils.build("gpu-flavor", args)
+
+ build_utils.exit_process(0)
# unknown flavor -> try to build from subdirectory
-if args.flavor not in ["full", "minimal", "light"]:
+if flavor not in ["full", "minimal", "light"]:
# assume that flavor has its own directory with build.py
- build(args.flavor + "-flavor")
- sys.exit(0)
-
-service_name = os.path.basename(os.path.dirname(os.path.realpath(__file__)))
-if args.name:
- service_name = args.name
+ build_utils.build(flavor + "-flavor", args)
+ build_utils.exit_process(0)
+docker_image_name = COMPONENT_NAME
# Build full image without suffix if the flavor is not minimal or light
-if args.flavor in ["minimal", "light"]:
- service_name += "-" + args.flavor
+if flavor in ["minimal", "light"]:
+ docker_image_name += "-" + flavor
# docker build
git_rev = "unknown"
try:
- git_rev = subprocess.check_output(["git", "rev-parse", "--short", "HEAD"]).decode('ascii').strip()
-except:
+ git_rev = (
+ subprocess.check_output(["git", "rev-parse", "--short", "HEAD"])
+ .decode("ascii")
+ .strip()
+ )
+except Exception:
pass
build_date = datetime.datetime.utcnow().isoformat("T") + "Z"
try:
- build_date = subprocess.check_output(['date', '-u', '+%Y-%m-%dT%H:%M:%SZ']).decode('ascii').strip()
-except:
+ build_date = (
+ subprocess.check_output(["date", "-u", "+%Y-%m-%dT%H:%M:%SZ"])
+ .decode("ascii")
+ .strip()
+ )
+except Exception:
pass
vcs_ref_build_arg = " --build-arg ARG_VCS_REF=" + str(git_rev)
build_date_build_arg = " --build-arg ARG_BUILD_DATE=" + str(build_date)
-flavor_build_arg = " --build-arg ARG_WORKSPACE_FLAVOR=" + str(args.flavor)
-version_build_arg = " --build-arg ARG_WORKSPACE_VERSION=" + str(args.version)
-
-versioned_image = service_name+":"+str(args.version)
-latest_image = service_name+":latest"
-failed = call("docker build -t "+ versioned_image + " -t " + latest_image + " "
- + version_build_arg + " " + flavor_build_arg+ " " + vcs_ref_build_arg + " " + build_date_build_arg + " ./")
-
-if failed:
- print("Failed to build container")
- sys.exit(1)
-
-remote_versioned_image = REMOTE_IMAGE_PREFIX + versioned_image
-call("docker tag " + versioned_image + " " + remote_versioned_image)
-
-remote_latest_image = REMOTE_IMAGE_PREFIX + latest_image
-call("docker tag " + latest_image + " " + remote_latest_image)
-
-if args.deploy:
- call("docker push " + remote_versioned_image)
-
- if "SNAPSHOT" not in args.version:
- # do not push SNAPSHOT builds as latest version
- call("docker push " + remote_latest_image)
+flavor_build_arg = " --build-arg ARG_WORKSPACE_FLAVOR=" + str(flavor)
+version_build_arg = " --build-arg ARG_WORKSPACE_VERSION=" + VERSION
+
+if args[build_utils.FLAG_MAKE]:
+ build_args = (
+ version_build_arg
+ + " "
+ + flavor_build_arg
+ + " "
+ + vcs_ref_build_arg
+ + " "
+ + build_date_build_arg
+ )
+
+ completed_process = build_docker.build_docker_image(
+ docker_image_name, version=VERSION, build_args=build_args
+ )
+ if completed_process.returncode > 0:
+ build_utils.exit_process(1)
+
+if args[build_utils.FLAG_TEST]:
+ workspace_name = f"workspace-test-{flavor}"
+ workspace_port = "8080"
+ client = docker.from_env()
+ container = client.containers.run(
+ f"{docker_image_name}:{VERSION}",
+ name=workspace_name,
+ environment={
+ "WORKSPACE_NAME": workspace_name,
+ "WORKSPACE_ACCESS_PORT": workspace_port,
+ },
+ detach=True,
+ )
+
+ container.reload()
+ container_ip = container.attrs["NetworkSettings"]["Networks"]["bridge"]["IPAddress"]
+
+ completed_process = build_utils.run(
+ f"docker exec --env WORKSPACE_IP={container_ip} {workspace_name} pytest '/resources/tests'",
+ exit_on_error=False,
+ )
+
+ container.remove(force=True)
+ if completed_process.returncode > 0:
+ build_utils.exit_process(1)
+
+
+if args[build_utils.FLAG_RELEASE]:
+ # Bump all versions in some files
+ previous_version = build_utils.get_latest_version()
+ if previous_version:
+ build_utils.replace_in_files(
+ previous_version,
+ VERSION,
+ file_paths=["./README.md", "./deployment/google-cloud-run/Dockerfile"],
+ regex=False,
+ exit_on_error=True,
+ )
+
+ build_docker.release_docker_image(
+ docker_image_name,
+ VERSION,
+ docker_image_prefix,
+ )
diff --git a/docs/update-workspace-image.md b/docs/update-workspace-image.md
index e68b3e4f..045d4fc1 100644
--- a/docs/update-workspace-image.md
+++ b/docs/update-workspace-image.md
@@ -1,20 +1,23 @@
# Workspace Update Process
-We plan to do a full workspace image update (all libraries and tools) about every three month. The full update involves quiet a bit of manual work as documented below:
+We plan to do a full workspace image update (all libraries and tools) about every three months. The full update involves quite a bit of manual work as documented below:
1. Refactor incubation zone:
+
- Move ubuntu packages to basics or gui-tools section.
- Move python libraries to requirement files in `resources/libraries`.
- Refactor other installs.
-2. Update core (proecss) tools and interpreters:
+2. Update core (process) tools and interpreters:
+
- Tini: [latest release](https://github.com/krallin/tini/releases/latest)
- OpenResty: [latest release](https://openresty.org/en/download.html)
- Miniconda: [latest release](https://repo.continuum.io/miniconda/), [python version](https://anaconda.org/conda-forge/python)
- Node.js: [latest release](https://nodejs.org/en/download/current/)
3. Update core (gui) tools:
- - TigetVNC: [latest release](https://dl.bintray.com/tigervnc/stable/)
+
+ - TigerVNC: [latest release](https://dl.bintray.com/tigervnc/stable/)
- noVNC: [latest release](https://github.com/novnc/noVNC/releases/latest)
- Websockify: [latest release](https://github.com/novnc/websockify/releases/latest)
- VS Code Server: [latest release](https://github.com/cdr/code-server/releases/latest)
@@ -22,6 +25,7 @@ We plan to do a full workspace image update (all libraries and tools) about ever
- FileBrowser: [latest release](https://github.com/filebrowser/filebrowser/releases/latest)
4. Update conda packages:
+
- Jupyter Notebook: [latest release](https://anaconda.org/search?q=notebook&sort=ndownloads&sort_order=1&reverse=true)
- JupyterLab: [latest release](https://anaconda.org/search?q=jupyterlab&sort=ndownloads&sort_order=1&reverse=true)
- IPython: [latest release](https://anaconda.org/search?q=ipython&sort=ndownloads&sort_order=1&reverse=true)
@@ -29,71 +33,85 @@ We plan to do a full workspace image update (all libraries and tools) about ever
- PyTorch: [latest release](https://anaconda.org/search?q=pytorch&sort=ndownloads&sort_order=1&reverse=true)
5. Update VS-code extensions:
+
- python: [latest release](https://github.com/microsoft/vscode-python/releases/latest)
- java: [latest release](https://github.com/redhat-developer/vscode-java/releases)
- - git-lens: [latest release](https://github.com/eamodio/vscode-gitlens/releases/latest)
+ - prettier: [latest release](https://github.com/prettier/prettier-vscode/releases/latest)
+ - jupyter: [latest release](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter)
- code-runner: [latest release](https://github.com/formulahendry/vscode-code-runner/releases/latest)
- eslint: [latest release](https://marketplace.visualstudio.com/items?itemName=dbaeumer.vscode-eslint)
- - markdownlint: [latest release](https://marketplace.visualstudio.com/items?itemName=DavidAnson.vscode-markdownlint)
- - remote-ssh: [latest release](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-ssh)
6. Update tool installer scripts:
+
- intellij.sh: [latest release](https://www.jetbrains.com/idea/download/other.html)
- pycharm.sh: [latest release](https://www.jetbrains.com/pycharm/download/other.html)
- nteract.sh: [latest release](https://github.com/nteract/nteract/releases/latest)
- - pillow-simd.sh: [latest release](https://pypi.org/project/Pillow-SIMD/#history)
- - rstudio-server.sh: [latest release](https://www.rstudio.com/products/rstudio/download-server/)
- - rstudio-desktop.sh: [latest release](https://www.rstudio.com/products/rstudio/download/#download)
+ - r-runtime.sh: [latest release](https://www.rstudio.com/products/rstudio/download-server/)
- sqlectron.sh: [latest release](https://github.com/sqlectron/sqlectron-gui/releases/latest)
- zeppelin.sh: [latest release](http://zeppelin.apache.org/download.html)
- robo3t.sh: [latest release](https://github.com/Studio3T/robomongo/releases/latest)
- metabase.sh: [latest release](https://github.com/metabase/metabase/releases/latest)
- fasttext.sh: [latest release](https://github.com/facebookresearch/fastText/releases/latest)
- - kubernetes-client.sh: [kube-prompt release](https://github.com/c-bata/kube-prompt/releases/latest)
- - guacamole.sh: [latest relesase](https://guacamole.apache.org/releases/)
+ - kubernetes-utils.sh: [kube-prompt release](https://github.com/c-bata/kube-prompt/releases/latest), [conftest release](https://github.com/open-policy-agent/conftest/releases), [yq release](https://github.com/mikefarah/yq/releases)
+ - portainer.sh: [latest release](https://github.com/portainer/portainer/releases/latest)
+ - rapids-gpu.sh: [latest release](https://rapids.ai/)
+
+7. Update `minimal` and `light` flavor Python libraries:
-7. Update `minimmal` and `light` flavor python libraries:
- Update requirement files using [piprot](https://github.com/sesh/piprot), [pur](https://github.com/alanhamlett/pip-update-requirements), or [pip-upgrader](https://github.com/simion/pip-upgrader):
- `piprot ./resources/libraries/requirements-minimal.txt`
- `piprot ./resources/libraries/requirements-light.txt`
- [pur](https://github.com/alanhamlett/pip-update-requirements) example: `pur -i -r ./resources/libraries/requirements-minimal.txt`
8. Build and test `minimal` flavor:
- - Build minimal workspace flavor via `python build.py --flavor=minimal`
+
+ - Build minimal workspace flavor via `python build.py --make --flavor=minimal`
- Run workspace container and check startup logs
- Check/Compare layer sizes of new image with previous version (via Portainer)
- Check Image Labels (via Portainer)
- Check folder sizes via `Disk Usage Analyzer` within the Desktop VNC
   - Check all webtools/features (just open and see if running):
- Jupyter, VNC, JupyterLab, VS-Code, Ungit, Netdata, Glances, Filebrowser, Access Port, SSH Access, Git Integration, Tensorboard
+ - Check if novnc settings are applied in settings menu: reconnect = True, scaling = remote, and correct websockify path
+ - Check if VS Code settings are applied: the settings file in VS Code should contain the pre-configured defaults
9. Build and test `light` flavor:
- - Build light workspace flavor via `python build.py --flavor=light`
- - Run workspace container and check startup logs
- - Check/Compare layer sizes of new image with previous version (via Portainer)
- - Check folder sizes via `Disk Usage Analyzer` within the Desktop VNC
- - Run `evaluate-python-libraries.ipynb` notebook to update `requirements-full.txt`
+
+ - Build light workspace flavor via `python build.py --make --flavor=light`
+ - Run workspace container and check startup logs
+ - Check/Compare layer sizes of new image with previous version (via Portainer)
+ - Check folder sizes via `Disk Usage Analyzer` within the Desktop VNC
+ - Run `/resources/tests/evaluate-py-libraries.ipynb` notebook to update `requirements-full.txt`
+ - Run `/resources/tests/test-tool-installers.ipynb` notebook to test installer scripts.
10. Build and test `full` flavor:
- - Build main workspace flavor via `python build.py --flavor=full`
+
+ - Build main workspace flavor via `python build.py --make --flavor=full`
- Deploy new workspace image and check startup logs
- Check/Compare layer sizes of new image with previous version (via Portainer)
- Check Image Labels (via Portainer)
- Check folder sizes via `Disk Usage Analyzer` within the Desktop VNC
  - Check all webtools/features (just open and see if running):
- - Jupyter (+ Extensions), JupyterLab (+ Extensions), VNC, VS-Code (+ Extensions), Ungit, Netdata, Glances, Filebrowser, Access Port, SSH Access, Git Integration, Tensorboard
+ - Jupyter (+ Extensions), JupyterLab (+ Extensions), VNC, VS-Code (+ Extensions), Ungit, Netdata, Glances, Filebrowser, Access Port, SSH Access, Git Integration, Tensorboard
- Run from inside workspace: `/bin/bash /resources/tests/log-environment-info.sh`
- Run from inside workspace: `tutorials/workspace-test-utilities.ipynb`
- - Check all gui-tools in VNC Desktop (just open and see of running)
- - Run from inside workspace: `python /resources/tests/test-installers.py`
+ - Check all gui-tools in VNC Desktop (just open and see if running): VS Code, glogg, Chrome, Firefox, DB Browser, Task Manager
- Run from inside workspace: `/bin/bash /resources/tests/scan-python-vulnerabilities.sh`
- - Run from inside workspace: `/bin/bash /resources/tests/scan-clamav-virus.sh`
- - Run from inside workspace: `/bin/bash /resources/tests/scan-system-vulnerabilities.sh`
+ - Run from inside workspace (vulnerability scan via [trivy](https://github.com/aquasecurity/trivy)): `/bin/bash /resources/tests/scan-trivy-vulnerabilities.sh`
+ - Run from inside workspace (virus scan via [clamav](https://www.clamav.net/)): `/bin/bash /resources/tests/scan-clamav-virus.sh`
- Run from inside workspace: `python /resources/tests/test-code-execution.py`
- - Update reports and licenses in git repo
+ - Update reports and licenses in Git repo
- Check if tutorials are still working in `/workspace/tutorials`
+ - Scan workspace image with [docker scan](https://docs.docker.com/engine/scan/): `docker scan --accept license --dependency-tree --file Dockerfile ml-workspace`. Fix or prevent high- or critical-severity vulnerabilities. Update report in `resources/reports/docker-snyk-scan.txt`.
+
+11. Update, build and test `gpu` flavor:
+
+ - Update CUDA Tooling based on [cuda container images](https://gitlab.com/nvidia/container-images/cuda/)
+  - Decide on the CUDA version update based on TensorFlow & PyTorch support
+ - Update GPU libraries and tooling inside Dockerfile
+ - Build via `python build.py --flavor=gpu`
+ - Test `nvidia-smi` in terminal to check for GPU access
+  - Test the image on a GPU machine and run `/workspace/tutorials/workspace-test-utilities.ipynb`
+ - Test GPU interface in Netdata and Glances
-11. Build and test `gpu` flavor via `python build.py --flavor=gpu`
-12. Build and test `R` flavor via `python build.py --flavor=R`
-13. Build and test `spark` flavor via `python build.py --flavor=spark`
-14. Build and push all flavors via `python build.py --deploy --version= --flavor=all`
\ No newline at end of file
+12. Build and push all flavors via `python build.py --deploy --version= --flavor=all`
diff --git a/gpu-flavor/Dockerfile b/gpu-flavor/Dockerfile
index 8e0f085c..5da43b62 100644
--- a/gpu-flavor/Dockerfile
+++ b/gpu-flavor/Dockerfile
@@ -1,186 +1,183 @@
-ARG ARG_WORKSPACE_VERSION="latest"
+ARG ARG_WORKSPACE_BASE_IMAGE="mltooling/ml-workspace:latest"
# Build from full flavor of workspace with same version
-FROM mltooling/ml-workspace:$ARG_WORKSPACE_VERSION
+FROM $ARG_WORKSPACE_BASE_IMAGE
ARG ARG_WORKSPACE_FLAVOR="gpu"
ENV WORKSPACE_FLAVOR=$ARG_WORKSPACE_FLAVOR
-# argument needs to be initalized again
-ARG ARG_WORKSPACE_VERSION="latest"
-ENV WORKSPACE_VERSION=$ARG_WORKSPACE_VERSION
USER root
### NVIDIA CUDA BASE ###
-# https://gitlab.com/nvidia/container-images/cuda/blob/ubuntu18.04/10.1/base/Dockerfile
-RUN apt-get update && apt-get install -y --no-install-recommends gnupg2 curl ca-certificates && \
- curl -fsSL https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub | apt-key add - && \
- echo "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/cuda.list && \
- echo "deb https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/nvidia-ml.list && \
+# https://gitlab.com/nvidia/container-images/cuda/-/blob/master/dist/11.2.2/ubuntu20.04-x86_64/base/Dockerfile
+RUN apt-get update && apt-get install -y --no-install-recommends \
+ gnupg2 curl ca-certificates && \
+ curl -fsSL https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub | apt-key add - && \
+ echo "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /" > /etc/apt/sources.list.d/cuda.list && \
+ echo "deb https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu2004/x86_64 /" > /etc/apt/sources.list.d/nvidia-ml.list && \
# Cleanup - cannot use cleanup script here, otherwise too much is removed
- apt-get clean && \
+ apt-get clean && \
rm -rf $HOME/.cache/* && \
rm -rf /tmp/* && \
rm -rf /var/lib/apt/lists/*
-ENV CUDA_VERSION 10.1.168
-ENV CUDA_PKG_VERSION 10-1=$CUDA_VERSION-1
+ENV CUDA_VERSION 11.2.2
+#ENV CUDA_PKG_VERSION 11-2=$CUDA_VERSION-1
+#ENV CUDART_VERSION 11-2=$CUDA_VERSION46-1
# For libraries in the cuda-compat-* package: https://docs.nvidia.com/cuda/eula/index.html#attachment-a
RUN apt-get update && apt-get install -y --no-install-recommends \
- cuda-cudart-$CUDA_PKG_VERSION \
- cuda-compat-10-1 && \
- ln -s cuda-10.1 /usr/local/cuda && \
+ cuda-cudart-11-2=11.2.152-1 \
+ cuda-compat-11-2 \
+ && ln -s cuda-11.2 /usr/local/cuda && \
rm -rf /var/lib/apt/lists/* && \
# Cleanup - cannot use cleanup script here, otherwise too much is removed
- apt-get clean && \
+ apt-get clean && \
rm -rf $HOME/.cache/* && \
rm -rf /tmp/* && \
rm -rf /var/lib/apt/lists/*
# Required for nvidia-docker v1
-RUN echo "/usr/local/nvidia/lib" >> /etc/ld.so.conf.d/nvidia.conf && \
- echo "/usr/local/nvidia/lib64" >> /etc/ld.so.conf.d/nvidia.conf
+RUN echo "/usr/local/nvidia/lib" >> /etc/ld.so.conf.d/nvidia.conf \
+ && echo "/usr/local/nvidia/lib64" >> /etc/ld.so.conf.d/nvidia.conf
ENV PATH /usr/local/nvidia/bin:/usr/local/cuda/bin:${PATH}
-ENV LD_LIBRARY_PATH /usr/local/nvidia/lib:/usr/local/nvidia/lib64:${LD_LIBRARY_PATH}
+ENV LD_LIBRARY_PATH /usr/local/nvidia/lib:/usr/local/nvidia/lib64
# nvidia-container-runtime
# https://github.com/NVIDIA/nvidia-container-runtime#environment-variables-oci-spec
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility
-ENV NVIDIA_REQUIRE_CUDA "cuda>=10.1 brand=tesla,driver>=384,driver<385 brand=tesla,driver>=396,driver<397 brand=tesla,driver>=410,driver<411"
+ENV NVIDIA_REQUIRE_CUDA "cuda>=11.2 brand=tesla,driver>=418,driver<419 brand=tesla,driver>=440,driver<441 driver>=450"
### CUDA RUNTIME ###
-# https://gitlab.com/nvidia/container-images/cuda/blob/ubuntu18.04/10.1/runtime/Dockerfile
+# https://gitlab.com/nvidia/container-images/cuda/-/blob/master/dist/11.2.2/ubuntu20.04-x86_64/runtime/Dockerfile
-ENV NCCL_VERSION 2.4.2
+ENV NCCL_VERSION 2.8.4
RUN apt-get update && apt-get install -y --no-install-recommends \
- cuda-libraries-$CUDA_PKG_VERSION \
- cuda-nvtx-$CUDA_PKG_VERSION \
- libnccl2=$NCCL_VERSION-1+cuda10.1 && \
- apt-mark hold libnccl2 && \
- rm -rf /var/lib/apt/lists/* && \
+ cuda-libraries-11-2=11.2.2-1 \
+ libnpp-11-2=11.3.2.152-1 \
+ cuda-nvtx-11-2=11.2.152-1 \
+ libcublas-11-2=11.4.1.1043-1 \
+ libcusparse-11-2=11.4.1.1152-1 \
+ libnccl2=$NCCL_VERSION-1+cuda11.2 \
+ && rm -rf /var/lib/apt/lists/* \
# Cleanup - cannot use cleanup script here, otherwise too much is removed
- apt-get clean && \
- rm -rf $HOME/.cache/* && \
- rm -rf /tmp/* && \
- rm -rf /var/lib/apt/lists/*
+ && apt-get clean \
+ && rm -rf $HOME/.cache/* \
+ && rm -rf /tmp/* \
+ && rm -rf /var/lib/apt/lists/*
+
+RUN apt-mark hold libcublas-11-2 libnccl2
### END CUDA RUNTIME ###
### CUDA DEVEL ###
-# https://gitlab.com/nvidia/container-images/cuda/blob/ubuntu18.04/10.1/devel/Dockerfile
+# https://gitlab.com/nvidia/container-images/cuda/-/blob/master/dist/11.2.2/ubuntu20.04-x86_64/devel/Dockerfile
RUN apt-get update && apt-get install -y --no-install-recommends \
- cuda-libraries-dev-$CUDA_PKG_VERSION \
- cuda-nvml-dev-$CUDA_PKG_VERSION \
- cuda-minimal-build-$CUDA_PKG_VERSION \
- cuda-command-line-tools-$CUDA_PKG_VERSION \
- libnccl-dev=$NCCL_VERSION-1+cuda10.1 && \
- rm -rf /var/lib/apt/lists/* && \
+ libtinfo5 libncursesw5 \
+ cuda-cudart-dev-11-2=11.2.152-1 \
+ cuda-command-line-tools-11-2=11.2.2-1 \
+ cuda-minimal-build-11-2=11.2.2-1 \
+ cuda-libraries-dev-11-2=11.2.2-1 \
+ cuda-nvml-dev-11-2=11.2.152-1 \
+ libnpp-dev-11-2=11.3.2.152-1 \
+ libnccl-dev=2.8.4-1+cuda11.2 \
+ libcublas-dev-11-2=11.4.1.1043-1 \
+ libcusparse-dev-11-2=11.4.1.1152-1 && \
# Cleanup - cannot use cleanup script here, otherwise too much is removed
- apt-get clean && \
- rm -rf /root/.cache/* && \
+ apt-get clean && \
+ rm -rf $HOME/.cache/* && \
rm -rf /tmp/* && \
rm -rf /var/lib/apt/lists/*
+# Keep apt from auto-upgrading the cublas package. See https://gitlab.com/nvidia/container-images/cuda/-/issues/88
+RUN apt-mark hold libcublas-dev-11-2 libnccl-dev
ENV LIBRARY_PATH /usr/local/cuda/lib64/stubs
### END CUDA DEVEL ###
-### CUDANN7 DEVEL ###
-# https://gitlab.com/nvidia/container-images/cuda/blob/ubuntu18.04/10.1/devel/cudnn7/Dockerfile
+### CUDNN8 DEVEL ###
+# https://gitlab.com/nvidia/container-images/cuda/-/blob/master/dist/11.2.2/ubuntu20.04-x86_64/devel/cudnn8/Dockerfile
-ENV CUDNN_VERSION 7.6.0.64
+ENV CUDNN_VERSION 8.1.1.33
LABEL com.nvidia.cudnn.version="${CUDNN_VERSION}"
RUN apt-get update && apt-get install -y --no-install-recommends \
- libcudnn7=$CUDNN_VERSION-1+cuda10.1 \
- libcudnn7-dev=$CUDNN_VERSION-1+cuda10.1 && \
- apt-mark hold libcudnn7 && \
- rm -rf /var/lib/apt/lists/* && \
+ libcudnn8=$CUDNN_VERSION-1+cuda11.2 \
+ libcudnn8-dev=$CUDNN_VERSION-1+cuda11.2 \
+ && apt-mark hold libcudnn8 && \
# Cleanup
- apt-get clean && \
+ apt-get clean && \
rm -rf /root/.cache/* && \
rm -rf /tmp/* && \
rm -rf /var/lib/apt/lists/*
-### END CUDANN7 ###
+### END CUDNN8 ###
# Link Cupti:
ENV LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/usr/local/cuda/extras/CUPTI/lib64
-# Install TensorRT. Requires that libcudnn7 is installed above.
-# https://www.tensorflow.org/install/gpu#ubuntu_1804_cuda_101
-RUN apt-get update && apt-get install -y --no-install-recommends libnvinfer6=6.0.1-1+cuda10.1 \
- libnvinfer-dev=6.0.1-1+cuda10.1 \
- libnvinfer-plugin6=6.0.1-1+cuda10.1 && \
- # Cleanup
- clean-layer.sh
-
### GPU DATA SCIENCE LIBRARIES ###
RUN \
apt-get update && \
apt-get install -y libomp-dev libopenblas-base && \
- # Not needed? Install cuda-toolkit (e.g. for pytorch: https://pytorch.org/): https://anaconda.org/anaconda/cudatoolkit
- conda install -y cudatoolkit=10.1 -c pytorch && \
+ # Install pytorch gpu
+ # uninstall cpu only packages via conda
+ conda remove --force -y pytorch cpuonly && \
+ # https://pytorch.org/get-started/locally/
+ conda install cudatoolkit=11.2 -c pytorch -c nvidia && \
+ pip install --no-cache-dir torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html && \
# Install cupy: https://cupy.chainer.org/
- pip install --no-cache-dir cupy-cuda101 && \
+ pip install --no-cache-dir cupy-cuda112 && \
# Install pycuda: https://pypi.org/project/pycuda
pip install --no-cache-dir pycuda && \
# Install gpu utils libs
pip install --no-cache-dir gpustat py3nvml gputil && \
# Install scikit-cuda: https://scikit-cuda.readthedocs.io/en/latest/install.html
pip install --no-cache-dir scikit-cuda && \
- # Install tensorflow gpu - conda uninstall removes too much and conda remove corrupts environment
- # only tensorflow 2.1 supports cuda 10.1
- pip uninstall -y tensorflow && \
- pip install --no-cache-dir tensorflow-gpu==2.1.0 && \
+ # Install tensorflow gpu
+ pip uninstall -y tensorflow tensorflow-cpu intel-tensorflow && \
+ pip install --no-cache-dir tensorflow-gpu==2.5.0 && \
# Install ONNX GPU Runtime
pip uninstall -y onnxruntime && \
- pip install --no-cache-dir onnxruntime-gpu==1.1.1 && \
- # Install pytorch gpu
- # uninstall cpu only packages via conda
- conda remove --force -y pytorch torchvision cpuonly && \
- # https://pytorch.org/get-started/locally/
- conda install -y pytorch torchvision -c pytorch && \
+ pip install --no-cache-dir onnxruntime-gpu==1.8.0 onnxruntime-training==1.8.0 && \
+    # Install faiss gpu - TODO: too large?
+ # conda remove --force -y faiss-cpu && \
+ # conda install -y faiss-gpu -c pytorch && \
# Update mxnet to gpu edition
pip uninstall -y mxnet-mkl && \
- pip install --no-cache-dir mxnet-cu101mkl==1.5.1.post0 && \
+    # cu112 -> requires CUDA >= 11.2
+ pip install --no-cache-dir mxnet-cu112 && \
# install jax: https://github.com/google/jax#pip-installation
- pip install --no-cache-dir --upgrade jax https://storage.googleapis.com/jax-releases/cuda101/jaxlib-0.1.37-cp37-none-linux_x86_64.whl && \
+ pip install --upgrade jax[cuda111] -f https://storage.googleapis.com/jax-releases/jax_releases.html && \
# Install pygpu - Required for theano: http://deeplearning.net/software/libgpuarray/
conda install -y pygpu && \
+ # Install lightgbm
+ pip uninstall -y lightgbm && \
+ pip install lightgbm --install-option=--gpu --install-option="--opencl-include-dir=/usr/local/cuda/include/" --install-option="--opencl-library=/usr/local/cuda/lib64/libOpenCL.so" && \
# nvidia python ml lib
- pip install --upgrade --force-reinstall nvidia-ml-py3 && \
+ pip install --upgrade --force-reinstall nvidia-ml-py3 && \
# SpeedTorch: https://github.com/Santosh-Gupta/SpeedTorch
- pip install --no-cache-dir SpeedTorch && \
- # TODO: Install blazingsql
- # Install Jupyterlab GPU Plugin: https://github.com/jacobtomlinson/jupyterlab-nvdashboard - TODO deactivate jupyter plugin
- # pip install --no-cache-dir jupyterlab-nvdashboard && \
- # jupyter labextension install jupyterlab-nvdashboard && \
+ pip install --no-cache-dir SpeedTorch && \
+ # Ipyexperiments - fix memory leaks
+ pip install --no-cache-dir ipyexperiments && \
# Cleanup
- # Cleanup python bytecode files - not needed: https://jcrist.github.io/conda-docker-tips.html
- find ${CONDA_DIR} -type f -name '*.pyc' -delete && \
- find ${CONDA_DIR} -type l -name '*.pyc' -delete && \
clean-layer.sh
-# Install Rapids: https://rapids.ai/start.html#conda-install
-# conda install -c rapidsai -c nvidia -c conda-forge -c defaults \
-# rapids=0.11 \
-# python=3.7 \
-# cudatoolkit=10.1 && \
-# Install graphvite graph embedding lib: https://github.com/DeepGraphLearning/graphvite
-# conda install -c milagraph graphvite cudatoolkit=10.1 && \
-
-# Install Nvidia Apex
-# RUN cd $RESOURCES_PATH && \
-# git clone https://github.com/NVIDIA/apex && \
-# cd apex && \
-# pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./ && \
- # Cleanup
-# clean-layer.sh
+# TODO install DALI: https://docs.nvidia.com/deeplearning/dali/user-guide/docs/installation.html#dali-and-ngc
+# TODO: if > Ubuntu 19.04 -> install nvtop: https://github.com/Syllo/nvtop
+# TODO: Install ArrayFire: https://arrayfire.com/download/ pip install --no-cache-dir arrayfire && \
+# TODO Nvidia Apex: https://github.com/NVIDIA/apex
+
+# cd $RESOURCES_PATH && \
+# git clone https://github.com/NVIDIA/apex && \
+# cd apex && \
+# # Suppress output via &> /dev/null - if there is a problem, remove it to see logs
+# pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./ && \
+# rm -rf apex && \
# https://www.anaconda.com/getting-started-with-gpu-computing-in-anaconda/
@@ -194,28 +191,15 @@ ENV TF_FORCE_GPU_ALLOW_GROWTH true
### GPU TOOLS ###
-# Install Glances & Netdata GPU Support
-RUN \
- apt-get update -y && \
- apt-get install lm-sensors -y && \
- apt-get install netcat -y && \
- apt-get install iproute2 -y && \
- apt-get clean && \
- rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
- git clone https://github.com/Splo0sh/netdata_nv_plugin --depth 1 /tmp/netdata_nv_plugin && \
- cp /tmp/netdata_nv_plugin/nv.chart.py /usr/libexec/netdata/python.d/ && \
- cp /tmp/netdata_nv_plugin/python_modules/pynvml.py /usr/libexec/netdata/python.d/python_modules/ && \
- cp /tmp/netdata_nv_plugin/nv.conf /etc/netdata/python.d/ && \
- # Cleanup
- clean-layer.sh
-
### END GPU TOOLS ###
### CONFIGURATION ###
-# TODO what does this line do?
-RUN \
- echo 'Defaults env_keep += "ftp_proxy http_proxy https_proxy no_proxy"' >> /etc/sudoers
+# TODO: tests are currently empty - COPY resources/tests/ /resources/tests
+
+# argument needs to be initialized again
+ARG ARG_WORKSPACE_VERSION="latest"
+ENV WORKSPACE_VERSION=$ARG_WORKSPACE_VERSION
# Overwrite & add Labels
ARG ARG_BUILD_DATE="unknown"
@@ -224,17 +208,11 @@ ARG ARG_VCS_REF="unknown"
LABEL \
"workspace.version"=$WORKSPACE_VERSION \
"workspace.flavor"=$WORKSPACE_FLAVOR \
+ "workspace.baseimage"=$ARG_WORKSPACE_BASE_IMAGE \
"org.opencontainers.image.version"=$WORKSPACE_VERSION \
"org.opencontainers.image.revision"=$ARG_VCS_REF \
- "org.opencontainers.image.created"=$ARG_BUILD_DATE \
+ "org.opencontainers.image.created"=$ARG_BUILD_DATE \
"org.label-schema.version"=$WORKSPACE_VERSION \
"org.label-schema.vcs-ref"=$ARG_VCS_REF \
"org.label-schema.build-date"=$ARG_BUILD_DATE
-# TODO use temp as data environment to use temp folder?
-# DATA_ENVIRONMENT="temp"
-
-# USER $NB_USER
-
-#RUN \
-# echo "export PATH=$PATH" >> $HOME/.bashrc
diff --git a/gpu-flavor/README.md b/gpu-flavor/README.md
index ceb20cc5..1874649e 100644
--- a/gpu-flavor/README.md
+++ b/gpu-flavor/README.md
@@ -1,6 +1,6 @@
- [R-flavor] All-in-one web-based development environment for machine learning
-
-
-
-
-
-
-
-
-
-
-
-Please visit our [Github repository](https://github.com/ml-tooling/ml-workspace#r-flavor) for documentation and deployment information.
-
----
-
-Licensed **Apache 2.0**. Created and maintained with ❤️ by developers from SAP in Berlin.
\ No newline at end of file
diff --git a/r-flavor/build.py b/r-flavor/build.py
deleted file mode 100644
index 94324941..00000000
--- a/r-flavor/build.py
+++ /dev/null
@@ -1,110 +0,0 @@
-import os, sys
-import subprocess
-import argparse
-import datetime
-
-parser = argparse.ArgumentParser()
-parser.add_argument('--name', help='base name of docker container', default="ml-workspace")
-parser.add_argument('--version', help='version tag of docker container', default="latest")
-parser.add_argument('--deploy', help='deploy docker container to remote', action='store_true')
-parser.add_argument('--flavor', help='flavor (r) used for docker container', default='r')
-
-REMOTE_IMAGE_PREFIX = "mltooling/"
-
-args, unknown = parser.parse_known_args()
-if unknown:
- print("Unknown arguments "+str(unknown))
-
-# Wrapper to print out command
-def call(command):
- print("Executing: "+command)
- return subprocess.call(command, shell=True)
-
-# calls build scripts in every module with same flags
-def build(module="."):
-
- if not os.path.isdir(module):
- print("Could not find directory for " + module)
- sys.exit(1)
-
- build_command = "python build.py"
-
- if args.version:
- build_command += " --version=" + str(args.version)
-
- if args.deploy:
- build_command += " --deploy"
-
- if args.flavor:
- build_command += " --flavor=" + str(args.flavor)
-
- working_dir = os.path.dirname(os.path.realpath(__file__))
- full_command = "cd " + module + " && " + build_command + " && cd " + working_dir
- print("Building " + module + " with: " + full_command)
- failed = call(full_command)
- if failed:
- print("Failed to build module " + module)
- sys.exit(1)
-
-if not args.flavor:
- args.flavor = "r"
-
-args.flavor = str(args.flavor).lower()
-
-if args.flavor == "all":
- args.flavor = "r"
- build()
- sys.exit(0)
-
-# unknown flavor -> try to build from subdirectory
-if args.flavor not in ["r"]:
- # assume that flavor has its own directory with build.py
- build(args.flavor + "-flavor")
- sys.exit(0)
-
-service_name = os.path.basename(os.path.dirname(os.path.realpath(__file__)))
-if args.name:
- service_name = args.name
-
-# add flavor to service name
-service_name += "-" + args.flavor
-
-# docker build
-git_rev = "unknown"
-try:
- git_rev = subprocess.check_output(["git", "rev-parse", "--short", "HEAD"]).decode('ascii').strip()
-except:
- pass
-
-build_date = datetime.datetime.utcnow().isoformat("T") + "Z"
-try:
- build_date = subprocess.check_output(['date', '-u', '+%Y-%m-%dT%H:%M:%SZ']).decode('ascii').strip()
-except:
- pass
-
-vcs_ref_build_arg = " --build-arg ARG_VCS_REF=" + str(git_rev)
-build_date_build_arg = " --build-arg ARG_BUILD_DATE=" + str(build_date)
-flavor_build_arg = " --build-arg ARG_WORKSPACE_FLAVOR=" + str(args.flavor)
-version_build_arg = " --build-arg ARG_WORKSPACE_VERSION=" + str(args.version)
-
-versioned_image = service_name+":"+str(args.version)
-latest_image = service_name+":latest"
-failed = call("docker build -t "+ versioned_image + " -t " + latest_image + " "
- + version_build_arg + " " + flavor_build_arg+ " " + vcs_ref_build_arg + " " + build_date_build_arg + " ./")
-
-if failed:
- print("Failed to build container")
- sys.exit(1)
-
-remote_versioned_image = REMOTE_IMAGE_PREFIX + versioned_image
-call("docker tag " + versioned_image + " " + remote_versioned_image)
-
-remote_latest_image = REMOTE_IMAGE_PREFIX + latest_image
-call("docker tag " + latest_image + " " + remote_latest_image)
-
-if args.deploy:
- call("docker push " + remote_versioned_image)
-
- if "SNAPSHOT" not in args.version:
- # do not push SNAPSHOT builds as latest version
- call("docker push " + remote_latest_image)
diff --git a/r-flavor/resources/rstudio-service.conf b/r-flavor/resources/rstudio-service.conf
deleted file mode 100644
index ca817e42..00000000
--- a/r-flavor/resources/rstudio-service.conf
+++ /dev/null
@@ -1,10 +0,0 @@
-[program:rstudio]
-command=/usr/lib/rstudio-server/bin/rserver --server-working-dir=%(ENV_WORKSPACE_HOME)s --server-daemonize=0 --auth-none=1 --auth-validate-users=0 --www-port=8071
-# user needs to be differnt from root and ld_library_path empty -> otherwise the project is stuck
-environment=USER="rstudio", LD_LIBRARY_PATH="", LD_PRELOAD=""
-redirect_stderr=true
-stdout_logfile=/var/log/supervisor/%(program_name)s.log ; log logs into file
-autostart=true ; start at supervisord start (default: true)
-autorestart=true ; whether/when to restart (default: unexpected)
-startretries=5 ; max
-
diff --git a/resources/docker-entrypoint.py b/resources/docker-entrypoint.py
index 01a650d6..c63d4d97 100644
--- a/resources/docker-entrypoint.py
+++ b/resources/docker-entrypoint.py
@@ -4,47 +4,47 @@
Main Workspace Run Script
"""
-from subprocess import call
-import os
+# Enable logging
+import logging
import math
+import os
import sys
+from subprocess import call
from urllib.parse import quote
-# Enable logging
-import logging
logging.basicConfig(
- format='%(asctime)s [%(levelname)s] %(message)s',
- level=logging.INFO,
- stream=sys.stdout)
+ format="%(asctime)s [%(levelname)s] %(message)s",
+ level=logging.INFO,
+ stream=sys.stdout,
+)
log = logging.getLogger(__name__)
log.info("Starting...")
+
def set_env_variable(env_variable: str, value: str, ignore_if_set: bool = False):
if ignore_if_set and os.getenv(env_variable, None):
# if it is already set, do not set it to the new value
return
# TODO is export needed as well?
- call('export ' + env_variable + '="' + value + '"', shell=True)
+ call("export " + env_variable + '="' + value + '"', shell=True)
os.environ[env_variable] = value
+
# Manage base path dynamically
-ENV_JUPYTERHUB_SERVICE_PREFIX = os.getenv("JUPYTERHUB_SERVICE_PREFIX")
+ENV_JUPYTERHUB_SERVICE_PREFIX = os.getenv("JUPYTERHUB_SERVICE_PREFIX", None)
ENV_NAME_WORKSPACE_BASE_URL = "WORKSPACE_BASE_URL"
-base_url = os.environ[ENV_NAME_WORKSPACE_BASE_URL]
-
-if not base_url:
- base_url = ""
+base_url = os.getenv(ENV_NAME_WORKSPACE_BASE_URL, "")
if ENV_JUPYTERHUB_SERVICE_PREFIX:
# Installation with Jupyterhub
-
+
# Base Url is not needed, Service prefix contains full path
# ENV_JUPYTERHUB_BASE_URL = os.getenv("JUPYTERHUB_BASE_URL")
- # ENV_JUPYTERHUB_BASE_URL.rstrip('/') +
+ # ENV_JUPYTERHUB_BASE_URL.rstrip('/') +
base_url = ENV_JUPYTERHUB_SERVICE_PREFIX
# Add leading slash
@@ -52,7 +52,7 @@ def set_env_variable(env_variable: str, value: str, ignore_if_set: bool = False)
base_url = "/" + base_url
# Remove trailing slash
-base_url = base_url.rstrip('/').strip()
+base_url = base_url.rstrip("/").strip()
# always quote base url
base_url = quote(base_url, safe="/%")
@@ -66,44 +66,87 @@ def set_env_variable(env_variable: str, value: str, ignore_if_set: bool = False)
ENV_MAX_NUM_THREADS = str(math.ceil(os.cpu_count()))
try:
# read out docker information - if docker limits cpu quota
- cpu_count = math.ceil(int(os.popen('cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us').read().replace('\n', '')) / 100000)
+ cpu_count = math.ceil(
+ int(
+ os.popen("cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us")
+ .read()
+ .replace("\n", "")
+ )
+ / 100000
+ )
if cpu_count > 0 and cpu_count < os.cpu_count():
ENV_MAX_NUM_THREADS = str(cpu_count)
except:
pass
- if not ENV_MAX_NUM_THREADS or not ENV_MAX_NUM_THREADS.isnumeric() or ENV_MAX_NUM_THREADS == "0":
+ if (
+ not ENV_MAX_NUM_THREADS
+ or not ENV_MAX_NUM_THREADS.isnumeric()
+ or ENV_MAX_NUM_THREADS == "0"
+ ):
ENV_MAX_NUM_THREADS = "4"
-
+
if int(ENV_MAX_NUM_THREADS) > 8:
     # there should be at least one thread less compared to cores
- ENV_MAX_NUM_THREADS = str(int(ENV_MAX_NUM_THREADS)-1)
-
+ ENV_MAX_NUM_THREADS = str(int(ENV_MAX_NUM_THREADS) - 1)
+
# set a maximum of 32, in most cases too many threads are adding too much overhead
if int(ENV_MAX_NUM_THREADS) > 32:
ENV_MAX_NUM_THREADS = "32"
-
+
# only set if it is not None or empty
- # OMP_NUM_THREADS: Suggested value: vCPUs / 2 in which vCPUs is the number of virtual CPUs.
- set_env_variable("OMP_NUM_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True) # OpenMP
- set_env_variable("OPENBLAS_NUM_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True) # OpenBLAS
- set_env_variable("MKL_NUM_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True) # MKL
- set_env_variable("VECLIB_MAXIMUM_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True) # Accelerate
- set_env_variable("NUMEXPR_NUM_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True) # Numexpr
- set_env_variable("NUMEXPR_MAX_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True) # Numexpr - maximum
- set_env_variable("NUMBA_NUM_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True) # Numba
- set_env_variable("SPARK_WORKER_CORES", ENV_MAX_NUM_THREADS, ignore_if_set=True) # Spark Worker
- # TBB_NUM_THREADS
+ # OMP_NUM_THREADS: Suggested value: vCPUs / 2 in which vCPUs is the number of virtual CPUs.
+ set_env_variable(
+ "OMP_NUM_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True
+ ) # OpenMP
+ set_env_variable(
+ "OPENBLAS_NUM_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True
+ ) # OpenBLAS
+ set_env_variable("MKL_NUM_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True) # MKL
+ set_env_variable(
+ "VECLIB_MAXIMUM_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True
+ ) # Accelerate
+ set_env_variable(
+ "NUMEXPR_NUM_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True
+ ) # Numexpr
+ set_env_variable(
+ "NUMEXPR_MAX_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True
+ ) # Numexpr - maximum
+ set_env_variable(
+ "NUMBA_NUM_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True
+ ) # Numba
+ set_env_variable(
+ "SPARK_WORKER_CORES", ENV_MAX_NUM_THREADS, ignore_if_set=True
+ ) # Spark Worker
+ set_env_variable(
+ "BLIS_NUM_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True
+ ) # Blis
+ set_env_variable("TBB_NUM_THREADS", ENV_MAX_NUM_THREADS, ignore_if_set=True) # TBB
# GOTO_NUM_THREADS
ENV_RESOURCES_PATH = os.getenv("RESOURCES_PATH", "/resources")
-ENV_WORKSPACE_HOME = os.getenv('WORKSPACE_HOME', "/workspace")
+ENV_WORKSPACE_HOME = os.getenv("WORKSPACE_HOME", "/workspace")
# pass all script arguments to next script
-script_arguments = " " + ' '.join(sys.argv[1:])
+script_arguments = " " + " ".join(sys.argv[1:])
-EXECUTE_CODE = os.getenv('EXECUTE_CODE', None)
+EXECUTE_CODE = os.getenv("EXECUTE_CODE", None)
if EXECUTE_CODE:
# use workspace as working directory
- sys.exit(call("cd " + ENV_WORKSPACE_HOME + " && python " + ENV_RESOURCES_PATH + "/scripts/execute_code.py" + script_arguments, shell=True))
-
-sys.exit(call("python " + ENV_RESOURCES_PATH + "/scripts/run_workspace.py" + script_arguments, shell=True))
\ No newline at end of file
+ sys.exit(
+ call(
+ "cd "
+ + ENV_WORKSPACE_HOME
+ + " && python "
+ + ENV_RESOURCES_PATH
+ + "/scripts/execute_code.py"
+ + script_arguments,
+ shell=True,
+ )
+ )
+
+sys.exit(
+ call(
+ "python " + ENV_RESOURCES_PATH + "/scripts/run_workspace.py" + script_arguments,
+ shell=True,
+ )
+)
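The thread-count heuristic in `docker-entrypoint.py` above can be sketched as a standalone function. This is a hypothetical simplification for illustration; the function name and parameters are not part of the actual script, which reads the quota from `/sys/fs/cgroup/cpu/cpu.cfs_quota_us` directly:

```python
import math


def max_num_threads(host_cpus: int, cfs_quota_us: int = -1) -> int:
    """Sketch of the thread-count heuristic used by docker-entrypoint.py."""
    threads = host_cpus
    # If Docker limits the CPU quota, derive the effective core count
    # from cfs_quota_us (the quota is expressed in units of 100000 us).
    if cfs_quota_us > 0:
        quota_cpus = math.ceil(cfs_quota_us / 100000)
        if 0 < quota_cpus < host_cpus:
            threads = quota_cpus
    if threads <= 0:
        threads = 4  # fallback default
    if threads > 8:
        threads -= 1  # leave one thread of headroom on larger machines
    return min(threads, 32)  # beyond that, extra threads mostly add overhead
```

For example, an unrestricted 16-core host yields 15 threads, while a container capped at two cores (`cfs_quota_us=200000`) yields 2.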
diff --git a/resources/home/.config/Code/User/settings.json b/resources/home/.config/Code/User/settings.json
index 830253bd..e5457a1f 100644
--- a/resources/home/.config/Code/User/settings.json
+++ b/resources/home/.config/Code/User/settings.json
@@ -1,28 +1,34 @@
{
- "update.mode": "none",
- "update.showReleaseNotes": false,
- "extensions.autoUpdate": false,
- "extensions.autoCheckUpdates": false,
- "window.menuBarVisibility": "visible",
- "python.autoComplete.addBrackets": true,
- "python.formatting.provider": "black",
- "python.analysis.memory.keepLibraryAst": true,
- "python.analysis.memory.keepLibraryLocalVariables": true,
- "python.analysis.cachingLevel": "Library",
- "python.autoUpdateLanguageServer": false,
- "python.dataScience.sendSelectionToInteractiveWindow": true,
- "python.linting.pycodestyleArgs": [
- "--ignore=E203,E501,W503"
- ],
- "terminal.integrated.inheritEnv": false,
- "python.linting.pylintEnabled": false,
- "python.linting.flake8Enabled": true,
- "python.linting.flake8CategorySeverity.E": "Warning",
- "python.linting.flake8CategorySeverity.W": "Information",
- "python.linting.flake8CategorySeverity.F": "Warning",
- "python.linting.flake8Args": [
- "--ignore=E203,E501,W503"
- ],
- "gitlens.showWhatsNewAfterUpgrades": false,
- "gitlens.advanced.telemetry.enabled": false
-}
\ No newline at end of file
+ "update.mode": "none",
+ "update.showReleaseNotes": false,
+ "terminal.integrated.inheritEnv": false,
+ "extensions.autoUpdate": false,
+ "extensions.autoCheckUpdates": false,
+ "window.menuBarVisibility": "visible",
+ "python.autoComplete.addBrackets": true,
+ "python.formatting.provider": "black",
+ "python.analysis.memory.keepLibraryAst": true,
+ "python.autoUpdateLanguageServer": false,
+ "python.linting.enabled": true,
+ "python.linting.lintOnSave": true,
+ "python.linting.pylintEnabled": false,
+ "python.linting.flake8Enabled": true,
+ "python.linting.mypyEnabled": true,
+ "python.linting.flake8CategorySeverity.E": "Warning",
+ "python.linting.flake8CategorySeverity.W": "Information",
+ "python.linting.flake8CategorySeverity.F": "Warning",
+ "python.linting.flake8Args": [
+ "--ignore=E203,E501,W503"
+ ],
+ "python.sortImports.args": [
+ // Black-compatible isort arguments
+ "--multi-line=3",
+ "--trailing-comma",
+ "--force-grid-wrap=0",
+ "--use-parentheses",
+ "--line-width=88"
+ ],
+ "python.testing.pytestEnabled": true,
+ "workbench.colorTheme": "Default Dark+",
+ "python.pythonPath": "/opt/conda/bin/python"
+}
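Note that the `//` comment added above makes this file JSON-with-comments, which VS Code accepts but strict parsers such as `json.loads` reject. A naive sketch of stripping line comments before parsing (this would also mangle `//` inside string values, which the settings file above happens to avoid; the comment text here is hypothetical):

```python
import json

jsonc = """{
  "python.testing.pytestEnabled": true, // hypothetical comment
  "python.pythonPath": "/opt/conda/bin/python"
}"""

# Drop everything from '//' to end of line, then parse as plain JSON.
stripped = "\n".join(line.split("//", 1)[0] for line in jsonc.splitlines())
settings = json.loads(stripped)
print(settings["python.pythonPath"])  # /opt/conda/bin/python
```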
diff --git a/resources/home/.config/flake8 b/resources/home/.config/flake8
new file mode 100644
index 00000000..692adb82
--- /dev/null
+++ b/resources/home/.config/flake8
@@ -0,0 +1,2 @@
+[flake8]
+ignore = E203,E501,W503
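The new `~/.config/flake8` file is plain INI, so its effect can be inspected programmatically. A small sketch using `configparser`, with the inline string mirroring the two lines added above:

```python
import configparser

config_text = "[flake8]\nignore = E203,E501,W503\n"

parser = configparser.ConfigParser()
parser.read_string(config_text)
ignored = [code.strip() for code in parser["flake8"]["ignore"].split(",")]
print(ignored)  # ['E203', 'E501', 'W503']
```

E203, E501, and W503 are the ignores conventionally disabled when formatting with Black, matching the flake8 arguments in the settings.json change above.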
diff --git a/resources/home/.config/mimeapps.list b/resources/home/.config/mimeapps.list
index 2108328a..21033580 100644
--- a/resources/home/.config/mimeapps.list
+++ b/resources/home/.config/mimeapps.list
@@ -1,3 +1,7 @@
[Default Applications]
-application/x-shellscript=exo-terminal-emulator.desktop
-text/plain=code.desktop;
\ No newline at end of file
+application/x-shellscript=xfce4-terminal-emulator.desktop
+text/plain=code.desktop;
+
+[Added Associations]
+application/x-shellscript=xfce4-terminal-emulator.desktop;
+text/plain=code.desktop;
diff --git a/resources/home/.config/xfce4/xfconf/xfce-perchannel-xml/xfce4-desktop.xml b/resources/home/.config/xfce4/xfconf/xfce-perchannel-xml/xfce4-desktop.xml
index b5e47e0a..37b37597 100755
--- a/resources/home/.config/xfce4/xfconf/xfce-perchannel-xml/xfce4-desktop.xml
+++ b/resources/home/.config/xfce4/xfconf/xfce-perchannel-xml/xfce4-desktop.xml
@@ -34,10 +34,42 @@
+ <!-- added desktop property elements elided here (markup lost in extraction) -->
diff --git a/resources/home/.config/xfce4/xfconf/xfce-perchannel-xml/xfce4-panel.xml b/resources/home/.config/xfce4/xfconf/xfce-perchannel-xml/xfce4-panel.xml
index 3b832e68..7f61cd15 100755
--- a/resources/home/.config/xfce4/xfconf/xfce-perchannel-xml/xfce4-panel.xml
+++ b/resources/home/.config/xfce4/xfconf/xfce-perchannel-xml/xfce4-panel.xml
@@ -15,22 +15,32 @@
- <!-- replaced panel property elements elided (markup lost in extraction) -->
+ <!-- new panel property elements elided (markup lost in extraction) -->
@@ -50,4 +60,4 @@
- <!-- closing line elided (markup lost in extraction) -->
\ No newline at end of file
+ <!-- closing line elided; trailing newline added -->
diff --git a/resources/home/.config/xfce4/xfconf/xfce-perchannel-xml/xsettings.xml b/resources/home/.config/xfce4/xfconf/xfce-perchannel-xml/xsettings.xml
index b243d22f..6126a444 100755
--- a/resources/home/.config/xfce4/xfconf/xfce-perchannel-xml/xsettings.xml
+++ b/resources/home/.config/xfce4/xfconf/xfce-perchannel-xml/xsettings.xml
@@ -3,7 +3,7 @@
- <!-- changed xsettings property elided (markup lost in extraction) -->
+ <!-- changed xsettings property elided (markup lost in extraction) -->
diff --git a/resources/home/.workspace/tools/05-terminal.json b/resources/home/.workspace/tools/05-terminal.json
index b3e43ec1..5bb81ea0 100644
--- a/resources/home/.workspace/tools/05-terminal.json
+++ b/resources/home/.workspace/tools/05-terminal.json
@@ -1,6 +1,6 @@
{
"id": "terminal-link",
"name": "Terminal",
- "url_path": "/terminals/main",
+ "url_path": "/terminals/new/main",
"description": "Open a command-line interface"
}
\ No newline at end of file
diff --git a/resources/jupyter/extensions/tooling-extension/jupyter_tooling/open-tools-widget.js b/resources/jupyter/extensions/tooling-extension/jupyter_tooling/open-tools-widget.js
index e57d742a..ee8f48dd 100644
--- a/resources/jupyter/extensions/tooling-extension/jupyter_tooling/open-tools-widget.js
+++ b/resources/jupyter/extensions/tooling-extension/jupyter_tooling/open-tools-widget.js
@@ -1,194 +1,271 @@
-define(['base/js/namespace', 'jquery', 'base/js/dialog', 'base/js/utils', 'require', './tooling-shared-components'], function (Jupyter, $, dialog, utils, require, sharedComponents) {
+define([
+ "base/js/namespace",
+ "jquery",
+ "base/js/dialog",
+ "base/js/utils",
+ "require",
+ "./tooling-shared-components",
+], function (Jupyter, $, dialog, utils, require, sharedComponents) {
+ // -------- GLOBAL VARIABLES -----------------------
+ var basePathRegex = "^(.*?)/(tree|notebooks/|edit/|terminals/)";
+ var basePath =
+ window.location.pathname.match(basePathRegex) == null
+ ? ""
+ : window.location.pathname.match(basePathRegex)[1] + "/";
+ if (!basePath) {
+ basePath = "/";
+ }
- // -------- GLOBAL VARIABLES -----------------------
- var basePathRegex = "^(\/.+?)\/(tree|notebooks|edit|terminals)";
- var basePath = (window.location.pathname.match(basePathRegex) == null) ? "" : (window.location.pathname.match(basePathRegex)[1] + '/');
- if (!basePath) {
- basePath = "/"
- }
+ var dir = window.document.body.dataset.notebookPath;
+ var dirname = "/" + dir;
- var dir = window.document.body.dataset.notebookPath;
- var dirname = '/' + dir
+ // ----------- HANDLER -------------------------------
- // ----------- HANDLER -------------------------------
+ // sharedComponents is already injected through the define() dependency list above
+ var components = new sharedComponents();
- var components = require('./tooling-shared-components');
- var components = new sharedComponents();
+ function accessPortDialog() {
+ var div = $("<div/>");
+ var form = $("<form/>");
+ div.append(
+ ' '
+ );
+ div.append(
+ ' '
+ );
+ form.appendTo(div);
+ return div;
+ }
- function accessPortDialog() {
- var div = $('');
- var form = $('');
- div.append(' ')
- div.append(' ')
- form.appendTo(div);
- return div;
- }
+ function sshInstructionDialog(setup_command) {
+ // Please check our documentation for information on what you can do with the workspace
+ var div = $("<div/>");
+ div.append(
+ '
SSH provides a powerful set of features that enable you to be more productive with your development tasks as documented here. To set up a secure and passwordless SSH connection to this workspace, please execute:
During the setup process, you will be asked for input, e.g. to provide a name for the connection. You can also download, copy, and run this script on any other machine to set up an SSH connection to this workspace. This script only runs on Mac and Linux; Windows is currently not supported.
'
+ );
+ return div;
+ }
+
+ function installToolDialog(installers) {
+ // Please check our documentation for information on what you can do with the workspace
+ var div = $("<div/>");
+ div.append(
+ "
The workspace contains a collection of installer scripts for many commonly used development tools or libraries.
"
+ );
+ div.append(
+ "
1. Please select a tool for further installation instructions:
"
+ );
+ div.append(" ");
- function sshInstructionDialog(setup_command) {
- // Please check our documentation on information on what you can do with the workspace
- var div = $('');
- div.append('
SSH provides a powerful set of features that enables you to be more productive with your development tasks as documented here. To setup a secure and passwordless SSH connection to this workspace, please execute:
During the setup process, you will be asked for input, e.g to provide a name for the connection. You can also download, copy and run this script to any other machine to setup an SSH connection to this workspace. This scripts only runs on Mac and Linux, Windows is currently not supported.
');
- return div
+ installer_options =
+ "";
+ for (var i in installers) {
+ var installer = installers[i];
+ installer_options +=
+ '";
}
- function installToolDialog(installers) {
- // Please check our documentation on information on what you can do with the workspace
- var div = $('');
- div.append('
The workspace contains a collection of installer scripts for many commonly used development tools or libraries.
');
- div.append('
1. Please select a tool for further installation instructions:
');
- div.append(' ');
-
- installer_options = ''
- for (var i in installers) {
- var installer = installers[i];
- installer_options += ''
- }
+ div.append(
+ '"
+ );
+ div.append(" ");
+ div.append(
+ "
2. Run the following command within a workspace terminal to install and start the tool:
You have exceeded the limit of available disk storage assigned to your /workspace folder (your working directory). Please delete unnecessary files and folders from the /workspace folder.
You have exceeded the limit of available disk storage assigned to your workspace container. Usually, this includes everything stored outside of the /workspace folder (working directory). Your workspace container might be automatically reset if you do not free up storage space. This container reset will remove all files outside of the /workspace folder.
';
+ }
+ div.append('
' + warning_div + "
");
+ div.append(
+ '
To find the largest files and directories, we recommend using the terminal with the following command: ncdu /. Alternatively, you can also use the Disk Usage Analyzer application, accessible from Applications -> System within the VNC Desktop.
Please try to fix this issue with ungit or the terminal.
');
- return div
- }
+ /**
+ * @param {string} errorMsg The error message to display
+ * @return {jQuery} The error dialog element
+ */
+ errorDialog(errorMsg) {
+ var div = $("<div/>");
+ // div.append('
The following error was encountered:
');
+ div.append("
" + errorMsg + "
");
+ return div;
+ }
- };
+ gitErrorDialog(errorMsg) {
+ var div = $("<div/>");
+ div.append("
" + errorMsg + "
");
+ div.append(
+ "
Please try to fix this issue with ungit or the terminal.
"
+ );
+ return div;
+ }
+ }
- module.exports = SharedComponents; // export class in order to create an object of it in another file
-});
\ No newline at end of file
+ module.exports = SharedComponents; // export class in order to create an object of it in another file
+});
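The `basePath` rewrite in the hunk above changes the regex so a workspace served at the domain root also resolves correctly (falling back to `"/"`). The same logic, transcribed to Python for illustration, with the regex taken from the `+` side of the hunk:

```python
import re

BASE_PATH_REGEX = r"^(.*?)/(tree|notebooks/|edit/|terminals/)"


def base_path(pathname: str) -> str:
    # Group 1 is everything before the Jupyter view segment; fall back to "/".
    match = re.match(BASE_PATH_REGEX, pathname)
    base = "" if match is None else match.group(1) + "/"
    return base or "/"


print(base_path("/user/ws/tree/project"))  # /user/ws/
print(base_path("/tree/project"))          # /
```

The lazy `(.*?)` keeps group 1 as short as possible, so a proxied prefix like `/user/ws` is captured while a root-mounted path yields an empty group and the `"/"` fallback applies.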
diff --git a/resources/jupyter/extensions/tooling-extension/jupyter_tooling/tooling-tree-widget.js b/resources/jupyter/extensions/tooling-extension/jupyter_tooling/tooling-tree-widget.js
index 76949a92..e4620910 100644
--- a/resources/jupyter/extensions/tooling-extension/jupyter_tooling/tooling-tree-widget.js
+++ b/resources/jupyter/extensions/tooling-extension/jupyter_tooling/tooling-tree-widget.js
@@ -1,108 +1,140 @@
-define(['base/js/namespace', 'jquery', 'base/js/dialog', 'base/js/utils', 'require', './tooling-shared-components'], function (Jupyter, $, dialog, utils, require, sharedComponents) {
-
- // -------- GLOBAL VARIABLES -----------------------
-
- var basePathRegex = "^(\/.+?)\/(tree|notebooks|edit|terminals)";
- var basePath = (window.location.pathname.match(basePathRegex) == null) ? "" : (window.location.pathname.match(basePathRegex)[1] + '/');
- if (!basePath) {
- basePath = "/"
- }
-
- // ----------- HANDLER -------------------------------
-
- var components = require('./tooling-shared-components');
- var components = new sharedComponents();
-
- //---------- REGISTER EXTENSION ------------------------
- /**
- * Adds the jupyter extension to the tree view (including the respective handler)
- */
- function load_ipython_extension() {
- // log to console
- console.info('Loaded Jupyter extension: Tooling Tree Widget')
-
- window.document.title = "Workspace Home"
-
- base_url = utils.get_body_data('base-url')
-
- btGitButton = '