This repository contains resources and documentation for DTC-VC competitors, including container images and local deployment scripts for the DTC-VC Testbed, a sample competitor container image, and the automated scoring library and service.
- A Linux host with recent kernel/distro (Ubuntu 22.04, etc.)
- NVIDIA GPU and latest Linux drivers installed (required for GPU access in containers)
- The latest version of Docker
- Multiple hosts are recommended for best performance
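The prerequisites above can be sanity-checked with a short script. This is a minimal sketch and only reports what is visible on the current host; the `check` helper is illustrative, and package names may differ on your distro.

```shell
# Sketch: report which prerequisite tools are visible on this host.
check() {
  if command -v "$1" > /dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
  fi
}
check docker      # Docker CLI
check nvidia-smi  # NVIDIA driver utility
check modprobe    # kernel module tooling (needed later for the NFS server)
```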
This process involves several tricky steps, so read them carefully. Copying and pasting all the steps verbatim will not work; some manual configuration is required.
Please reference the official Docker Swarm documentation and tutorial for more details. The following only covers configuration required for the DTC-VC testbed and the essential steps needed to initialize a swarm.
All hosts participating in the swarm with GPUs require special configuration to allow GPU access from containers launched in the swarm. Open the Docker daemon configuration on each node (usually located at `/etc/docker/daemon.json`) and add the line `"default-runtime": "nvidia"`. This ensures that all containers use the NVIDIA container runtime by default, since Docker Swarm does not support configuring the container runtime. For example (your `daemon.json` file may be different):
```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "args": [],
      "path": "nvidia-container-runtime"
    }
  }
}
```

When the configuration has been updated, run the command below to restart Docker:

```shell
sudo service docker restart
```
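If the host already has other settings in `daemon.json`, the key can be merged in programmatically rather than hand-edited. Below is a minimal sketch using `python3`; it operates on a temporary copy, and the stand-in existing config is illustrative. On a real host, point `DAEMON_JSON` at `/etc/docker/daemon.json` and run with `sudo`.

```shell
# Sketch: merge "default-runtime": "nvidia" into an existing daemon.json
# without clobbering other settings. Works on a temp copy here; on a real
# host point DAEMON_JSON at /etc/docker/daemon.json and run with sudo.
DAEMON_JSON=$(mktemp)
echo '{"log-driver": "json-file"}' > "$DAEMON_JSON"  # stand-in for an existing config
python3 - "$DAEMON_JSON" <<'EOF'
import json, sys

path = sys.argv[1]
with open(path) as f:
    cfg = json.load(f)

# Add the NVIDIA runtime and make it the default, keeping existing keys.
cfg["default-runtime"] = "nvidia"
cfg.setdefault("runtimes", {})["nvidia"] = {
    "args": [],
    "path": "nvidia-container-runtime",
}

with open(path, "w") as f:
    json.dump(cfg, f, indent=2)
EOF
cat "$DAEMON_JSON"
```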
### Configure Kernel Modules

The containerized NFS server used in swarm deployments requires loading the NFS kernel modules:

```shell
sudo modprobe nfs nfsd
```

To persist the module configuration, add the following lines to `/etc/modules-load.d/modules.conf`:

```
nfs
nfsd
```
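The persistence step can also be scripted idempotently, so re-running setup does not duplicate entries. A minimal sketch; it targets a temporary file, so on a real host point `MODULES_CONF` at `/etc/modules-load.d/modules.conf` and run with `sudo`.

```shell
# Sketch: append nfs/nfsd to a modules-load file only if not already present,
# so the setup can be re-run safely.
MODULES_CONF=$(mktemp)  # stand-in for /etc/modules-load.d/modules.conf
for mod in nfs nfsd; do
  grep -qx "$mod" "$MODULES_CONF" || echo "$mod" >> "$MODULES_CONF"
done
cat "$MODULES_CONF"
```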
### AppArmor Profile (optional)
If your environment uses AppArmor, additional configuration is required to allow the containerized NFS server used in swarm deployments to run in privileged mode. First, install the required packages:

```shell
sudo apt-get install apparmor-utils lxc
```

Next, run the following commands:

```shell
cd path/to/dtcvc/deployment
sudo apparmor_parser -r -W config/erichough-nfs.apparmor
sudo modprobe nfs
sudo modprobe nfsd
```

To persist the AppArmor configuration, do the following:

```shell
sudo cp erichough-nfs.apparmor /etc/apparmor.d/
sudo systemctl reload apparmor
```

### Initialize the Swarm

First, at least one host must be initialized as a manager to create the swarm. Be sure to specify the IP address of an interface on the manager which can communicate with all other hosts in the swarm:
```shell
docker swarm init --advertise-addr <MANAGER-IP>
```

Then, on all other hosts, run `docker swarm join` with the arguments shown in the output of the `swarm init` command. For example:

```shell
docker swarm join \
  --token SWMTKN-1-49nj1cmql0jkz5s954yi3oex3nedyz0fb0xx14ie39trti4wxv-8vxv8rssmk743ojnwacrr2e7c \
  192.168.99.100:2377
```

Once all hosts have been added to the swarm, run the following on a manager node to list all the nodes in the swarm:
```shell
docker node ls
```

Node labels can be used to constrain which swarm nodes deployed services are assigned to. For example, to add a `gpu` label to a node, run the following command on the manager node:

```shell
docker node update --label-add gpu=rtx4090 some-node-hostname
```

To remove a label:

```shell
docker node update --label-rm gpu some-node-hostname
```

DTC-VC testbed deployments require some additional configuration and services running on the manager node:
- An overlay network for the testbed
- A containerized NFS server to allow services on other hosts to access the `deployment/data` folder on the manager
- A ROS2 discovery server to allow ROS2 nodes to pass messages (the default discovery mechanism relies on UDP broadcast, which swarm overlay networks do not support)
All of these services can be started with the command:

```shell
./deployment/scripts/swarm-start-services.sh
```

If you need to stop or reset them, run the command:

```shell
./deployment/scripts/swarm-stop-services.sh
```

Container images must be built or loaded onto each host participating in the swarm.
### Using prebuilt container images
- Download the DTC-VC container images package (named `dtcvc-images-rYYYYMMDD.tgz`). For this example its path is `~/path/to/dtcvc-images.tgz`; replace it with the actual location accordingly.
- Load the images by issuing the following:

  ```shell
  docker load -i ~/path/to/dtcvc-images.tgz
  ```

- Use the `deployment/scripts/load-images.sh` script to load images onto a remote host:

  ```shell
  ./deployment/scripts/load-images.sh ~/path/to/dtcvc-images.tgz user@some-ssh-host
  ```
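With several swarm hosts, the load step can be wrapped in a loop. A sketch under assumptions: the `HOSTS` list is hypothetical, and `DRY_RUN=1` only prints each command, since the real script needs SSH access to each host.

```shell
# Sketch: load the image package onto multiple swarm hosts in one pass.
# HOSTS is a hypothetical list; DRY_RUN=1 prints commands instead of running them.
IMAGES_TGZ=~/path/to/dtcvc-images.tgz
HOSTS="user@host-a user@host-b"
DRY_RUN=1
for host in $HOSTS; do
  cmd="./deployment/scripts/load-images.sh $IMAGES_TGZ $host"
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $cmd"
  else
    $cmd
  fi
done
```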
### Building container images locally
- Download and extract the DTC-VC simulator package (named `dtcvc-simulator-rYYYYMMDD.zip`) into a local folder, such as `~/path/to/dtcvc-simulator`.
- Use `deployment/scripts/build-images.sh` to build the images:

  ```shell
  ./deployment/scripts/build-images.sh ~/path/to/dtcvc-simulator/Carla-0.10.0-Linux-Development
  ```
Each deployment of the DTC-VC testbed performs a single run with the configured scenario. The default scenario configuration file is located at `deployment/data/config/testbed/scenario.yml`. Modify this file to change the simulation runtime or the loaded environment.
The swarm deployment is configured using compose files. Two examples are provided:

- `deployment/swarm/single-agent.yml`: A deployment with a single simulator and agent
- `deployment/swarm/solely-sim.yml`: A deployment with a single simulator and no agents, useful for testing the simulator
### Set placement constraints
Swarm placement constraints control which nodes services are allocated to. If no constraints are specified for a service, it may be allocated to any available node in the swarm. For example, constraints can be used to ensure that services which require a GPU are only allocated to hosts with a GPU. To configure the simulator service in the `single-agent.yml` deployment, add a constraint on the `gpu` label (see above for examples of assigning labels to nodes):
```yaml
simulator1:
  ...
  deploy:
    placement:
      constraints:
        - node.labels.gpu == rtx4090
  ...
```

To constrain a service to the manager node, use:

```yaml
- node.role == manager
```

To constrain a service to any node but the manager, use:

```yaml
- node.role != manager
```

To constrain a service to a node with a specific hostname:

```yaml
- node.hostname == isr-ace-tribox
```

The `deployment/scripts/deploy.sh` script encapsulates the `docker stack deploy` command and sets required environment variables used in the swarm deployment compose files. For example, to deploy the `single-agent.yml` compose file:
```shell
./swarm-deploy.sh ./swarm/single-agent.yml
```

This command exits immediately and should not produce any obvious crashes or errors.

To check the status of the deployed testbed services, use the `docker stack ps` command. Add the `--no-trunc` argument to show full error messages:

```shell
docker stack ps single-agent
```

Use the `docker service logs` command to see the logs of the deployed testbed services. Add `-f` at the end to stream the output:

```shell
docker service logs single-agent_scorekeeper
docker service logs single-agent_simulator1
docker service logs single-agent_agent1
```

The testbed services automatically start running the configured scenario once all services are ready. Once the scenario finishes, the testbed services all shut down, but there is no way to automatically clean up the deployment. To stop and remove the deployment, use the `docker stack rm` command:

```shell
docker stack rm single-agent
```

### Using the DTC Testing Container
The DTC Testing Container provides utilities for interacting with the simulator during development. See the documentation for more details.
Final reports generated by the scorekeeper service can be found in the testbed log output directory: `deployment/data/output/testbed/logs/final_score_report_{timestamp_in_ns}.json`. A single report is generated per run of the simulation.

The final score is at the top of the final report. Casualty assessments, listed after the total score, map each casualty ID to its reports, an aggregate, and a score.
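The final score can be pulled out of a report on the command line with `python3`. A sketch under assumptions: the embedded report below is a made-up stand-in, and the actual field names in `final_score_report_*.json` may differ; only the output path pattern comes from the documentation above.

```shell
# Sketch: read a score report and print its final score with python3.
# The report below is a hypothetical stand-in; real report fields may differ.
REPORT=$(mktemp)
cat > "$REPORT" <<'EOF'
{"final_score": 42.5, "casualty_assessments": {}}
EOF
FINAL_SCORE=$(python3 -c 'import json, sys; print(json.load(open(sys.argv[1]))["final_score"])' "$REPORT")
echo "$FINAL_SCORE"
```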
- `docs`: Additional documentation
- `deployment`: Container images, configuration, and scripts used for deployments
- `dtcvc`: Python and ROS2 packages used by the DTC-VC testbed
- `dtcvc/lib/dtcvc-scoring`: The Python library used for scoring triage reports
- `dtcvc/ros/src`: ROS2 packages used by the testbed
- `dtcvc-competitor`: Reference implementation for a competitor agent demonstrating the ROS2 topics used in a deployment and triage report submission
  - A sample Dockerfile for running the reference implementation can be found in `deployment/images/dtcvc-agent`
  - The sample competitor container is provided for reference only. Competitors are not required to base their implementation on the sample; however, they must ensure any custom container they submit behaves similarly to the sample.