dev
If you wish to work further on the CTAT galaxy docker, read this page for some helpful pointers on getting started.
Docker commands generally need root privileges, so run them as root or with sudo.
A Docker image is built from a collection of files, with a file called Dockerfile orchestrating the build. The other files might be things that need to be copied into the image, or helper scripts for building or running the container.
The Dockerfile can contain references to other Docker repositories. The CTAT galaxy Dockerfile starts out with a reference to bgruening/docker-galaxy-stable, which is the standard Galaxy docker image. CTAT then makes tweaks to differentiate itself from the stock galaxy.
The rest of the Dockerfile acts somewhat like a shell script or a Makefile; it outlines a recipe of actions to take that will together create a working container. Here is a quick overview of the commands inside the Dockerfile:
# The FROM command acts like an import statement to reference another Docker image.
FROM dockerhubUser/dockerhubID:tags
# Set an environment variable inside the container with ENV.
# Any env that starts with GALAXY_CONFIG will actually change the behavior of galaxy similarly to changing
# galaxy.ini or galaxy.yaml. The part after the _ must match the name of the tag in those files.
ENV GALAXY_CONFIG_CLEANUP_JOB "never"
# WORKDIR acts like a cd to change the working directory
WORKDIR /galaxy-central
# Run is a sort of catch-all; anything you can do on the host machine, pretty much, you can RUN
RUN scriptname
# ADD and COPY both copy files from the host into the image. Generally, COPY is preferred, as it does not automatically unpack archives.
ADD file_on_host $GALAXY_ROOT/destination_inside_container
COPY file_on_host $GALAXY_ROOT/destination_inside_container
Besides the mastermind Dockerfile, there are other things in this repo:
This file is present to ensure that the Docker Galaxy uses conda appropriately while installing tools. It is probably not necessary to change; however, it may be removed if other behavior is desired.
<dependency_resolvers>
<!-- the default configuration, first look for dependencies installed from the toolshed -->
<tool_shed_packages />
<!-- then look for env.sh files in directories according to the "galaxy packages" schema.
These resolvers can take a base_path attribute to specify where to look for
package definitions, but by default look in the directory specified by tool_dependency_dir
in Galaxy's config/galaxy.ini -->
<galaxy_packages />
<!-- check whether the correct version has been installed via conda -->
<conda />
<!-- look for a "default" symlink pointing to a directory containing an
env.sh file for the package in the "galaxy packages" schema -->
<galaxy_packages versionless="true" />
<!-- look for any version of the dependency installed via conda -->
<conda versionless="true" />
</dependency_resolvers>
This file is meant to help prep the Resource Library on the host system. It still has some TODOs, such as supporting other shells and/or making it POSIX compliant for portability. There may be other conditions I have missed.
Right now, the Docker image I'm basing off of is bgruening/docker-galaxy-stable, which hard-codes the Galaxy user as having UID 1450. This causes problems if your export directory does not belong to that user and is not otherwise writable by it: Galaxy builds itself in the export directory, so it must be able to write there. One possible workaround is to rebuild the Docker image from a base Linux distro and put Galaxy in it; however, this would require replicating, tweaking, or writing from scratch all of the underlying support scripts and services, which would take me a long time and is work the Galaxy team has already done well. There is another Galaxy docker build that we could base off of that DOES have support for uid/gid on-the-fly:
The primary script that gets invoked in the Galaxy Docker container is 'startup'. This starts the worker processes of the Galaxy server, uWSGI. It coordinates with the nginx web server and the postgres database to provide a fully-functional production-ready Galaxy server. It handles job scheduling by starting up a SLURM daemon on the node as well.
Startup is located in // by default.
The root user is the main user if you run /bin/bash as an argument to docker run. The galaxy user, UID 1450, is the service account that starts galaxy proper.
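The ownership problem described above can be checked on the host before starting the container. The following is a minimal sketch, assuming a Linux host with GNU stat; the function names are hypothetical and not part of the CTAT repo, and the UID 1450 comes from the text above.

```shell
#!/bin/sh
# Assumption: GNU coreutils `stat` (Linux). Function names are illustrative only.
GALAXY_UID=1450

# Succeeds if DIR is owned by the expected UID (default: the Galaxy UID).
# Returns 2 if DIR is missing or cannot be inspected, 1 if the owner differs.
export_owned_by() {
    dir=$1
    expected=${2:-$GALAXY_UID}
    [ -d "$dir" ] || return 2
    owner=$(stat -c '%u' "$dir") || return 2
    [ "$owner" -eq "$expected" ]
}

# Prints a suggestion instead of changing ownership automatically.
report_export_dir() {
    dir=$1
    if export_owned_by "$dir"; then
        echo "ok: $dir is owned by UID $GALAXY_UID"
    else
        echo "warning: $dir is missing or not owned by UID $GALAXY_UID"
        echo "consider: sudo chown -R $GALAXY_UID $dir"
    fi
}
```

Calling report_export_dir with the path of your export directory prints either a confirmation or a suggested chown command. The sketch only reports; whether to change ownership is left to you.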
** Caveat emptor ** Buyer beware. If you are using Jetstream and accessing a VM through the GUI (Atmosphere), formatting that volume as a logical drive for Docker will render your VM unbootable and the volume unrecoverable. Atmosphere assumes all volumes are ext4 and will crash if that assumption is violated.
If you are not using a mounted volume or Jetstream, you may wish at some point to slough the shackles of Overlay with its 10 GB disk limit. You can use logical volumes to create larger disks inside Docker. It's a huge pain, but it can be done:
systemctl stop docker
## Ok, I used emacs, i'm not this cool...
cat > /etc/docker/daemon.json <<EOF
{
  "storage-driver": "devicemapper",
  "storage-opts": [
    "dm.thinpooldev=/dev/mapper/docker-thinpool",
    "dm.use_deferred_removal=true",
    "dm.use_deferred_deletion=true"
  ]
}
EOF
systemctl start docker
docker info
# Complains about lvm loopback
# Unmount the filesystem volumes from Jetstream
umount /vol_b
umount /vol_c
pvcreate /dev/sdb
#WARNING: ext4 signature detected on /dev/sdb at offset 1080. Wipe it? [y/n]: y
# Wiping ext4 signature on /dev/sdb.
# Physical volume "/dev/sdb" successfully created.
pvcreate /dev/sdc
vgcreate docker /dev/sdb
# Volume group "docker" successfully created
lvcreate --wipesignatures y -n thinpool docker -l 93%VG
lvcreate --wipesignatures y -n thinpoolmeta docker -l 3%VG
lvconvert -y --zero n -c 512K --thinpool docker/thinpool --poolmetadata docker/thinpoolmeta
cat > /etc/lvm/profile/docker-thinpool.profile <<EOF
activation {
  thin_pool_autoextend_threshold=80
  thin_pool_autoextend_percent=20
}
EOF
lvchange --metadataprofile docker-thinpool docker/thinpool
lvs -o+seg_monitor
# Make sure the daemon is stopped before moving its data directory
systemctl stop docker
mkdir /var/lib/docker.bk
mv /var/lib/docker/* /var/lib/docker.bk
systemctl start docker
docker info
...
Storage Driver: devicemapper
Pool Name: docker-thinpool
...
- Apt-get fails; it takes forever or doesn't find something you know exists
I've run into this in a fresh VM. Usually an apt-get update will fix it.
- Docker complains about X11 when you try docker login on Jetstream. The Docker image on Jetstream has some issues. I solved this by removing some packages:
sudo apt-get purge docker-credential-pass
sudo apt-get purge docker-compose
sudo apt-get remove golang-docker-credential-helpers
- Docker build eats up way too much space. I've found that on many systems, it doesn't matter how much disk you start with; Docker will consume it all. Remove excess docker build files by running:
sudo docker system prune -af
Be careful: this will wipe out some of the prerequisite repositories (cached images and layers) from disk, so your next Docker image builds may take a lot longer.
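Since pruning is destructive, it may help to look at how full the disk is, and what Docker is holding, before deciding to prune. This is a rough sketch only: the function names and the 90 percent default threshold are hypothetical, and it assumes a POSIX df and awk.

```shell
#!/bin/sh
# Print the used-space percentage (integer, no % sign) of the filesystem
# that holds the given path (default: /).
disk_used_percent() {
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed if usage is at or above THRESHOLD percent.
# The default of 90 is an arbitrary, illustrative value.
disk_nearly_full() {
    used=$(disk_used_percent "${1:-/}")
    [ "$used" -ge "${2:-90}" ]
}

# Show what Docker is using (images, containers, volumes, build cache).
# Only runs docker if it is on the PATH.
docker_space_report() {
    if command -v docker >/dev/null 2>&1; then
        sudo docker system df
    else
        echo "docker not found on PATH"
    fi
}
```

For example, `disk_nearly_full /var/lib/docker && docker_space_report` prints Docker's usage summary only when the filesystem holding Docker's data directory is at least 90 percent full.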
While building, you can try to minimize the bloat by running:
sudo docker build --no-cache -t my-docker-name .
my-docker-name can be whatever you would like to name your image.
There are issues with the Ubuntu for docker Jetstream image, similar to here: https://github.com/docker/compose/issues/6023