This repo contains a Docker Compose stack for running the MediaWiki software.
Clone the repo. Then create and start the containers:
```shell
cd docker-bugsigdb.org
docker compose up --no-start
# copy a database dump (*.sql or *.sql.gz) to the __initdb directory if needed
docker run --rm -v <images/directory>:/source -v <volume_prefix>_images:/target busybox cp -a /source/. /target/
# copy .env.example to .env and modify as needed (see the Settings section)
cp .env.example .env
docker compose up -d
```

Wait for the build and initialization process to complete, then access the wiki at http://localhost:8081 in a browser.
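If you want to restore an existing database, the dump must land in `__initdb` as `*.sql` or `*.sql.gz` before the first start. A minimal sketch of staging a gzipped dump (the `dump.sql` file here is a demo placeholder; substitute your real dump):

```shell
# Stage a gzipped database dump into __initdb before the first start.
mkdir -p __initdb
printf '%s\n' '-- demo dump --' > dump.sql   # demo only: stands in for a real dump
gzip -c dump.sql > __initdb/dump.sql.gz      # compressed copy picked up at init
```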
Running `docker compose up -d` will start the following containers:

- `db` - MySQL official container, used as the database backend for MediaWiki
- `web` - Apache/MediaWiki container (Taqasta) with PHP 7.4 and MediaWiki 1.39.x
- `redis` - Redis, an open-source key-value store used as the cache backend
- `matomo` - Matomo analytics instance (disabled by default; requires the `matomo` profile to be set)
- `elasticsearch` - advanced search engine
- `varnish` - a reverse caching proxy and HTTP accelerator
- `restic` - (production only) modern backup container performing incremental backups to both S3 storage and Google Cloud Storage (GCS)
- `updateEFO` - (production only) a Python script that automatically updates EFO links on glossary pages
Settings can be adjusted via the `.env` file created from `.env.example`. Environment and other general configuration live in the `environment` sections of `compose.yml` and the environment-specific overrides (`compose.staging.yml`, `compose.PRODUCTION.yml`).
Additionally:
- `_resources` directory: contains the favicon, logo, styles, and customizations for the Chameleon skin and additional MediaWiki extensions.
- `_settings/LocalSettings.php`: contains settings for MediaWiki core and extensions. If customization is required, change them there.
- For production backups with restic, create the file `./secrets/restic-GCS-account.json` containing your Google Cloud Storage credentials.
The database used is the official MySQL 8 container.
The most important environment variable is MYSQL_ROOT_PASSWORD; it specifies the password set for the MySQL root superuser account.
If changed, ensure corresponding database passwords (MW_DB_PASS in the web section) are updated accordingly.
- `COMPOSE_PROFILES` - enables services with the selected profiles. Available profiles: `matomo`
- `MW_SITE_SERVER` - configures `$wgServer`; set this to the server host and include the protocol, e.g. `https://bugsigdb.org`
- `MW_SITE_NAME` - configures `$wgSitename`
- `MW_SITE_LANG` - configures `$wgLanguageCode`
- `MW_DEFAULT_SKIN` - configures `$wgDefaultSkin`
- `MW_ENABLE_UPLOADS` - configures `$wgEnableUploads`
- `MW_ADMIN_USER` - configures the default administrator username
- `MW_ADMIN_PASSWORD` - configures the default administrator password
- `MW_DB_NAME` - specifies the database name MediaWiki uses
- `MW_DB_USER` - specifies the DB user MediaWiki uses; default is `root`
- `MW_DB_PASS` - specifies the DB user password; must match your MySQL password
- `MW_PROXY_SERVERS` - configures `$wgSquidServers` for reverse proxies (typically `varnish:80`)
- `MW_MAIN_CACHE_TYPE` - configures `$wgMainCacheType` (`CACHE_REDIS` is recommended)
- `MW_LOAD_EXTENSIONS` - comma-separated list of MediaWiki extensions to load during container startup
- `MW_LOAD_SKINS` - comma-separated list of MediaWiki skins available for use
- `MW_SEARCH_TYPE` - configures the search backend (typically `CirrusSearch`)
- `MW_NCBI_TAXONOMY_API_KEY`, `MW_RECAPTCHA_SITE_KEY`, `MW_RECAPTCHA_SECRET_KEY` - optional third-party API keys
- `MW_ENABLE_SITEMAP_GENERATOR` - enables the sitemap generator script on production (`true`/`false`)
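A minimal `.env` sketch tying the variables above together (all values are illustrative placeholders, not the repo's defaults):

```
MW_SITE_SERVER=https://bugsigdb.org
MW_SITE_NAME=BugSigDB
MW_SITE_LANG=en
MW_DB_USER=root
MW_DB_PASS=change-me            # must match MYSQL_ROOT_PASSWORD when MW_DB_USER=root
MW_MAIN_CACHE_TYPE=CACHE_REDIS
MW_SEARCH_TYPE=CirrusSearch
MW_PROXY_SERVERS=varnish:80
```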
The restic container handles scheduled backups (weekly/monthly retention settings) through incremental snapshots:
- `RESTIC_PASSWORD` - password used to encrypt backups
- `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` - access credentials for S3-compatible storage
- `BACKUP_CRON`, `CHECK_CRON` - cron schedules for the automatic backup and check operations
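The two cron variables take standard five-field cron expressions. An illustrative example (these schedules are placeholders, not the repo's defaults):

```
BACKUP_CRON=0 3 * * 1    # example: backup every Monday at 03:00
CHECK_CRON=0 5 1 * *     # example: repository check on the 1st of each month at 05:00
```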
This Python-based container automatically updates EFO terms and links in the glossary:
- `UPDATE_EFO_BOT_PASSWORD` - authentication password for the bot account
- `UPDATE_EFO_PAUSE` - update frequency in seconds (default 86400 sec / 24 h)
Note: the script can put extra load on the wiki, so it is recommended to schedule it for night time. Also keep in mind that processing all the pages takes time; an average script cycle is ~4-8 hours. You can change the sleep timeouts via the `-z` parameter.
The Matomo instance provides website analytics:
- Default admin username: `admin`
- `MATOMO_PASSWORD` - sets the initial password for the Matomo administration panel
- `MATOMO_MYSQL_ROOT_PASSWORD` - MySQL root password for the Matomo database
- `MATOMO_MYSQL_PASSWORD` - MySQL user password for the Matomo database
The Varnish container is used as a reverse proxy and front-end cache server:

- `VARNISH_SIZE` - amount of RAM dedicated to caching (e.g., `100m`)
- `BASIC_USERNAME` - HTTP basic-auth username
- `BASIC_PASSWORD` - HTTP basic-auth password (hashed using `openssl passwd -apr1`)
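Generating the hashed value for `BASIC_PASSWORD` can be sketched as follows; `s3cret` and the salt are demo values, so pick your own (omitting `-salt` lets openssl choose a random one):

```shell
# Produce an APR1 (htpasswd-style) hash suitable for BASIC_PASSWORD.
BASIC_PASSWORD=$(openssl passwd -apr1 -salt Xr5u2bq1 s3cret)
echo "$BASIC_PASSWORD"   # $apr1$Xr5u2bq1$... (hash of the demo password)
```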
Bind mounts expose a directory or file from the host inside a container. We use:

- `./__initdb` directory: used to pass a database dump for stack initialization

When the `matomo` profile is enabled, the stack expects the `/matomo/data` and `/matomo/__initdb` directories to exist on the host.
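A sketch of preparing those directories before enabling the profile; `./matomo` is an assumed base path under the stack directory, so adjust it to wherever your host actually keeps Matomo data:

```shell
# Create the host directories the matomo profile expects (placeholder base path).
MATOMO_BASE=./matomo
mkdir -p "$MATOMO_BASE/data" "$MATOMO_BASE/__initdb"
```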
Data that must be persistent across container life cycles is stored in Docker volumes:

- `db_data` (MySQL databases and working directories, attached to the `db` service)
- `elasticsearch_data` (Elasticsearch nodes, attached to the `elasticsearch` service)
- `web_data` (miscellaneous MediaWiki files and directories that must be persistent by design, attached to the `web` service)
- `images` (MediaWiki upload directory, attached to the `web` service and used read-only by the `restic` service)
- `redis_data` (Redis cache)
- `varnish_data` (Varnish cache)
- `matomo_data` (analytics data)
- `restic_data` (space mounted to the `restic` service for operations with snapshots)
Docker containers write files to volumes using internal users.
Log files are stored in the _logs directory.
Before upgrading, make a full backup of the wiki, including both the database and the files. While the upgrade scripts are well maintained and robust, things could still go awry.
```shell
cd <docker stack directory>
docker compose exec db /bin/bash -c 'mysqldump --all-databases -uroot -p"$MYSQL_ROOT_PASSWORD" 2>/dev/null | gzip | base64 -w 0' | base64 -d > backup_$(date +"%Y%m%d_%H%M%S").sql.gz
docker compose exec web /bin/bash -c 'tar -c $MW_VOLUME $MW_HOME/images 2>/dev/null | base64 -w 0' | base64 -d > backup_$(date +"%Y%m%d_%H%M%S").tar
```

To pick up the latest changes, stop, rebuild, and start the containers:
```shell
cd <docker stack directory>
git pull
docker compose up -d
```

The upgrade process is fully automated and includes the launch of all necessary maintenance scripts.
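The backup commands above base64-encode the stream inside the container and decode it on the host, which keeps binary data intact when piped through `docker compose exec`. The round trip itself can be checked locally without the stack:

```shell
# Local check of the encode/decode round trip used by the backup commands.
printf 'demo\000binary\000data' > original.bin        # file with NUL bytes
base64 -w 0 original.bin | base64 -d > restored.bin   # encode, then decode
cmp original.bin restored.bin && echo "round trip OK"
```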
The image is configured to automatically purge the homepage once per hour. You can configure this using the following environment variables:

```
MW_CACHE_PURGE_PAUSE=3600
MW_CACHE_PURGE_PAGE=Main_Page
```
The deployment is organized as follows:
- `compose.yml`: common container definitions, typically used in the development environment
- `compose.staging.yml`: staging-specific overrides (hostnames, basic auth)
- `compose.PRODUCTION.yml`: production-specific overrides, including health checks, backups, and special scripts
Before running docker compose commands, link your environment configuration as follows:
```shell
ln -sf compose.staging.yml compose.override.yml    # staging environment
# OR
ln -sf compose.PRODUCTION.yml compose.override.yml # production environment
```

Then use `docker compose up -d` as usual. Docker Compose automatically merges the files.
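Merging follows standard Compose override semantics: keys in `compose.override.yml` are layered on top of `compose.yml`. As a purely illustrative fragment (not a file from this repo), an override could change a single variable of the `web` service while everything else comes from the base file:

```yaml
services:
  web:
    environment:
      - MW_SITE_SERVER=https://staging.example.org   # placeholder hostname
```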
To work around T333776 we run maintenance/updateSpecialPages.php once a day. This ensures the count of active users on Special:CreateAccount stays up to date.
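A host-side crontab entry for such a daily run might look like the following sketch (illustrative only; the repo may schedule this differently, e.g. inside the container, and `/path/to/stack` is a placeholder):

```
# hypothetical crontab entry: run updateSpecialPages.php daily at 02:30
30 2 * * * cd /path/to/stack && docker compose exec -T web php maintenance/updateSpecialPages.php >/dev/null 2>&1
```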
- bugsigdb.org: A Comprehensive Database of Published Microbial Signatures
- BugSigDB issue tracker: Report bugs or feature requests for bugsigdb.org
- BugSigDBExports: Hourly data exports of bugsigdb.org
- Stable data releases: Periodic manually-reviewed stable data releases on Zenodo
- bugsigdbr: R/Bioconductor access to published microbial signatures from BugSigDB
- Curation issues: Report curation issues, request studies to be added
- bugSigSimple: Simple analyses of BugSigDB data in R
- BugSigDBStats: Statistics and trends of BugSigDB
- BugSigDBPaper: Reproduces analyses of the Nature Biotechnology publication
- community-bioc Slack Team: Join #bugsigdb channel