| 1 | +# Secret Management Strategy |
| 2 | + |
| 3 | +## 1. Context |
| 4 | + |
| 5 | +The Torrust Tracker application requires the management of sensitive information (secrets) to |
| 6 | +operate correctly. These secrets include database credentials, API tokens, and other sensitive |
| 7 | +parameters. |
| 8 | + |
| 9 | +In the previous Proof of Concept (PoC), secrets were managed through a `.env` file stored on |
| 10 | +the host virtual machine (VM). This file was used by Docker Compose to inject secrets into |
| 11 | +running containers and was also sourced by host-level scripts (e.g., for database backups). |
| 12 | + |
| 13 | +This approach, while simple, stores secrets in plaintext, so anyone who can read the file |
| 14 | +obtains every secret it contains. As we move to a production-grade design, we must formalize |
| 15 | +our secret management strategy, balancing security, operational simplicity, and the technical |
| 16 | +constraints of our chosen services. |
| 17 | + |
| 18 | +This decision is documented in |
| 19 | +**[ADR-004: Configuration Approach - Files vs Environment Variables](../adr/004-configuration-approach-files-vs-environment-variables.md)**. |
| 20 | + |
| 21 | +## 2. The Challenge: Service-Specific Configuration |
| 22 | + |
| 23 | +While the twelve-factor app methodology advocates for strict configuration via environment |
| 24 | +variables, not all services support this pattern. A key challenge in our stack is |
| 25 | +**Prometheus**, which does not support runtime environment variable substitution in its |
| 26 | +configuration files. |
| 27 | + |
| 28 | +As noted in ADR-004, this means that any secrets required by Prometheus (such as an API |
| 29 | +token for scraping a protected endpoint) must be embedded directly into the `prometheus.yml` |
| 30 | +file at deployment time. This technical constraint forces us to adopt a hybrid configuration |
| 31 | +strategy. |
| 32 | + |
| 33 | +## 3. Proposed Strategy: Centralized Plaintext Configuration |
| 34 | + |
| 35 | +We will adopt a strategy that centralizes secrets in plaintext files within a protected |
| 36 | +directory on the host VM. This approach acknowledges the limitations of our stack while |
| 37 | +providing a clear, maintainable, and operationally simple system. |
| 38 | + |
| 39 | +1. **Primary Secrets File (`.env`):** |
| 40 | + |
| 41 | + - A primary `.env` file will be located at `/var/lib/torrust/compose/.env`. |
| 42 | + - This file will contain the majority of secrets, such as database credentials, |
| 43 | + Grafana passwords, and the tracker's admin token. |
| 44 | + - Docker Compose will use this file to inject secrets into the relevant service |
| 45 | + containers (Tracker, MySQL, Grafana, etc.) at runtime. |
| 46 | + |
| 47 | +2. **Service-Specific Configuration Files:** |
| 48 | + |
| 49 | + - For services that do not support environment variables for secrets (i.e., |
| 50 | + Prometheus), the secrets will be embedded directly into their configuration files |
| 51 | + (e.g., `/var/lib/torrust/prometheus/etc/prometheus.yml`). |
| 52 | + - These configuration files will be generated from templates during the `app-deploy` |
| 53 | + process, where secret values are substituted from the main environment |
| 54 | + configuration. |
| 55 | + |
| 56 | +3. **Containerized Backups:** |
| 57 | + - To avoid exposing database credentials to the host's `cron` system, database |
| 58 | + backups will be performed by a dedicated, short-lived `torrust-backup` container. |
| 59 | + - This container will be launched by a simple `cron` job on the host |
| 60 | + (`docker compose run --rm torrust-backup`). |
| 61 | + - The backup container will receive the necessary database credentials from the |
| 62 | + `.env` file via Docker Compose, ensuring that secrets do not need to be read or |
| 63 | + managed by host-level scripts. |
| 64 | + |
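The template step described in item 2 can be sketched as follows. This is a minimal illustration, not the actual `app-deploy` implementation: the variable name `TRACKER_ADMIN_TOKEN`, the `${NAME}` placeholder syntax, and the scrape target, path, and port are all assumptions made for the example, and the parser handles only simple `KEY=VALUE` lines rather than the full Docker Compose `.env` syntax.

```python
from string import Template


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes; no escape handling is attempted.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def render_config(template_text: str, env: dict[str, str]) -> str:
    """Substitute ${NAME} placeholders; raises KeyError if a value is missing."""
    return Template(template_text).substitute(env)


# Hypothetical template: job name, metrics path, and target are illustrative.
PROMETHEUS_TEMPLATE = """\
scrape_configs:
  - job_name: tracker_metrics
    metrics_path: /api/v1/metrics
    params:
      token: ["${TRACKER_ADMIN_TOKEN}"]
    static_configs:
      - targets: ["tracker:1212"]
"""

if __name__ == "__main__":
    env = parse_env("# example values only\nTRACKER_ADMIN_TOKEN=MyAccessToken\n")
    print(render_config(PROMETHEUS_TEMPLATE, env))
```

Using `substitute` rather than `safe_substitute` makes a missing secret fail the deployment loudly instead of writing an unresolved placeholder into the generated file.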
| 65 | +### Benefits of this Strategy |
| 66 | + |
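| 67 | +- **Operational Simplicity:** Secrets are easy for administrators to manage. They can be |
| 68 | +  rotated by editing the `.env` file, regenerating any templated configuration files that |
| 69 | +  embed the changed value (e.g., `prometheus.yml`), and recreating the affected containers. |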
| 69 | +- **Self-Contained System:** The VM is fully self-sufficient after deployment. The |
| 70 | + installer machine can be discarded. |
| 71 | +- **Handles Exceptions:** The strategy explicitly accounts for services like Prometheus |
| 72 | + that cannot use environment variables for secrets. |
| 73 | + |
| 74 | +### The Prometheus Precedent |
| 75 | + |
| 76 | +The decision to embed secrets directly into configuration files for certain services is not |
| 77 | +merely a workaround but aligns with the design philosophy of major tools in our stack. The |
| 78 | +Prometheus development team has explicitly stated their position on this matter, confirming |
| 79 | +that the intended and supported method for providing secrets is through the configuration |
| 80 | +file itself. |
| 81 | + |
| 82 | +In a long-standing GitHub issue in the Alertmanager repository, |
| 83 | +**[Support for secrets set in ENV variables #504]**, the Prometheus team |
| 84 | +explains that it supports only one method of configuration to maintain |
| 85 | +simplicity and consistency. When asked about supporting environment variables for secrets, a |
| 86 | +core developer stated: |
| 87 | + |
| 88 | +[Support for secrets set in ENV variables #504]: https://github.com/prometheus/alertmanager/issues/504 |
| 89 | + |
| 90 | +> The chosen approach is to put them in the config file. There's many many possible ways |
| 91 | +> to provide configuration, for sanity we have to choose just one of them. |
| 92 | + |
| 93 | +This official stance validates our hybrid approach. It confirms that for services like |
| 94 | +Prometheus, managing secrets via file-based configuration is the expected pattern, not an |
| 95 | +anti-pattern. Our strategy, therefore, is consistent with the operational principles of the |
| 96 | +tools we use. |
| 97 | + |
| 98 | +## 4. Security Considerations |
| 99 | + |
| 100 | +This strategy involves storing secrets in plaintext on the VM's filesystem. It is crucial |
| 101 | +to understand the security implications. |
| 102 | + |
| 103 | +If an attacker gains root-level or `torrust` user access to the host VM, they can |
| 104 | +compromise the application's secrets. The security of this model relies on the security of |
| 105 | +the host VM itself. |
| 106 | + |
| 107 | +An attacker with access to the host could: |
| 108 | + |
| 109 | +1. **Read Plaintext Files:** Directly read the contents of |
| 110 | + `/var/lib/torrust/compose/.env` and any other configuration files containing secrets. |
| 111 | +2. **Inspect Running Containers:** Use `docker inspect` on any running container to view |
| 112 | + all the environment variables that were passed to it. |
| 113 | +3. **Execute Commands in Containers:** Use `docker exec` to gain a shell inside a running |
| 114 | + container and then use commands like `env` or `printenv` to list all environment |
| 115 | + variables. |
| 116 | + |
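For the first item, exposure can at least be narrowed to root and the owning user by keeping secret files unreadable to group and others. The check below is a minimal sketch under that assumption; this document does not specify file permissions, and the function name `is_owner_only` is hypothetical.

```python
import os
import stat
import sys


def is_owner_only(path: str) -> bool:
    """Return True if group and others have no permission bits on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    # Usage: python check_perms.py /var/lib/torrust/compose/.env [more files...]
    for secret_path in sys.argv[1:]:
        status = "ok" if is_owner_only(secret_path) else "TOO PERMISSIVE"
        print(f"{secret_path}: {status}")
```

Such a check does not protect against an attacker who already has root or `torrust` user access; it only guards against accidental exposure to other local accounts.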
| 117 | +This strategy prioritizes operational simplicity and compatibility with our service stack |
| 118 | +over achieving the highest possible level of security (which would require an external |
| 119 | +secrets manager like HashiCorp Vault). The primary defense is hardening the host VM itself |
| 120 | +through measures like: |
| 121 | + |
| 122 | +- A restrictive firewall (`ufw`). |
| 123 | +- SSH key-only authentication. |
| 124 | +- Intrusion prevention tools (`fail2ban`). |
| 125 | +- Regular security updates. |
| 126 | + |
| 127 | +This approach is deemed an acceptable risk for the project's scope, providing a |
| 128 | +significant improvement over the PoC by centralizing configuration and containerizing |
| 129 | +auxiliary tasks like backups. |