This repository was archived by the owner on Oct 10, 2025. It is now read-only.

Commit 518ccf5

feat: research on potential tools for the new installer
1 parent 309fc30 commit 518ccf5

4 files changed: +338 -1 lines changed

docs/redesign/phase2-analysis/07-summary-and-recommendations.md

Lines changed: 0 additions & 1 deletion
@@ -87,4 +87,3 @@ proof of concept but must be addressed in a production-grade system.
 distributed-systems approach.
 - **Insecure Secret Handling**: Integrate a proper secrets management tool.
 - **Lack of Idempotency**: Ensure all automation scripts are fully idempotent.
-
Lines changed: 149 additions & 0 deletions
@@ -0,0 +1,149 @@
# Integrated Toolchain Workflow Proposal

This document outlines a proposed workflow that combines the recommended tools
(Ansible, Tera, SOPS, OpenTofu) into a cohesive, modern installer for the
Torrust Tracker.

## 🎯 Design Goals

- **Automation**: Achieve 90%+ automation for a fresh deployment.
- **Simplicity**: The user interaction should be as simple as `make deploy-local` or
  `make deploy-production`.
- **Security**: Secrets are managed securely using SOPS and are never stored in plaintext in
  the repository.
- **Flexibility**: The architecture supports multiple providers (libvirt, Hetzner, AWS) and
  environments (local, staging, production).
- **Idempotency**: Running the deployment process multiple times results in the same state.

## Proposed Workflow

The deployment is broken down into four distinct stages, orchestrated by a root `Makefile`.

```mermaid
graph TD
    subgraph user["User Interaction"]
        A["1. Configure Environment: <br> `local.env` or `production.env`"] --> B{"`make deploy`"};
    end

    subgraph stage1["Stage 1: Build & Package (Local Machine)"]
        B --> C{"Tera <br> `render_configs.sh`"};
        D["SOPS <br> `secrets.enc.yaml`"] --> C;
        C --> E["Build Artifact <br> `build/deployment-package.tar.gz`"];
    end

    subgraph stage2["Stage 2: Provision Infrastructure (IaC)"]
        B --> F{"OpenTofu <br> `tofu apply`"};
        F --> G["Provisioned VM <br> (e.g., Hetzner Cloud)"];
        F --> H["Ansible Inventory <br> `inventory.ini`"];
    end

    subgraph stage3["Stage 3: Deploy & Configure (Remote VM)"]
        E --> I{"Ansible Playbook <br> `deploy_application.yml`"};
        H --> I;
        I --> J["Copy Artifact & Unpack"];
        J --> K["Configure System <br> (Firewall, Docker)"];
        K --> L["Start Docker Services <br> `docker compose up`"];
    end

    subgraph stage4["Stage 4: Validation"]
        L --> M["Run Health Checks"];
    end

    style A fill:#f9f,stroke:#333,stroke-width:2px
    style E fill:#bbf,stroke:#333,stroke-width:2px
    style G fill:#bbf,stroke:#333,stroke-width:2px
    style L fill:#bbf,stroke:#333,stroke-width:2px
```

### Stage 1: Build & Package (Local Machine)

This stage runs on the contributor's local machine and prepares a self-contained deployment
artifact.

1. **User Configuration**: The user defines their target environment by creating a `.env` file
   (e.g., `cp env.template local.env`). This file contains all non-secret configuration
   values like domain names, VM size, and feature flags.

2. **Secrets Management (SOPS)**: All secrets (API keys, database passwords) are stored in an
   encrypted YAML file, `secrets.enc.yaml`. This file can be safely committed to the
   repository. The user decrypts it locally using their GPG key
   (`sops -d secrets.enc.yaml > secrets.dec.yaml`). The decrypted file is plaintext and
   must never be committed.

3. **Template Rendering (Tera)**: A build script (e.g., `scripts/build.sh`) uses **Tera** to
   render all necessary configuration files from templates (`*.tpl`).

   - It combines values from the user's `.env` file and the decrypted `secrets.dec.yaml`.
   - **Output**: A `build/` directory containing the final, plaintext configuration files
     (`tracker.toml`, `compose.yaml`, `prometheus.yml`, etc.).

4. **Artifact Creation**: The `build/` directory is packaged into a single tarball
   (`build/deployment-package.tar.gz`). This artifact is the only thing that will be
   transferred to the target server.

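
The build stage described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the project's build script: `string.Template` stands in for Tera, the `.env` parser is deliberately naive, and the function names (`parse_env`, `render_configs`, `package`), the template content, and the secrets dictionary are all hypothetical. In the proposed workflow the secrets would come from the SOPS-decrypted file rather than a literal.

```python
import tarfile
import tempfile
from pathlib import Path
from string import Template


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments (no quoting rules)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def render_configs(templates: dict[str, str], context: dict[str, str],
                   build_dir: Path) -> list[Path]:
    """Render each template into build_dir; a missing value raises KeyError."""
    build_dir.mkdir(parents=True, exist_ok=True)
    rendered = []
    for name, source in templates.items():
        target = build_dir / name
        target.write_text(Template(source).substitute(context))
        rendered.append(target)
    return rendered


def package(files: list[Path], archive: Path) -> Path:
    """Pack the rendered files into a gzip tarball, stored by file name only."""
    with tarfile.open(archive, "w:gz") as tar:
        for path in files:
            tar.add(path, arcname=path.name)
    return archive


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = parse_env("DOMAIN=tracker.example.com\n")
        secrets = {"DB_PASSWORD": "example-only"}  # hypothetical; normally decrypted by SOPS
        templates = {"tracker.toml": 'domain = "$DOMAIN"\npassword = "$DB_PASSWORD"\n'}
        files = render_configs(templates, {**env, **secrets}, Path(tmp) / "build")
        print(package(files, Path(tmp) / "deployment-package.tar.gz").name)
```

The sketch writes the archive next to the rendered files rather than inside the directory being packaged, so the tarball never has to exclude itself. Failing on a missing value (instead of rendering an empty string) is a deliberate choice: a half-rendered configuration file is harder to debug on the server than a failed build.
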
### Stage 2: Provision Infrastructure (Remote)

This stage creates the remote server and prepares it for application deployment.

1. **Infrastructure as Code (OpenTofu)**: `make infra-apply` triggers **OpenTofu**.

   - OpenTofu reads the provider configuration (e.g., `hetzner.tf`) and variables from the
     user's `.env` file.
   - **Crucially**, it uses a minimal `cloud-init` to install only what's necessary for
     Ansible to connect (e.g., Python).

2. **Inventory Generation**: After provisioning, OpenTofu outputs the IP address of the new
   VM into an **Ansible inventory file** (`inventory.ini`).

   ```ini
   [tracker]
   torrust-tracker-demo ansible_host=123.45.67.89
   ```

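
One possible way to produce that inventory is to post-process the JSON printed by `tofu output -json`, which maps each output name to an object with a `value` field. The sketch below assumes this route; the output name `instance_ip` and the function `render_inventory` are hypothetical, and the proposal could equally have OpenTofu write the file directly.

```python
import json


def render_inventory(tofu_output_json: str, output_name: str = "instance_ip",
                     host_alias: str = "torrust-tracker-demo",
                     group: str = "tracker") -> str:
    """Build an INI inventory from the JSON printed by `tofu output -json`."""
    outputs = json.loads(tofu_output_json)
    if output_name not in outputs:
        raise KeyError(f"OpenTofu output {output_name!r} not found")
    ip_address = outputs[output_name]["value"]
    return f"[{group}]\n{host_alias} ansible_host={ip_address}\n"


if __name__ == "__main__":
    # Sample shaped like `tofu output -json`; the output name is an assumption.
    sample = json.dumps(
        {"instance_ip": {"sensitive": False, "type": "string", "value": "123.45.67.89"}}
    )
    print(render_inventory(sample), end="")
```

Run as a script, this prints the same two-line inventory shown above. Raising on a missing output keeps a misnamed or absent OpenTofu output from silently producing an inventory with no usable host.
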
### Stage 3: Deploy & Configure (Remote)

This stage uses Ansible to configure the provisioned server and launch the application.

1. **Ansible Playbook**: `make app-deploy` runs the main **Ansible playbook**
   (`ansible/deploy.yml`).

2. **Artifact Transfer**: The first step in the playbook is to copy the
   `build/deployment-package.tar.gz` to the remote server and unpack it into `/opt/torrust/`.

3. **System Configuration**: The playbook performs system-level setup:

   - Installs Docker and Docker Compose.
   - Configures the firewall (UFW), SSH brute-force protection (fail2ban), and system
     services.
   - Sets up persistent storage directories and permissions.

4. **Application Launch**: The final step is to run `docker compose up -d` using the
   rendered `compose.yaml` from the artifact. All services start up, configured with the
   correct secrets and settings.

### Stage 4: Validation & Monitoring

This final stage ensures the deployment is healthy and observable.

1. **Health Checks**: An Ansible task runs health checks against the deployed services:

   - Pings API endpoints (`/api/health_check`).
   - Verifies database connectivity.
   - Checks that all containers are running.

2. **Monitoring**: The deployed stack includes Prometheus and Grafana for monitoring.
   - Prometheus scrapes metrics from the tracker.
   - Grafana provides dashboards for visualizing tracker performance.

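
The health-check step can be illustrated with a small polling loop. This is a sketch in Python rather than the Ansible task the proposal has in mind; the function names, retry count, and delay are assumptions, and only the `/api/health_check` path comes from the text above.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 when the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except OSError:
        return 0  # connection refused, DNS failure, timeout, ...


def wait_until_healthy(url: str, attempts: int = 10, delay: float = 3.0,
                       probe: Callable[[str], int] = http_status,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll url until it answers 200 or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if probe(url) == 200:
            return True
        if attempt < attempts:
            sleep(delay)  # give the containers time to finish starting
    return False
```

A caller would pass the tracker's health URL, for example `wait_until_healthy("http://<host>:<port>/api/health_check")`, and fail the deployment when it returns `False`. Retrying matters here because `docker compose up -d` returns before the services are ready to answer requests. The `probe` and `sleep` parameters exist only so the loop can be exercised without a network.
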
## Tool Interaction Summary

- **Makefile**: The main entry point, orchestrating all stages.
- **SOPS**: Manages secrets, decrypting them for use during the build stage.
- **Tera**: Renders configuration templates using data from `.env` files and decrypted secrets.
- **OpenTofu**: Provisions the raw infrastructure and prepares it for Ansible.
- **Ansible**: Handles all configuration management on the target machine, ensuring the
  application is deployed consistently and correctly.

This workflow provides a clear separation of concerns:

- **Building**: Rendering environment-specific settings and secrets into a deployable
  artifact (SOPS + Tera).
- **Provisioning**: Creating the required cloud infrastructure (OpenTofu).
- **Configuration**: Setting up the server and launching the application from the artifact
  (Ansible).
Lines changed: 187 additions & 0 deletions
@@ -0,0 +1,187 @@
# Tools Evaluation for Torrust Tracker Redesign

This document provides a high-level evaluation of potential tools that could fit into
the new design of the Torrust Tracker deployment system.

## 1. Configuration Management: Ansible

### Overview

Ansible is an open-source tool that automates software provisioning,
configuration management, and application deployment. It uses YAML for its playbooks,
which makes it relatively easy to read and write.

### Potential Fit

- **Strengths**:

  - **Agentless**: No agent has to be installed on the managed nodes; SSH access and
    Python are enough.
  - **Idempotent**: Most modules are idempotent, so running a playbook multiple times
    converges on the same system state. Tasks that use `command` or `shell` need
    explicit guards to stay idempotent.
  - **Large Community**: A vast number of pre-built modules and roles are available.
  - **Good for Orchestration**: Can manage complex workflows across multiple servers.

- **Weaknesses**:

  - **Performance**: Can be slower than agent-based systems for a large number of
    nodes.
  - **YAML Complexity**: While easy to start, complex logic can make YAML files hard
    to manage.

- **Use Case for Torrust**:
  - Could replace many of the existing shell scripts for application configuration
    and deployment (`deploy-app.sh`).
  - Could manage the setup of the tracker, nginx, prometheus, etc., in a more
    structured way than cloud-init alone.

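
To make the idempotency point concrete, here is a small Python sketch of an operation that describes a desired state and reports whether it had to change anything. It is loosely analogous in spirit to what a module such as `ansible.builtin.lineinfile` does, but it is not Ansible code; the function name `ensure_line` and the example file content are hypothetical.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file; return True only if the file changed."""
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # already in the desired state: do nothing
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config = Path(tmp) / "example.conf"
        print(ensure_line(config, "PermitRootLogin no"))  # first run changes the file
        print(ensure_line(config, "PermitRootLogin no"))  # second run is a no-op
```

A shell script that simply appends the line with `>>` would add a duplicate on every run. Checking the current state first is what lets a deployment be re-run safely.
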
## 2. Build System: Meson

### Overview

Meson is an open-source build system that is designed to be both fast and
user-friendly. It uses a simple, non-Turing-complete DSL to define builds.

### Potential Fit

- **Strengths**:

  - **Fast**: Designed for speed, both in configuration and build execution.
  - **Cross-Platform**: Excellent support for building on different operating systems.
  - **User-Friendly**: The syntax is generally considered easier to learn than
    Makefiles or CMake.

- **Weaknesses**:

  - **Less Common**: Not as widespread as Make or CMake, so there's a smaller
    community.

- **Use Case for Torrust**:
  - While the current project is more about deployment than building from source, if
    the new design involves compiling components (like the tracker itself or other
    tools), Meson could be a modern alternative to the current `Makefile`-based
    system. It might be overkill if we are only orchestrating Docker containers.

## 3. Templating Libraries

The current system uses `envsubst` for templating. While effective, more powerful
templating engines could provide more flexibility.

### Potential Options

- **Jinja2 (via Python)**:

  - **Strengths**: Very powerful, with loops, conditionals, filters, and macros.
    Widely used in tools like Ansible.
  - **Weaknesses**: Requires a Python environment to run.

- **Go Templates**:

  - **Strengths**: Built into Go, so it's fast and has no external dependencies if we
    use Go for our tooling.
  - **Weaknesses**: Syntax can be more verbose than Jinja2.

- **Tera (Rust)**:

  - **Strengths**: A powerful templating engine for Rust, inspired by Jinja2. If we
    build our deployment tools in Rust, this is a natural fit.
  - **Weaknesses**: Requires a Rust environment.

- **Use Case for Torrust**:
  - A better templating engine could simplify the generation of complex
    configuration files like `nginx.conf` or `prometheus.yml`, especially if we
    need to support multiple providers with different configurations.

## 4. Secrets Management

Currently, secrets are managed via environment variables in git-ignored files. This
is a good baseline, but more robust solutions exist.

### Potential Options

- **HashiCorp Vault**:

  - **Strengths**: A dedicated secrets management tool. Provides dynamic secrets,
    leasing, and auditing. Widely regarded as an industry standard for secrets
    management.
  - **Weaknesses**: Adds another service to manage and maintain. Can be complex to set
    up.

- **SOPS (Secrets OPerationS)**:

  - **Strengths**: Encrypts values in YAML/JSON files. The encrypted file can be
    committed to git, and decrypted at deployment time using KMS, GPG, etc.
  - **Weaknesses**: Requires setting up GPG keys or cloud KMS.

- **Ansible Vault**:

  - **Strengths**: Integrated with Ansible. Allows encrypting variables or entire
    files within an Ansible project.
  - **Weaknesses**: Tied to using Ansible.

- **Use Case for Torrust**:
  - For the goal of a simple, automated deployment for a single server, a
    full-blown Vault instance is likely overkill.
  - **SOPS** could be a very good fit. It would allow us to have a single,
    encrypted `secrets.yaml` file per environment that can be safely stored in git,
    simplifying configuration management.

## 5. Infrastructure as Code (IaC)

The current system uses a combination of shell scripts and manual steps to provision
infrastructure. Adopting a proper IaC tool would be a significant improvement.

### Potential Options

- **Terraform**:

  - **Strengths**: The de facto industry standard for IaC. Supports a vast number of
    providers. Large community and extensive documentation.
  - **Weaknesses**: Can be complex. HashiCorp's August 2023 license change from
    MPL 2.0 to the Business Source License (BSL) is a concern for some.

- **OpenTofu**:

  - **Strengths**: A fork of Terraform, created in response to the license change.
    It is open-source and community-driven. It is designed as a drop-in replacement
    for Terraform.
  - **Weaknesses**: Younger than Terraform, so the community is smaller.

- **Pulumi**:

  - **Strengths**: Allows defining infrastructure using general-purpose programming
    languages like Python, Go, TypeScript, etc. This can be a significant
    advantage for teams that are more comfortable with these languages than with
    HCL.
  - **Weaknesses**: Smaller community than Terraform.

- **Use Case for Torrust**:
  - The goal is to automate the provisioning of the server, DNS records, and other
    infrastructure components. Both Terraform and OpenTofu are excellent choices for
    this.
  - Given the project's open-source nature, **OpenTofu** might be a better fit to
    avoid any future licensing issues.
  - Pulumi is also a strong contender, especially if the team prefers to use a
    general-purpose programming language.

## 6. Summary of Recommendations

Based on the evaluation, here is a summary of the recommended tools for the new
Torrust Tracker deployment system:

- **Configuration Management**: **Ansible** is the recommended choice. Its
  agentless nature and idempotency are well-suited for this project. It can
  replace the existing shell scripts and provide a more structured way to manage
  the application configuration.

- **Build System**: **Meson** is a good option if the project requires compiling
  components. However, if the project is only orchestrating Docker containers, it
  might be overkill.

- **Templating**: **Tera** is the recommended choice if the deployment tools are
  built in Rust. Otherwise, **Jinja2** is a solid alternative.

- **Secrets Management**: **SOPS** is the recommended choice. It allows encrypting
  secrets in a file that can be committed to git, which simplifies configuration
  management.

- **Infrastructure as Code**: **OpenTofu** is the recommended choice. It is a
  drop-in replacement for Terraform and is open-source and community-driven.

project-words.txt

Lines changed: 2 additions & 0 deletions
@@ -104,6 +104,7 @@ poweroff
 prereq
 privkey
 publickey
+Pulumi
 pwauth
 Pydantic
 pytest
@@ -126,6 +127,7 @@ showcerts
 somaxconn
 sshpass
 Taplo
+Tera
 Terratest
 testpass
 testuser
