The monitoring_base Ansible role automates the deployment and configuration of a complete monitoring stack using Podman and systemd.
This role installs and configures:
- Prometheus (metrics collection)
- Thanos (long-term storage & query layer)
- Grafana (visualization)
- Alertmanager (alert routing)
- Node Exporter (host metrics)
It supports TLS, S3-backed long-term storage, automatic Grafana provisioning, and full systemd service management.
Role directory layout:

```
roles/
└── monitoring_base/
    ├── tasks/
    │   └── main.yaml
    ├── defaults/
    │   └── main.yaml
    ├── handlers/
    │   └── main.yaml
    └── templates/
        ├── prometheus.service.j2
        ├── alertmanager.service.j2
        ├── grafana.service.j2
        ├── thanos.service.j2
        ├── thanos-store.service.j2
        ├── thanos-query.service.j2
        ├── thanos-compactor.service.j2
        ├── mon-pod.service.j2
        ├── node-exporter.service.j2
        ├── prometheus.yml.j2
        ├── alertmanager.yml.j2
        ├── grafana.ini.j2
        └── objstore.yml.j2
```
```yaml
- name: Gather the IP address of the monitoring interface
  ansible.builtin.set_fact:
    monitoring_node_ip: "{{ ansible_facts[monitoring_node_iface].ipv4.address }}"
```

Purpose:
- Retrieves the IPv4 address of the interface defined by `monitoring_node_iface`.
- Stores it in `monitoring_node_ip` for use in configuration templates.
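This fact is typically consumed by the configuration templates. A hypothetical excerpt from `prometheus.yml.j2` (the shipped template is not reproduced in this README, so the job name, target, and port are assumptions) might use it like this:

```yaml
# Hypothetical excerpt from prometheus.yml.j2 -- the actual template may differ.
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: node
    static_configs:
      - targets:
          # monitoring_node_ip was set by the set_fact task above;
          # 9100 is the conventional node-exporter port.
          - "{{ monitoring_node_ip }}:9100"
```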
```yaml
- name: Create podman pod for prometheus and thanos
  containers.podman.podman_pod:
    name: "{{ monitoring_pod }}"
    state: created
```

Purpose:
- Creates a Podman pod (`mon`) for the Prometheus and Thanos components.
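Containers that join this pod share its network namespace, so Prometheus and the Thanos components can reach each other over localhost. A minimal sketch of how a container task in the same collection could join the pod (the container name and image are illustrative, not taken from the role):

```yaml
# Illustrative only -- the role's actual container tasks may differ.
- name: Run prometheus container inside the monitoring pod
  containers.podman.podman_container:
    name: prometheus
    pod: "{{ monitoring_pod }}"   # joins the pod created above
    image: "quay.io/prometheus/prometheus:{{ prometheus_release }}"
    state: started
```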
Each service is deployed using Jinja2 templates and triggers handlers on change:
Example:
```yaml
- name: Deploy prometheus systemd unit file
  ansible.builtin.template:
    src: prometheus.service.j2
    dest: "{{ prometheus_systemd_unit_file }}"
    mode: "0644"
  notify:
    - Reload systemd
    - Enable monitoring services
    - Restart prometheus service
```

Purpose:
- Installs the systemd service unit.
- Reloads systemd when the unit file changes.
- Ensures services are enabled and restarted when necessary.
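The unit templates themselves are not reproduced in this README. A hedged sketch of what a Podman-backed unit such as `prometheus.service.j2` might render to (the binary path, volumes, and flags are assumptions, not the role's actual template):

```ini
# Hypothetical sketch of prometheus.service.j2 -- the shipped template may differ.
[Unit]
Description=Prometheus (Podman)
Wants=network-online.target
After=network-online.target

[Service]
ExecStart=/usr/bin/podman run --rm --name prometheus \
    --pod {{ monitoring_pod }} \
    -v {{ prometheus_config_dir }}:/etc/prometheus:Z \
    -v {{ prometheus_dir }}:/prometheus:Z \
    quay.io/prometheus/prometheus:{{ prometheus_release }}
ExecStop=/usr/bin/podman stop prometheus
Restart=on-failure

[Install]
WantedBy=multi-user.target
```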
The role renders the following configuration templates, using the role defaults and any overridden variables:
- Prometheus configuration
- Alertmanager configuration
- Grafana configuration
- Thanos object storage configuration
- Grafana provisioning (datasources and dashboards)
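For example, the Thanos object storage template presumably renders an S3 client configuration from the vaulted credentials. A hedged sketch of `objstore.yml.j2`, following Thanos's documented S3 config shape (the endpoint is an assumption derived from the default bucket's region):

```yaml
# Hypothetical sketch of objstore.yml.j2 -- the shipped template may differ.
type: S3
config:
  bucket: "{{ thanos_s3_bucket }}"
  endpoint: "s3.eu-central-1.amazonaws.com"
  access_key: "{{ thanos_aws_access_key }}"
  secret_key: "{{ thanos_aws_access_secret }}"
```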
The role manages the following systemd services:
- mon-pod
- prometheus
- alertmanager
- grafana
- thanos
- thanos-store
- thanos-query
- thanos-compactor
- node-exporter

All services:
- Are enabled on boot
- Restart automatically when configuration changes
- Run under dedicated system users
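Assuming the unit names match the service list above, a quick post-deployment check could look like:

```
# Verify units are running and enabled to start on boot
systemctl status prometheus grafana alertmanager node-exporter
systemctl is-enabled prometheus

# Inspect recent logs for a single service
journalctl -u prometheus --since "10 minutes ago"
```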
Prometheus:

| Variable | Default |
|---|---|
| `prometheus_release` | `v3.2.0` |
| `prometheus_dir` | `/var/prometheus` |
| `prometheus_config_dir` | `/etc/prometheus` |
| `prometheus_enable_tls` | `true` |
| `prometheus_storage_block_duration` | `2h` |

Grafana:

| Variable | Default |
|---|---|
| `grafana_release` | `11.5.2` |
| `grafana_dir` | `/var/lib/grafana` |
| `grafana_config_dir` | `/etc/grafana` |
| `grafana_admin_initial_password` | `{{ vault_grafana_admin_initial_password }}` |

Alertmanager:

| Variable | Default |
|---|---|
| `alertmanager_release` | `v0.28.0` |
| `alertmanager_dir` | `/var/lib/alertmanager` |
| `monitoring_alertmanager_enabled` | `true` |

Thanos:

| Variable | Default |
|---|---|
| `thanos_release` | `v0.35.1` |
| `thanos_s3_bucket` | `rpi-thanos-metrics-s3-eu-central-1` |
| `thanos_aws_access_key` | `{{ vault_thanos_aws_access_key }}` |
| `thanos_aws_access_secret` | `{{ vault_thanos_aws_access_secret }}` |

Node Exporter:

| Variable | Default |
|---|---|
| `node_exporter_release` | `v1.9.0` |
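Any of these defaults can be overridden from the inventory; for example, in a group variables file (the values below are illustrative):

```yaml
# group_vars/monitoring/main.yml -- illustrative overrides
prometheus_enable_tls: false
prometheus_storage_block_duration: 2h
grafana_dir: /srv/grafana
thanos_s3_bucket: my-thanos-metrics-bucket
```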
The role defines the following handlers:
- Reload systemd
- Restart prometheus service
- Restart alertmanager service
- Restart grafana service
- Restart thanos service
- Restart thanos store service
- Restart thanos query service
- Restart thanos compactor service
- Restart node-exporter service
- Enable monitoring services

Handlers ensure services are properly restarted and enabled after configuration changes.
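A handler simply pairs one of the names notified by the tasks with an action. A minimal sketch of how two of these could be defined in `handlers/main.yaml` (the actual file may differ):

```yaml
# Hypothetical excerpt from handlers/main.yaml -- the real handlers may differ.
- name: Reload systemd
  ansible.builtin.systemd:
    daemon_reload: true

- name: Restart prometheus service
  ansible.builtin.systemd:
    name: prometheus
    state: restarted
```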
Using requirements.yml:

```yaml
roles:
  - name: monitoring_base
    src: https://github.com/zbitmanis/role_monitoring_base.git
    scm: git
```

Install:

```
ansible-galaxy role install -r requirements.yml
```

Create a vault file for the secrets:

```
ansible-vault create group_vars/monitoring/vault.yml
```

Example:

```yaml
vault_grafana_admin_initial_password: "StrongPassword"
vault_thanos_aws_access_key: "ACCESS_KEY"
vault_thanos_aws_access_secret: "SECRET_KEY"
```

Example playbook:

```yaml
- name: Deploy monitoring stack
  hosts: monitoring
  become: true
  roles:
    - monitoring_base
```

Run:

```
ansible-playbook -i inventory site.yml --ask-vault-pass
```

After deployment:
- Prometheus → https://<host>:9090
- Grafana → http://<host>:3000
- Thanos Query → http://<host>:19090
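Assuming the default ports above and that Prometheus is serving TLS (`prometheus_enable_tls: true`), a quick reachability check might look like the following; `-k` skips certificate verification, which is only appropriate for self-signed test certificates:

```
curl -k https://<host>:9090/-/ready   # Prometheus readiness endpoint
curl http://<host>:3000/api/health    # Grafana health endpoint
curl http://<host>:19090/-/ready      # Thanos Query readiness endpoint
```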
MIT