diff --git a/docs/how-to/deployment.md b/docs/how-to/deployment.md
index 2d21361..4ddb4e5 100644
--- a/docs/how-to/deployment.md
+++ b/docs/how-to/deployment.md
@@ -74,18 +74,18 @@ When you have certificates from an external CA (Let's Encrypt, corporate PKI, et
 kibana_tls: true
 kibana_cert_source: external
-kibana_tls_cert: /etc/pki/kibana/kibana.crt
-kibana_tls_key: /etc/pki/kibana/kibana.key
-kibana_tls_ca: /etc/pki/kibana/ca-chain.crt
+kibana_tls_certificate_file: /etc/pki/kibana/kibana.crt
+kibana_tls_key_file: /etc/pki/kibana/kibana.key
+kibana_tls_ca_file: /etc/pki/kibana/ca-chain.crt
 
-# Optional: key passphrase if the private key is encrypted
-# kibana_tls_key_passphrase: "{{ vault_kibana_key_pass }}"
+# Optional: passphrase for an encrypted private key or P12 file
+# kibana_tls_certificate_passphrase: "{{ vault_kibana_key_pass }}"
 ```
 
 The files must already exist on the Kibana host before running the playbook. The role configures Kibana to use them but does not manage the certificate lifecycle — renewal is your responsibility.
 
 !!! tip
-    If your external CA is not the same as the Elasticsearch CA, you also need to configure Elasticsearch to trust it. Add the CA certificate to `elasticsearch_tls_cacerts` on all ES nodes.
+    If your external CA is not the same as the Elasticsearch CA, you also need to configure Elasticsearch to trust it. Set `elasticsearch_tls_ca_certificate` on all ES nodes.
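+
+To satisfy the "files must already exist" requirement, you can stage the certificates with a pre-task before the role runs. A minimal sketch (the task name, source paths, and the `kibana` group are assumptions; adjust them to your PKI layout):
+
+```yaml
+# Hypothetical pre-task: stage externally issued certificates on the Kibana host.
+# Source paths and ownership are assumptions -- adjust for your environment.
+- name: Copy Kibana TLS material
+  ansible.builtin.copy:
+    src: "files/pki/{{ item }}"
+    dest: "/etc/pki/kibana/{{ item }}"
+    owner: root
+    group: kibana
+    mode: "0640"
+  loop:
+    - kibana.crt
+    - kibana.key
+    - ca-chain.crt
+```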
 
 ## Elasticsearch with external certificates
@@ -95,14 +95,15 @@ For environments where certificates come from an external PKI:
 elasticsearch_cert_source: external
 
 # HTTP (client-facing) certificates
-elasticsearch_http_tls_cert: /etc/pki/elasticsearch/http.crt
+elasticsearch_http_tls_certificate: /etc/pki/elasticsearch/http.crt
 elasticsearch_http_tls_key: /etc/pki/elasticsearch/http.key
-elasticsearch_http_tls_ca: /etc/pki/elasticsearch/ca-chain.crt
 
 # Transport (inter-node) certificates
-elasticsearch_transport_tls_cert: /etc/pki/elasticsearch/transport.crt
+elasticsearch_transport_tls_certificate: /etc/pki/elasticsearch/transport.crt
 elasticsearch_transport_tls_key: /etc/pki/elasticsearch/transport.key
-elasticsearch_transport_tls_ca: /etc/pki/elasticsearch/ca-chain.crt
+
+# Shared CA for HTTP and transport
+elasticsearch_tls_ca_certificate: /etc/pki/elasticsearch/ca-chain.crt
 ```
 
 Each node needs its own certificate with the node's hostname or IP in the Subject Alternative Names (SAN). The transport certificate must include all node hostnames since nodes verify each other's identity during cluster formation.
diff --git a/docs/introduction/index.md b/docs/introduction/index.md
index e65d4af..bc32501 100644
--- a/docs/introduction/index.md
+++ b/docs/introduction/index.md
@@ -19,9 +19,9 @@ The collection provides six roles that cover each layer of the stack.
 They work
 
 | Category | Versions |
 |----------|----------|
-| Debian | 11 (Bullseye), 12 (Bookworm), 13 (Trixie) |
-| Ubuntu | 22.04 (Jammy), 24.04 (Noble) |
-| Rocky Linux / RHEL | 8, 9, 10 |
+| Debian | 12 (Bookworm), 13 (Trixie) |
+| Ubuntu | 22.04 (Jammy), 24.04 (Noble), 26.04 (Resolute) |
+| Rocky Linux / RHEL | 9, 10 |
 | Elastic Stack | 8.x, 9.x |
 | Ansible | 2.18+ |
diff --git a/docs/reference/elasticsearch.md b/docs/reference/elasticsearch.md
index 482c042..070020a 100644
--- a/docs/reference/elasticsearch.md
+++ b/docs/reference/elasticsearch.md
@@ -415,6 +415,30 @@ elasticsearch_extra_config:
 Keys that conflict with settings managed by dedicated role variables (like `cluster.name`, `network.host`, security/TLS settings, `bootstrap.memory_lock`) are silently filtered out, and the role emits a warning telling you to use the dedicated variable instead.
 
+### Config-triggered restarts
+
+When a run changes `elasticsearch.yml` or the JVM options, the Restart Elasticsearch handler fires. On multi-node clusters the role restarts nodes one at a time and waits for cluster health to recover between nodes; on single-node clusters it restarts in place.
+
+```yaml
+elasticsearch_config_restart_strategy: rolling
+elasticsearch_config_restart_flush: true
+elasticsearch_config_restart_wait_status: green
+elasticsearch_config_restart_health_retries: 50
+elasticsearch_config_restart_health_delay: 30
+elasticsearch_config_restart_node_retries: 200
+elasticsearch_config_restart_node_delay: 3
+```
+
+`elasticsearch_config_restart_strategy` picks between `rolling` (the default: restart one node at a time, gated on cluster health) and `direct` (the legacy all-at-once restart from a normal handler). Single-node clusters always take the direct path regardless of this setting.
+
+`elasticsearch_config_restart_flush` runs a synced flush before each node restart during a rolling restart. Set it to `false` only if you have a specific reason to skip it.
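+
+As an illustration, a single-purpose lab host that doesn't need rolling orchestration could opt back into the legacy behaviour (shown as a plain variable override; not a recommendation for production clusters):
+
+```yaml
+# Illustrative override: fall back to the legacy all-at-once
+# restart handler instead of the rolling, health-gated path.
+elasticsearch_config_restart_strategy: direct
+```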
+
+`elasticsearch_config_restart_wait_status` is the minimum cluster health colour the role waits for before and after each node restart. `green` is strictly safer; set it to `yellow` if you have unassigned replicas that are expected and you don't want the restart to block on them.
+
+`elasticsearch_config_restart_health_retries` and `elasticsearch_config_restart_health_delay` control how long the role waits for the cluster to regain the chosen health status between nodes. Defaults give ~25 minutes per node (50 × 30s), which is generous for large clusters with lots of shard recovery.
+
+`elasticsearch_config_restart_node_retries` and `elasticsearch_config_restart_node_delay` control how long the role waits for the node it just restarted to rejoin the cluster. Defaults give ~10 minutes per node (200 × 3s).
+
 ### Rolling Upgrades
 
 The role validates the upgrade path before any work begins. When `elasticstack_release` is 9 or higher and Elasticsearch is currently installed, the role checks that the installed version is at least 8.19.0. If it finds an older 8.x version, the play fails immediately -- you must step through 8.19.x first. This matches [Elastic's official upgrade requirements](https://www.elastic.co/docs/deploy-manage/upgrade/deployment-or-cluster).
@@ -455,6 +479,10 @@ The default heap formula is `min(max(memtotal_mb / 1024 / 2, 1), 30)` -- half of
 The role sets `nofile=65535` for the `elasticsearch` user via PAM (`/etc/security/limits.d/`). This is required for production but was historically unreliable in the RPM post-install scripts. Controlled by `elasticsearch_pamlimits` (default `true`).
 
+### OS-level tuning
+
+`elasticsearch_os_tuning` (default `true`) applies the sysctl and kernel settings Elasticsearch expects in production: it raises `vm.max_map_count` for the mmapfs directory (required for large shard counts), drops `vm.swappiness` to 1, tightens TCP retry counts for faster fault detection, and disables Transparent Huge Pages at runtime.
+
+The tuning is skipped automatically in container environments (`virtualization_type` in `docker`, `container`, `containerd`, `lxc`, `podman`), where these sysctls typically can't be set and should be inherited from the host. Set it to `false` if your host is managed by a separate tuning policy and you don't want the role writing to `/etc/sysctl.d/`.
+
 ### JNA tmpdir workaround
 
 On systems where `/tmp` is mounted with `noexec`, Java Native Access fails to load native libraries. Set `elasticsearch_jna_workaround: true` to redirect JNA's temp directory to `{{ elasticsearch_datapath }}/tmp` via the sysconfig file (`/etc/default/elasticsearch` on Debian, `/etc/sysconfig/elasticsearch` on RedHat).
@@ -494,14 +522,15 @@ In container environments (`virtualization_type` in `container`, `docker`, `lxc`
 
 ### Handler guards
 
-The "Restart Elasticsearch" handler has four guards that prevent it from firing when a restart would be redundant or harmful:
+Notifications of `Restart Elasticsearch` are dispatched to one of two paths, depending on `elasticsearch_config_restart_strategy` and cluster size: a direct restart in place (single-node clusters, or `direct` set explicitly), or a rolling restart that orchestrates node-by-node restarts across the cluster via `run_once` (multi-node clusters with `rolling`, the default). Every handler in that dispatch chain applies the same five guard conditions to prevent a restart that would be redundant or harmful:
 
-1. `elasticsearch_enable` must be true
-2. NOT during a fresh install (service already started naturally)
-3. NOT during security initialization (service already started)
-4. NOT after a rolling upgrade (upgrade did its own restart)
+1. NOT in check mode (`ansible_check_mode` is false)
+2. `elasticsearch_enable` must be true
+3. NOT during a fresh install (service already started naturally)
+4. NOT during security initialization (service already started)
+5. NOT after a rolling upgrade (upgrade did its own restart)
 
-The handler also triggers a Kibana restart on all Kibana hosts (if `elasticstack_full_stack` is enabled) since Kibana may need to reconnect after an ES restart. This Kibana restart is skipped during CA renewal.
+A separate handler on the same notification triggers a Kibana restart on all Kibana hosts (if `elasticstack_full_stack` is enabled), since Kibana may need to reconnect after an ES restart. The Kibana restart is skipped when the `renew_ca` tag is active or when `elasticstack_ca_will_expire_soon` is true, since those paths have their own coordinated Kibana restart.
 
 ### Double config write