Commit c711a42

Merge maser into feature/config-ntp-timezone-maxcstate (#6742)
2 parents 1383d1a + 059f0b2 commit c711a42

File tree

83 files changed

+3876
-888
lines changed

doc/content/design/numa.md

Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
1+
---
2+
title: NUMA
3+
layout: default
4+
design_doc: true
5+
revision: 1
6+
status: proposed
7+
---
8+
9+
# NUMA
10+
11+
NUMA stands for Non-Uniform Memory Access and describes that RAM access
12+
for CPUs in a large system is not equally fast for all of them. CPUs
13+
are grouped into so-called nodes and each node has fast access to RAM
14+
that is considered local to its node and slower access to other RAM.
15+
Conceptually, a node is a container that bundles some CPUs and RAM and
16+
there is an associated cost when accessing RAM in a different node. In
17+
the context of CPU virtualisation, assigning vCPUs to NUMA nodes is an
18+
optimisation strategy to reduce memory latency. This document describes
19+
a design to make NUMA-related assignments for Xen domains (hence, VMs)
20+
visible to the user. Below we refer to these assignments and
21+
optimisations collectively as NUMA for simplicity.
22+
23+
NUMA is more generally discussed as
24+
[NUMA Feature](../toolstack/features/NUMA/index.md).
25+
26+
27+
## NUMA Properties
28+
29+
Xen 4.20 implements NUMA optimisation. We want to expose the following
30+
NUMA-related properties of VMs to API clients, and in particular
31+
XenCenter. Each one is represented by a new field in XAPI's `VM_metrics`
32+
data model:
33+
34+
* RO `VM_metrics.numa_optimised`: boolean: if the VM is
35+
optimised for NUMA
36+
* RO `VM_metrics.numa_nodes`: integer: number of NUMA nodes of the host
37+
the VM is using
38+
* MRO `VM_metrics.numa_node_memory`: int -> int map; mapping a NUMA node
39+
(int) to an amount of memory (bytes) in that node.
40+
41+
Required NUMA support is only available in Xen 4.20. Some parts of the
42+
code will have to be managed by patches.
43+
44+
## XAPI High-Level Implementation
45+
46+
As far as Xapi clients are concerned, we implement new fields in the
47+
`VM_metrics` class of the data model and surface the values in the CLI
48+
via `records.ml`; we could decide to make `numa_optimised` visible by
49+
default in `xe vm-list`.
50+
51+
Introducing new fields requires defaults; these would be:
52+
53+
* `numa_optimised`: false
54+
* `numa_nodes`: 0
55+
* `numa_node_memory`: []
56+
57+
The data model ensures that the values are visible to API clients.
58+
59+
## XAPI Low-Level Implementation
60+
61+
NUMA properties are observed by Xenopsd and Xapi learns about them as
62+
part of the `Client.VM.stat` call implemented by Xenopsd. Xapi makes
63+
these calls frequently and we will update the Xapi VM fields related to
64+
NUMA simply as part of processing the result of such a call in Xapi.
65+
66+
For this to work, we extend the return type of `VM.stat` in
67+
68+
* `xenops_types.ml`, type `Vm.state`
69+
70+
with three fields:
71+
72+
* `numa_optimised: bool`
73+
* `numa_nodes: int`
74+
* `numa_node_memory: (int, int64) list`
75+
76+
matching the semantics from above.
77+
78+
## Xenopsd Implementation
79+
80+
Xenopsd implements the `VM.stat` return value in
81+
82+
* `Xenops_server_xen.get_state`
83+
84+
where the three fields would be set. Xenopsd relies on bindings to Xen to
85+
observe NUMA-related properties of a domain.
86+
87+
Given that NUMA related functionality is only available for Xen 4.20, we
88+
probably will have to maintain a patch in xapi.spec for compatibility
89+
with earlier Xen versions.
90+
91+
The (existing) C bindings and changes come in two forms: new functions
92+
and an extension of a type used by an existing function.
93+
94+
```ocaml
external domain_get_numa_info_node_pages_size : handle -> int -> int
  = "stub_xc_domain_get_numa_info_node_pages_size"
```
98+
99+
This function reports the number of NUMA nodes used by a Xen domain
100+
(supplied as an argument).
101+
102+
```ocaml
type domain_numainfo_node_pages = {
  tot_pages_per_node : int64 array;
}
external domain_get_numa_info_node_pages :
  handle -> int -> int -> domain_numainfo_node_pages
  = "stub_xc_domain_get_numa_info_node_pages"
```
110+
111+
This function receives as arguments a domain ID and the number of nodes
112+
this domain is using (acquired using `domain_get_numa_info_node_pages_size`).
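
The per-node page counts returned by this binding still have to be turned into the `numa_node_memory` map (node to bytes). A minimal Python sketch of that conversion follows; it is an illustration rather than the planned OCaml code. The 4 KiB page size, the omission of nodes without pages, and the helper name `node_memory_bytes` are assumptions that the design does not state.

```python
from typing import Dict, Sequence

# Assumption: 4 KiB pages (typical on x86); the design does not state the page size.
PAGE_SIZE = 4096


def node_memory_bytes(
    tot_pages_per_node: Sequence[int], page_size: int = PAGE_SIZE
) -> Dict[int, int]:
    """Map a NUMA node index to the bytes a domain holds on that node.

    Nodes on which the domain holds no pages are left out of the map
    (an assumption made for this sketch).
    """
    return {
        node: pages * page_size
        for node, pages in enumerate(tot_pages_per_node)
        if pages > 0
    }


if __name__ == "__main__":
    # A domain with 256 pages on node 0 and 512 pages on node 2.
    print(node_memory_bytes([256, 0, 512]))
```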
113+
114+
The number of NUMA nodes of the host (not domain) is reported by
115+
`Xenctrl.physinfo` which returns a value of type `physinfo`.
116+
117+
```diff
index b4579862ff..491bd3fc73 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -155,6 +155,7 @@ type physinfo =
   capabilities : physinfo_cap_flag list;
   max_nr_cpus : int;
   arch_capabilities : arch_physinfo_cap_flags;
+  nr_nodes : int;
 }
```
128+
129+
We are not reporting `nr_nodes` directly but use it to determine the
130+
value of `numa_optimised` for a domain/VM:

    numa_optimised =
        (VM.numa_nodes = 1)
        or (VM.numa_nodes < physinfo.Xenctrl.nr_nodes)
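
The rule above can be transcribed directly. This Python sketch only illustrates the predicate; the function and parameter names are hypothetical, and the real implementation would live in Xenopsd's OCaml code.

```python
def numa_optimised(vm_numa_nodes: int, host_nr_nodes: int) -> bool:
    """Literal transcription of the rule in the design.

    vm_numa_nodes: number of host NUMA nodes the VM is using.
    host_nr_nodes: number of NUMA nodes of the host (physinfo nr_nodes).
    """
    return vm_numa_nodes == 1 or vm_numa_nodes < host_nr_nodes


if __name__ == "__main__":
    for vm_nodes, host_nodes in [(1, 1), (2, 4), (2, 2)]:
        print(vm_nodes, host_nodes, numa_optimised(vm_nodes, host_nodes))
```

Under this rule, a VM spread over every node of a multi-node host is not considered optimised, while a VM confined to a single node always is.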
135+
136+
### Details
137+
138+
The three new fields that become part of type `Vm.state` are updated as
139+
part of `get_state()` using the primitives above.
140+
141+
142+

doc/content/xapi/alarms/index.md

Lines changed: 218 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,218 @@
1+
+++
2+
title = "How to set up alarms"
3+
linkTitle = "Alarms"
4+
+++
5+
6+
# Introduction
7+
8+
In XAPI, alarms are triggered by a Python daemon located at `/opt/xensource/bin/perfmon`.
9+
The daemon is managed as a systemd service and can be configured by setting parameters in `/etc/sysconfig/perfmon`.
10+
11+
It listens on an internal Unix socket to receive commands. Otherwise, it runs in a loop, periodically requesting metrics from XAPI. It can then be configured to generate events based on these metrics. It can monitor various types of XAPI objects, including `VMs`, `SRs`, and `Hosts`. The configuration for each object is defined by writing an XML string into the object's `other-config` key.
12+
13+
The metrics used by `perfmon` are collected by the `xcp-rrdd` daemon. The `xcp-rrdd` daemon is a component of XAPI responsible for collecting metrics and storing them as Round-Robin Databases (RRDs).
14+
15+
A XAPI plugin also exists, providing the functions `refresh` and `debug_mem`, which send commands through the Unix socket. The `refresh` function is used when an `other-config` key is added or updated; it triggers the daemon to reread the monitored objects so that new alerts are taken into account. The `debug_mem` function logs the objects currently being monitored into `/var/log/user.log` as a dictionary.
16+
17+
# Monitoring and alarms
18+
19+
## Overview
20+
21+
- To get the metrics, `perfmon` queries XAPI by requesting: `http://localhost/rrd_updates?session_id=<ref>&start=1759912021&host=true&sr_uuid=all&cf=AVERAGE&interval=60`
22+
- Different consolidation functions can be used, such as **AVERAGE**, **MIN**, **MAX** or **LAST**. See the following sections for details on specific objects and how to set the function.
23+
- Once the metrics are retrieved, `perfmon` checks all its triggers and generates alarms if needed.
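
As an illustration of the request shown above, here is a minimal Python sketch that assembles the same query string. The helper name `build_rrd_updates_url` and its defaults are hypothetical; only the parameter names and values come from the example URL, and this is not taken from `perfmon` itself.

```python
from urllib.parse import urlencode


def build_rrd_updates_url(session_ref, start, cf="AVERAGE", interval=60,
                          base="http://localhost/rrd_updates"):
    """Build an rrd_updates URL with the parameters shown in the example.

    session_ref: session reference (the <ref> placeholder in the example)
    start:       epoch timestamp from which updates are requested
    cf:          consolidation function (AVERAGE, MIN, MAX or LAST)
    interval:    interval in seconds, as in the example URL
    """
    params = [
        ("session_id", session_ref),
        ("start", start),
        ("host", "true"),
        ("sr_uuid", "all"),
        ("cf", cf),
        ("interval", interval),
    ]
    # urlencode percent-escapes reserved characters in the values.
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_rrd_updates_url("REF", 1759912021))
```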
24+
25+
## Specific XAPI objects
26+
### VMs
27+
28+
- To set an alarm on a VM, you need to write an XML string into the `other-config` key of the object. For example, to trigger an alarm when the CPU usage is higher than 50%, run:
29+
```sh
xe vm-param-set uuid=<UUID> other-config:perfmon='<config> <variable> <name value="cpu_usage"/> <alarm_trigger_level value="0.5"/> </variable> </config>'
```
32+
33+
- Then, you can either wait until the new configuration is read by the `perfmon` daemon or force a refresh by running:
34+
```sh
xe host-call-plugin host-uuid=<UUID> plugin=perfmon fn=refresh
```
37+
38+
- Now, if you generate some load inside the VM and the CPU usage goes above 50%, the `perfmon` daemon will create a message (a XAPI object) with the name **ALARM**. This message will include a _priority_, a _timestamp_, an _obj-uuid_ and a _body_. To list all messages that are alarms, run:
39+
```sh
xe message-list name=ALARM
```
42+
43+
- You will see, for example:
44+
```sh
uuid ( RO) : dadd7cbc-cb4e-5a56-eb0b-0bb31c102c94
name ( RO): ALARM
priority ( RO): 3
class ( RO): VM
obj-uuid ( RO): ea9efde2-d0f2-34bb-74cb-78c303f65d89
timestamp ( RO): 20251007T11:30:26Z
body ( RO): value: 0.986414
config:
<variable>

<name value="cpu_usage"/>

<alarm_trigger_level value="0.5"/>

</variable>
```
61+
- where the _body_ contains all the relevant information: the value that triggered the alarm and the configuration of your alarm.
62+
63+
- When configuring your alarm, your XML string can:
64+
- have multiple `<variable>` nodes
65+
- use the following values for child nodes:
66+
* **name**: what to call the variable (no default)
67+
* **alarm_priority**: the priority of the messages generated (default '3')
68+
* **alarm_trigger_level**: level of value that triggers an alarm (no default)
69+
* **alarm_trigger_sense**: 'high' if alarm_trigger_level is a max, otherwise 'low'. (default 'high')
70+
* **alarm_trigger_period**: num seconds of 'bad' values before an alarm is sent (default '60')
71+
* **alarm_auto_inhibit_period**: num seconds this alarm disabled after an alarm is sent (default '3600')
72+
* **consolidation_fn**: how to combine variables from rrd_updates into one value (default is 'average' for 'cpu_usage', 'get_percent_fs_usage' for 'fs_usage', 'get_percent_log_fs_usage' for 'log_fs_usage','get_percent_mem_usage' for 'mem_usage', & 'sum' for everything else)
73+
* **rrd_regex**: matches the names of variables from `xe vm-data-source-list uuid=$vmuuid` used to compute the value (only has defaults for "cpu_usage", "network_usage", and "disk_usage")
74+
75+
- Notice that `alarm_priority` will be the priority of the generated `message`, 0 being low priority.
76+
77+
### SRs
78+
79+
- To set an alarm on an SR object, as with VMs, you need to write an XML string into the `other-config` key of the SR. For example, you can run:
80+
```sh
xe sr-param-set uuid=<UUID> other-config:perfmon='<config><variable><name value="physical_utilisation"/><alarm_trigger_level value="0.8"/></variable></config>'
```
83+
- When configuring your alarm, the XML string supports the same child elements as for VMs.
84+
85+
### Hosts
86+
87+
- As with VMs and SRs, alarms can be configured by writing an XML string into an `other-config` key. For example, you can run:
88+
```sh
xe host-param-set uuid=<UUID> other-config:perfmon=\
'<config><variable><name value="cpu_usage"/><alarm_trigger_level value="0.5"/></variable></config>'
```
92+
93+
- The XML string can include multiple `<variable>` nodes.
94+
- The full list of supported child nodes is:
95+
* **name**: what to call the variable (no default)
96+
* **alarm_priority**: the priority of the messages generated (default '3')
97+
* **alarm_trigger_level**: level of value that triggers an alarm (no default)
98+
* **alarm_trigger_sense**: 'high' if alarm_trigger_level is a max, otherwise 'low'. (default 'high')
99+
* **alarm_trigger_period**: num seconds of 'bad' values before an alarm is sent (default '60')
100+
* **alarm_auto_inhibit_period**: num seconds this alarm disabled after an alarm is sent (default '3600')
101+
* **consolidation_fn**: how to combine variables from rrd_updates into one value (default is 'average' for 'cpu_usage' & 'sum' for everything else)
102+
* **rrd_regex**: matches the names of variables from `xe host-data-source-list uuid=<UUID>` used to compute the value (only has defaults for "cpu_usage", "network_usage", "memory_free_kib" and "sr_io_throughput_total_xxxxxxxx", where the last one ends with the first eight characters of the SR UUID)
103+
104+
- As a special case for SR throughput, it is also possible to configure a Host by writing XML into the `other-config` key of an SR connected to it. For example:
105+
```sh
xe sr-param-set uuid=$sruuid other-config:perfmon=\
'<config><variable><name value="sr_io_throughput_total_per_host"/><alarm_trigger_level value="0.01"/></variable></config>'
```
109+
- This only works for that specific variable name, and `rrd_regex` must not be specified.
110+
- Configuration done directly on the host (variable-name, sr_io_throughput_total_xxxxxxxx) takes priority.
111+
112+
## Which metrics are available?
113+
114+
- Accepted names for metrics are:
115+
- **cpu_usage**: matches RRD metrics with the pattern `cpu[0-9]+`
116+
- **network_usage**: matches RRD metrics with the pattern `vif_[0-9]+_[rt]x`
117+
- **disk_usage**: matches RRD metrics with the pattern `vbd_(xvd|hd)[a-z]+_(read|write)`
118+
- **fs_usage**, **log_fs_usage**, **mem_usage** and **memory_internal_free** do not match anything by default.
119+
- By using `rrd_regex`, you can add your own expressions. To get a list of available metrics with their descriptions, you can call the `get_data_sources` method for [VM](https://xapi-project.github.io/new-docs/xen-api/classes/vm/), for [SR](https://xapi-project.github.io/new-docs/xen-api/classes/sr/) and also for [Host](https://xapi-project.github.io/new-docs/xen-api/classes/host/).
120+
- A Python script that lists data sources is provided at the end of this section. Using the script we can, for example, see:
121+
```sh
# ./get_data_sources.py --vm 5a445deb-0a8e-c6fe-24c8-09a0508bbe21

List of data sources related to VM 5a445deb-0a8e-c6fe-24c8-09a0508bbe21
cpu0 | CPU0 usage
cpu_usage | Domain CPU usage
memory | Memory currently allocated to VM
memory_internal_free | Memory used as reported by the guest agent
memory_target | Target of VM balloon driver
...
vbd_xvda_io_throughput_read | Data read from the VDI, in MiB/s
...
```
134+
- You can then set up an alarm when the data read from a VDI exceeds a certain level by doing:
135+
```sh
xe vm-param-set uuid=5a445deb-0a8e-c6fe-24c8-09a0508bbe21 \
other-config:perfmon='<config><variable> \
<name value="disk_usage"/> \
<alarm_trigger_level value="10"/> \
<rrd_regex value="vbd_xvda_io_throughput_read"/> \
</variable> </config>'
```
143+
- Here is the script that allows you to get data sources:
144+
```python
#!/usr/bin/env python3

import argparse
import sys
import XenAPI


def pretty_print(data_sources):
    if not data_sources:
        print("No data sources.")
        return

    # Compute alignment for something nice
    max_label_len = max(len(data["name_label"]) for data in data_sources)

    for data in data_sources:
        label = data["name_label"]
        desc = data["name_description"]
        print(f"{label:<{max_label_len}} | {desc}")


def list_vm_data(session, uuid):
    vm_ref = session.xenapi.VM.get_by_uuid(uuid)
    data_sources = session.xenapi.VM.get_data_sources(vm_ref)
    print(f"\nList of data sources related to VM {uuid}")
    pretty_print(data_sources)


def list_host_data(session, uuid):
    host_ref = session.xenapi.host.get_by_uuid(uuid)
    data_sources = session.xenapi.host.get_data_sources(host_ref)
    print(f"\nList of data sources related to Host {uuid}")
    pretty_print(data_sources)


def list_sr_data(session, uuid):
    sr_ref = session.xenapi.SR.get_by_uuid(uuid)
    data_sources = session.xenapi.SR.get_data_sources(sr_ref)
    print(f"\nList of data sources related to SR {uuid}")
    pretty_print(data_sources)


def main():
    parser = argparse.ArgumentParser(
        description="List data sources related to VM, host or SR"
    )
    parser.add_argument("--vm", help="VM UUID")
    parser.add_argument("--host", help="Host UUID")
    parser.add_argument("--sr", help="SR UUID")

    args = parser.parse_args()

    # Connect to local XAPI: no identification required to access local socket
    session = XenAPI.xapi_local()

    try:
        session.xenapi.login_with_password("", "")
        if args.vm:
            list_vm_data(session, args.vm)
        if args.host:
            list_host_data(session, args.host)
        if args.sr:
            list_sr_data(session, args.sr)
    except XenAPI.Failure as e:
        print(f"XenAPI call failed: {e.details}")
        sys.exit(1)
    finally:
        session.xenapi.session.logout()


if __name__ == "__main__":
    main()
```
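
To see which data sources a metric name would pick up, the default patterns listed under "Which metrics are available?" can be tried against a list of names. The sketch below is only an approximation: it assumes the whole data source name must match the pattern, which may differ from how `perfmon` applies its regular expressions, and the helper name `matching_sources` is hypothetical.

```python
import re

# Default patterns as listed in this document.
DEFAULT_PATTERNS = {
    "cpu_usage": r"cpu[0-9]+",
    "network_usage": r"vif_[0-9]+_[rt]x",
    "disk_usage": r"vbd_(xvd|hd)[a-z]+_(read|write)",
}


def matching_sources(metric, source_names, rrd_regex=None):
    """Return the data source names selected for a metric.

    An explicit rrd_regex overrides the default pattern. Metrics without
    a default pattern match nothing unless rrd_regex is given.
    Assumption: the whole name must match (re.fullmatch).
    """
    regex = rrd_regex if rrd_regex is not None else DEFAULT_PATTERNS.get(metric)
    if regex is None:
        return []
    pattern = re.compile(regex)
    return [name for name in source_names if pattern.fullmatch(name)]


if __name__ == "__main__":
    names = ["cpu0", "cpu_usage", "memory", "vif_0_rx",
             "vbd_xvda_read", "vbd_xvda_io_throughput_read"]
    print(matching_sources("cpu_usage", names))
    print(matching_sources("disk_usage", names,
                           rrd_regex="vbd_xvda_io_throughput_read"))
```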
218+
