
Admin guide

Oriol edited this page Nov 7, 2025 · 3 revisions

[[TOC]]

EAR Components

EAR is composed of five main components:

  • Node Manager (EARD): A Linux service which provides basic node power monitoring and job accounting. It also offers an API that third parties (e.g., other EAR components) can use to perform privileged operations. It must have root access on the node where it runs (usually all compute nodes).
  • Database Manager (EARDBD): A Linux service (it normally runs on a service node) which caches data to be stored in a database, reducing the number of queries. We currently support MariaDB and PostgreSQL. This component does not need to be enabled if you don't use such database services to report EAR data.
  • Global Manager (EARGM): A Linux service (it normally runs on a service node) which provides cluster-level support (e.g., powercap). It needs access to all nodes in the cluster where a Node Manager is running.
  • EAR Library (EARL): A Job Manager (distributed as a shared object) which provides job/application-level monitoring and optimization.
  • Scheduler plug-in: A SLURM SPANK plug-in and a PBS Pro Hook which provide support for using EAR job accounting and loading EARL transparently for users.

For more detailed information about EAR components, visit the Architecture page.

Quick Installation Guide

This section provides a summary of the steps needed to compile and install EAR. The complete guide is in [another section](Installation from source), but it is recommended to read this section first, since it contains useful information about how and what gets compiled and installed.

First check that your system satisfies all requirements, then check that you have Autoconf version 2.69 or later. You can then bootstrap the build system:

autoreconf -i
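
The Autoconf requirement can be checked from a script before bootstrapping. The following is a minimal sketch, not part of EAR: the helper names version_ge and autoconf_version are hypothetical, and the comparison assumes a sort that supports version ordering (-V), as GNU coreutils does.

```shell
#!/bin/sh
# Succeed when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print the installed Autoconf version; fail if autoconf is not installed.
autoconf_version() {
    command -v autoconf >/dev/null 2>&1 || return 1
    autoconf --version | head -n 1 | awk '{ print $NF }'
}

required="2.69"
found="$(autoconf_version || true)"
if [ -n "$found" ] && version_ge "$found" "$required"; then
    echo "Autoconf $found is recent enough; run: autoreconf -i"
else
    echo "Autoconf $required or later is required (found: ${found:-none})"
fi
```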

As commented in the overview, the EAR Library might be loaded along with MPI applications thanks to the EAR Loader library. The latter detects the application symbols at runtime and loads the right Library. Therefore, you should compile at least two versions of the EAR Library: one for non-MPI applications and one for your MPI implementation.

This is an example to configure EAR to be compiled for both versions:

# Configure EAR to compile the non-MPI version of the EAR Library
./configure --disable-mpi \
	MPICC=mpicc MAKE_NAME=nompi

# Configure EAR to compile the MPI version of the Library
./configure MPICC=mpicc MPICC_FLAGS="-O2 -g" MAKE_NAME=impi

The above example assumes your MPI Library is Intel MPI. If you want to compile EARL for another MPI flavour check out this section.

EAR currently does not support GNU make parallel builds, so the above example must be run in the source code root directory. For the same reason, the configure script supports a variable called MAKE_NAME, which makes it generate a Makefile called Makefile.<MAKE_NAME variable value>. You can therefore call make targeting each generated Makefile.
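
Since every configure run writes its own Makefile.<MAKE_NAME>, a build script can confirm that the expected Makefile exists before calling make. The sketch below is only an illustration: have_makefile is a hypothetical helper, and nompi and impi are the names used in this guide.

```shell
#!/bin/sh
# Succeed only if the Makefile generated for MAKE_NAME=$1 exists in the
# directory given as $2 (default: the current directory).
have_makefile() {
    [ -f "${2:-.}/Makefile.$1" ]
}

# Report on the two configurations used in this guide.
for name in nompi impi; do
    if have_makefile "$name"; then
        echo "Makefile.$name found; build with: make -f Makefile.$name"
    else
        echo "Makefile.$name missing; rerun ./configure with MAKE_NAME=$name"
    fi
done
```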

The flag --disable-mpi is used for configuring the non-MPI version of the EAR Library.

Note that even when configuring for this use case, the MPICC variable is also set. This is because the EAR Loader still needs the MPI headers to check whether the running application is an MPI one, and configure locates them through this variable.

After completing the previous steps, you can compile and install EAR by targeting each of the generated Makefiles. The following example uses the Makefile suffixes from the previous one:

# Compile and install EAR. The EAR Library version installed
# will be for supporting non-MPI applications.
make -f Makefile.nompi
make -f Makefile.nompi install

# sysconfdir installation needs another target
make -f Makefile.nompi etc.install
make -f Makefile.nompi doc.install

# Compile and install just the
# MPI version of the EAR Library
make -f Makefile.impi full
make -f Makefile.impi earl.install

In the above example, some non-standard targets are used. The etc.install target is needed for installing all configuration, module and service files to be used later when configuring EAR. The full target is equivalent to calling first the clean target and then the all target. Finally, earl.install installs only the EAR Library: the second build exists just to install another version of the Library alongside the previous one.

EAR Makefiles include a specific target for each component, supporting full or partial updates:

| Target | Description |
| --- | --- |
| install | Reinstalls all the files except etc and doc. |
| earl.install | Reinstalls only the EARL. |
| eard.install | Reinstalls only the EARD. |
| earplug.install | Reinstalls only the EAR SLURM plugin. |
| eardbd.install | Reinstalls only the EARDBD. |
| eargmd.install | Reinstalls only the EARGMD. |
| reports.install | Reinstalls only report plugins. |

Here is an example of a bash script summarizing the information provided so far. It compiles and installs EAR with two versions of the Library: one supporting Intel MPI applications and another one supporting any non-MPI application:

#!/bin/bash

# This script bootstraps, configures, compiles and installs
# EAR with two versions of the Library: one supporting Intel MPI applications
# and another one supporting any non-MPI application
#
# Requirements:
# - GNU Autoconf
# - GNU make
# - A modern C compiler
# - Intel MPI compiler and Library

EAR_INSTALL_PATH= # Set the root location of your installation
EAR_TMP= # Set the location of temporary directories and files
EAR_ETC= # Set the location of configuration and services files.

my_CFLAGS="-O2 -g"

# Bootstrap the configure script
autoreconf -i

# Configure EAR to compile the non-MPI version of the EAR Library
./configure --disable-mpi --prefix=$EAR_INSTALL_PATH \
    MPICC=mpicc CC=gcc CC_FLAGS="$my_CFLAGS" \
    EAR_TMP=$EAR_TMP EAR_ETC=$EAR_ETC \
    MAKE_NAME=nompi

# Compile and install EAR. The EAR Library version installed
# will be for supporting non-MPI applications.
make -f Makefile.nompi
make -f Makefile.nompi install

# sysconfdir installation needs another target
make -f Makefile.nompi etc.install
make -f Makefile.nompi doc.install

# Configure EAR to compile the MPI version of the Library.
./configure --prefix=$EAR_INSTALL_PATH \
    MPICC=mpicc MPICC_FLAGS="$my_CFLAGS" \
    CC=gcc CC_FLAGS="$my_CFLAGS" \
    EAR_TMP=$EAR_TMP EAR_ETC=$EAR_ETC \
    MAKE_NAME=impi

# Compile and install just the
# MPI version of the EAR Library
make -f Makefile.impi full
make -f Makefile.impi earl.install

After compiling and installing as described above, you should have the following directories under the path passed to configure's --prefix flag:

  • bin: Includes commands and tools.
  • sbin: Includes EAR service binaries.
  • etc: Includes templates and examples for EAR service files, the ear.conf file, the EAR module, and so on.
  • lib: Includes all libraries and plugins.
  • include
  • man: Man pages.

Inside the lib directory, apart from plug-ins, you should see at least three files:

  • libearld.so: This is the EAR Loader.
  • libear.so: This is the EAR Library compiled with Intel MPI symbols. See the next section if you need support for other MPI implementations.
  • libear.gen.so: This is the EAR Library compiled without MPI symbols. The .gen extension is added automatically when setting --disable-mpi flag.
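
A quick way to confirm that the three files listed above were installed is to test for each of them. This is a sketch under the assumption that EAR_INSTALL_PATH holds the --prefix value; missing_ear_libs is a hypothetical helper, not an EAR tool.

```shell
#!/bin/sh
# Print each expected EAR library that is missing from the directory $1.
# Prints nothing and returns 0 when all three are present.
missing_ear_libs() {
    libdir="$1"
    status=0
    for lib in libearld.so libear.so libear.gen.so; do
        if [ ! -e "$libdir/$lib" ]; then
            echo "$lib"
            status=1
        fi
    done
    return "$status"
}

# Only run the check when the installation prefix is known.
if [ -n "${EAR_INSTALL_PATH:-}" ]; then
    missing_ear_libs "$EAR_INSTALL_PATH/lib" || echo "some libraries are missing"
fi
```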

Supporting more than one MPI implementation

Many systems have several MPI implementations installed, so users can choose the one that best fits their applications. Even though all of them provide the same interface, each one has some specific symbols not specified in the standard. Therefore, you need to install an EAR Library version for each MPI flavor you want to support.

To help the EAR Loader load the proper Library version, libraries installed side by side must have different names. This is accomplished by providing the MPI_VERSION variable to configure. This variable sets an extension for the compiled libear.so shared object, so when the EAR Loader detects the MPI version of the application, it can easily load the proper Library. The value to set depends on the MPI implementation you are compiling for, as shown in this table:

| Implementation | MPI_VERSION value | EARL name |
| --- | --- | --- |
| Intel MPI | not required | libear.so (default) |
| MVAPICH | not required | libear.so (default) |
| OpenMPI | ompi | libear.ompi.so |
| Fujitsu MPI | fujitsu | libear.fujitsu.so |
| Cray MPI | cray | libear.cray.so |

Note that the examples so far did not use this variable. This is because, for Intel MPI, the EAR Loader does not look for an extension: this behaviour comes from the first EARL design and has not been changed.
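
The naming rule in the table can be written down as a small function, which may help when scripting several builds. This is just an illustration of the mapping; earl_library_name is a hypothetical helper and does not reproduce how the EAR Loader itself resolves names.

```shell
#!/bin/sh
# Map an MPI_VERSION value to the EARL shared object name.
# An empty value corresponds to Intel MPI and MVAPICH (no extension).
earl_library_name() {
    if [ -z "$1" ]; then
        echo "libear.so"
    else
        echo "libear.$1.so"
    fi
}

# Show the name for each MPI_VERSION value listed in the table.
for v in "" ompi fujitsu cray; do
    echo "MPI_VERSION='$v' -> $(earl_library_name "$v")"
done
```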

So, if you would like to add to your previous EAR installation the support for, let's say, OpenMPI, you should type the following:

# Configure EAR to compile Library supporting OpenMPI applications
# Note: mpicc must point to an OpenMPI installation
./configure MPICC=mpicc MPICC_FLAGS="-O2 -g" MAKE_NAME=openmpi MPI_VERSION=ompi

make -f Makefile.openmpi full
# The below line assumes you already have installed all other components,
# i.e., `make -f Makefile.<extension> install`.
make -f Makefile.openmpi earl.install

This is an example of a bash script which summarizes the configuration, compilation and installation of EAR providing support for multiple MPI implementations:

#!/bin/bash

# This script bootstraps, configures, compiles and installs
# EAR with three versions of the Library: one supporting any non-MPI
# application, one supporting Intel MPI applications and another one
# supporting OpenMPI applications
#
# Requirements:
# - GNU Autoconf
# - GNU make
# - A modern C compiler
# - Intel MPI compiler and Library
# - OpenMPI compiler and Library

EAR_INSTALL_PATH= # Set the root location of your installation
EAR_TMP= # Set the location of temporary directories and files
EAR_ETC= # Set the location of configuration and services files.

my_CFLAGS="-O2 -g"

# Bootstrap the configure script
autoreconf -i

# Replace with an Intel MPI module
module load intel-mpi-module

# Configure EAR to compile the non-MPI version of the EAR Library
./configure --disable-mpi --prefix=$EAR_INSTALL_PATH \
    MPICC=mpicc CC=gcc CC_FLAGS="$my_CFLAGS" \
    EAR_TMP=$EAR_TMP EAR_ETC=$EAR_ETC \
    MAKE_NAME=nompi

# Compile and install EAR. The EAR Library version installed
# will be for supporting non-MPI applications.
make -f Makefile.nompi
make -f Makefile.nompi install

# sysconfdir installation needs another target
make -f Makefile.nompi etc.install
make -f Makefile.nompi doc.install

# Configure EAR to compile the MPI version of the Library.
./configure --prefix=$EAR_INSTALL_PATH \
    MPICC=mpicc MPICC_FLAGS="$my_CFLAGS" \
    CC=gcc CC_FLAGS="$my_CFLAGS" \
    EAR_TMP=$EAR_TMP EAR_ETC=$EAR_ETC \
    MAKE_NAME=impi

# Compile and install just the
# MPI version of the EAR Library
make -f Makefile.impi full
make -f Makefile.impi earl.install

# Configure EAR to compile Library supporting OpenMPI applications
# Note: mpicc must point to an OpenMPI installation
module unload intel-mpi-module
module load openmpi-module

./configure --prefix=$EAR_INSTALL_PATH \
    MPICC=mpicc MPICC_FLAGS="$my_CFLAGS" \
    CC=gcc CC_FLAGS="$my_CFLAGS" \
    EAR_TMP=$EAR_TMP EAR_ETC=$EAR_ETC \
    MAKE_NAME=openmpi MPI_VERSION=ompi

make -f Makefile.openmpi full
make -f Makefile.openmpi earl.install

Deployment and validation

Monitoring: Compute node and DB

Prepare the configuration

Whether you install from sources or from RPM, EAR installs templates for the ear.conf file at $EAR_ETC/ear/ear.conf.template and $EAR_ETC/ear/ear.conf.full.template. The full version includes all fields. Copy only one of them to $EAR_ETC/ear/ear.conf and update it with the desired configuration. Go to the configuration section to see how to do it. The ear.conf file is used by all the services. It is recommended to keep it in a shared folder to simplify configuration changes.
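
The copy step can be scripted so that an existing ear.conf is never overwritten. A minimal sketch, assuming EAR_ETC is set in the environment; install_ear_conf is a hypothetical helper, and the second argument selects the template file name (by default the short template named above).

```shell
#!/bin/sh
# Copy an ear.conf template to ear.conf unless ear.conf already exists.
# $1: the EAR etc directory (EAR_ETC); $2: template file name (optional).
install_ear_conf() {
    conf_dir="$1/ear"
    template="$conf_dir/${2:-ear.conf.template}"
    target="$conf_dir/ear.conf"
    if [ -e "$target" ]; then
        echo "keeping existing $target"
        return 0
    fi
    [ -f "$template" ] || { echo "template not found: $template" >&2; return 1; }
    cp "$template" "$target"
    echo "created $target from $template"
}

# Only act when EAR_ETC is defined.
if [ -n "${EAR_ETC:-}" ]; then
    install_ear_conf "$EAR_ETC" || true
fi
```

The resulting file still has to be edited by hand as described in the configuration section.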

EAR module

Install and load the EAR module to enable the commands. It can be found at $EAR_ETC/module. If it is not in a standard module path, you can add it with module use $EAR_ETC/module and then run module load ear.

EAR Database

Create the EAR database with edb_create, installed at $EAR_INSTALL_PATH/sbin. The edb_create -p command will ask you for the DB root password. If you run into any problem here, first check whether the node where you are running the command can connect to the DB server. If problems persist, execute edb_create -o to report the specific SQL queries generated. In case of trouble, contact ear-support@bsc.es or open an issue.

Energy models

EAR uses a power and performance model based on system signatures. These system signatures are stored in coefficient files.

Before starting EARD, and just for testing, you need to create a dummy coefficient file and copy it to the coefficients path, by default $EAR_ETC/coeffs. Use the coeffs_null application from the tools section.

EAR version 4.1 does not require null coefficients.

EAR services

Create soft links to the EAR service files, or copy them, in the services folder so that you can start and stop the services with system commands such as systemctl. EAR service files are generated at $EAR_ETC/systemd and they can usually be placed in $(ETC)/systemd.

  • EARD must be started on compute nodes.
  • EARDBD must be started on service nodes (can be any node with DB access).
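
Linking the generated unit files can be done in a loop. The sketch below assumes the unit files match ear*.service and that the destination is the node's systemd unit directory (for example /etc/systemd/system, which may differ on your OS); link_ear_services is a hypothetical helper.

```shell
#!/bin/sh
# Link every EAR unit file found in $1 (e.g. $EAR_ETC/systemd) into $2
# (the systemd unit directory of the node).
link_ear_services() {
    src="$1"
    dst="$2"
    for unit in "$src"/ear*.service; do
        [ -e "$unit" ] || continue   # no match: the glob stays literal
        ln -sf "$unit" "$dst/$(basename "$unit")"
        echo "linked $(basename "$unit")"
    done
}

# Example (run as root on the target node, then reload systemd):
#   link_ear_services "$EAR_ETC/systemd" /etc/systemd/system
#   systemctl daemon-reload
```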

Enable and start EARDs and EARDBDs via services (e.g., sudo systemctl start eard, sudo systemctl start eardbd). EARDBD and EARD outputs can be found at $EAR_TMP/eardbd.server.log and $EAR_TMP/eard.log, respectively, when the DBDaemonUseLog and NodeUseLog options are set to 1 in the ear.conf file. Otherwise, their outputs are written to stderr and can be seen using the journalctl command (e.g., journalctl -u eard).
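
To find out where EARD output ends up on a given node, the NodeUseLog option can be read from ear.conf. This sketch assumes ear.conf stores options as Key=value lines, one per line; conf_value and eard_output_hint are hypothetical helpers.

```shell
#!/bin/sh
# Print the value of option $1 from the Key=value file $2.
# Commented lines are ignored; the last match wins.
conf_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

# Tell where EARD output goes, given the ear.conf path ($1) and EAR_TMP ($2).
eard_output_hint() {
    if [ "$(conf_value NodeUseLog "$1")" = "1" ]; then
        echo "$2/eard.log"
    else
        echo "journalctl -u eard"
    fi
}

# Only run when the configuration file can be located.
if [ -n "${EAR_ETC:-}" ] && [ -f "$EAR_ETC/ear/ear.conf" ]; then
    eard_output_hint "$EAR_ETC/ear/ear.conf" "${EAR_TMP:-}"
fi
```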

By default, a certain level of verbosity is set. It is not recommended to modify it but you can change it by modifying the value of constants in file src/common/output/output_conf.h.

Quick validation

Check that EARDs are up and running correctly with econtrol --status (note that daemons will take around a minute to correctly report energy and not show up as an error in econtrol). EARDs create a per-node text file with values reported to the EARDBD (local to compute nodes). In case there are problems when running econtrol, you can also find this file at $EAR_TMP/nodename.pm_periodic_data.txt.

Check that EARDs are reporting metrics to the database with ereport. ereport -n all should report the total energy sent by each daemon since the setup.

Monitoring: EAR plugin

Slurm

  • Set up EAR's SLURM plugin (see the configuration section for more information).

It is recommended to create a soft link to the $EAR_ETC/slurm/ear.plugstack.conf file in the /etc/slurm/plugstack.conf.d directory to simplify the EAR plugin management.

For a first test it is recommended to set default=off in the ear.plugstack.conf to disable the automatic loading of the EAR library.
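
Switching the default value can be done with sed. The sketch below only rewrites the default=on or default=off token and prints the result, so the original file is left untouched; set_ear_default is a hypothetical helper, and the sample line is made up for illustration rather than taken from a real ear.plugstack.conf.

```shell
#!/bin/sh
# Rewrite default=on|off to default=$1 in the given files (or stdin)
# and print the result to stdout.
set_ear_default() {
    value="$1"
    shift
    sed -e "s/default=on/default=$value/" -e "s/default=off/default=$value/" "$@"
}

# Illustrative input line (hypothetical content):
printf 'required earplug.so default=on\n' | set_ear_default off
```

Redirect the output to a new file and review it before replacing the configuration in use.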

PBS

  • Set up EAR PBS Hook (see the configuration section for more information).

For a first test it is recommended to set default=off in the ear_hook_conf.ini to disable the automatic loading of the EAR library.

EAR scheduler plugins validation

At this point you should be able to see the EAR options when running, for example, srun --help. You should see something like the following as part of the output. The EAR plugin must be enabled on login and compute nodes.

[user@hostname ~]$ srun --help
Usage: srun [OPTIONS(0)... [executable(0) [args(0)...]]] [ : [OPTIONS(N)...]] executable(N) [args(N)...]

Parallel run options:
...

Constraint options:
...

Consumable resources related options:
...

Affinity/Multi-core options: (when the task/affinity plugin is enabled)
...

Options provided by plugins:
      --ear=on|off            Enables/disables Energy Aware Runtime Library
      --ear-policy=type       Selects an energy policy for EAR
                              {type=default,gpu_monitoring,monitoring,min_energ-
                              y,min_time,gpu_min_energy,gpu_min_time}
      --ear-cpufreq=frequency Specifies the start frequency to be used by EAR
                              policy (in KHz)
      --ear-policy-th=value   Specifies the threshold to be used by EAR policy
                              (max 2 decimals) {value=[0..1]}
      --ear-user-db=file      Specifies the file to save the user applications
                              metrics summary 'file.nodename.csv' file will be
                              created per node. If not defined, these files
                              won't be generated.
      --ear-verbose=value     Specifies the level of the
                              verbosity{value=[0..1]}; default is 0
      --ear-learning=value    Enables the learning phase for a given P_STATE
                              {value=[1..n]}
      --ear-tag=tag           Sets an energy tag (max 32 chars)

...

Help options:
  -h, --help                  show this help message
      --usage                 display brief usage message

Other options:
  -V, --version               output version information and exit
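
The same check can be automated by searching the help text for the --ear option shown above. A minimal sketch: has_ear_options is a hypothetical helper that reads the help text on standard input, so it can be tried without a SLURM installation.

```shell
#!/bin/sh
# Read help text on stdin and succeed if it lists the --ear plugin option.
has_ear_options() {
    grep -qF -e '--ear=on|off'
}

# Run the check only where srun is available.
if command -v srun >/dev/null 2>&1; then
    if srun --help 2>&1 | has_ear_options; then
        echo "EAR plugin options are visible to srun"
    else
        echo "EAR plugin options not found; check plugstack.conf"
    fi
fi
```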

In PBS, run ear-hook-help to see the EAR options. The EAR module must be loaded first:

[user@hostname ~]$ module load ear
[user@hostname ~]$ ear-hook-help

  • Submit one application via the scheduler and check that it is correctly reported to the database with the eacct command.

Note that only privileged users can check other users’ applications.

  • Submit one MPI application (matching the MPI version you compiled the Library for) with sbatch --ear=on or qsub -v "EAR=on" and check that the output of eacct now includes the Library metrics.
  • Set default=on in ear.plugstack.conf or in hook_config.ini to load the EAR Library by default.

At this point, you can use EAR for monitoring and accounting purposes, but you cannot yet use the power policies offered by EARL. To enable them, first perform a learning phase and compute the node coefficients. See the EAR learning phase wiki page. For the coefficients to be active, restart the daemons.

Important: Reloading the daemons will NOT make them load the coefficients; restarting the service is the only way.

Installing from RPM

EAR includes the specification files to create an RPM from an already existing installation. Once created, it can be included in the compute node images. It is recommended only when no more changes are expected on the installation or when your compute fleet has ephemeral storage and EAR is installed in a non-shared file system.

The spec file is placed at etc/rpms/specs/ear.spec and it is generated from etc/rpms/specs/ear.spec.in at configuration time. The RPM can be part of the system image. Visit the Requirements page for a quick overview of the requirements.

Execute the rpmbuild.sh script to create the EAR RPM file. This script is located at etc/rpms and it is created from etc/rpms/rpmbuild.sh.in at configuration time. Run it from its location. The rpm file will be located at $HOME/rpmbuild/RPMS. You can install it by typing:

rpm -ivh <ear_rpm_filename>.rpm

You can also use the --nodeps option if the dependency test fails. Type rpm -e <ear_rpm_filename> to uninstall.

Installation content

The *.in configuration files are compiled into etc/ear/ear.conf.template and etc/ear/ear.full.conf.template, etc/module/ear, etc/slurm/ear.plugstack.conf and various etc/systemd/ear*.service. You can find more information in the configuration page. The table below describes the complete hierarchy of the EAR installation:

| Directory | Content / description |
| --- | --- |
| /usr/lib | Libraries and the scheduler plugin. |
| /usr/lib/plugins | EAR plugins. |
| /usr/bin | EAR commands. |
| /usr/bin/tools | EAR tools for coefficients computation. |
| /usr/sbin | Privileged components: EARD, EARDBD, EARGMD. |
| /etc/ear | Configuration files templates. |
| /etc/ear/coeffs | Folder to store coefficient files. |
| /etc/module | EAR module. |
| /etc/slurm | EAR SLURM plugin configuration file. |
| /etc/systemd | EAR service files. |

RPM requirements

EAR uses some third-party libraries. The EAR RPM will not ask for them when installing, but they must be available in LD_LIBRARY_PATH when you run an application and want to use EAR. Depending on the RPM, different versions of these libraries may be required:

| Library | Minimum version | References |
| --- | --- | --- |
| MPI | - | - |
| MySQL* | 15.1 | MySQL or MariaDB |
| PostgreSQL* | 9.2 | PostgreSQL |
| Autoconf | 2.69 | Website |
| GSL | 1.4 | Website |

* Just one of them required.

These libraries are not required, but can be used to get additional functionality or metrics:

| Library | Minimum version | References |
| --- | --- | --- |
| SLURM | 17.02.6 | Website |
| PBS** | 2021 | PBSPro or OpenPBS |
| CUDA/NVML | 7.5 | CUDA |
| CUPTI** | 7.5 | CUDA |
| Likwid | 5.2.1 | Likwid |
| FreeIPMI | 1.6.8 | FreeIPMI |
| OneAPI/L0** | 1.7.9 | OneAPI |
| LibRedFish** | 1.3.6 | LibRedFish |

** These will be available in next release.

Also, some drivers have to be present and loaded in the system when starting EAR:

| Driver | File | Kernel version | References |
| --- | --- | --- | --- |
| CPUFreq | kernel/drivers/cpufreq/acpi-cpufreq.ko | 3.10 | Information |
| Open IPMI | kernel/drivers/char/ipmi/*.ko | 3.10 | Information |
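
Whether these drivers are loaded can be checked against /proc/modules. This is a sketch with stated assumptions: module_loaded is a hypothetical helper, acpi-cpufreq.ko shows up under the module name acpi_cpufreq, and the IPMI module names listed (ipmi_devintf, ipmi_si, ipmi_msghandler) are common Open IPMI modules that may not all apply to your hardware. Drivers built into the kernel do not appear in /proc/modules, so a miss here is a hint rather than proof.

```shell
#!/bin/sh
# Succeed if kernel module $1 is listed in a /proc/modules-style file
# ($2, default: /proc/modules).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Report on the drivers when /proc/modules is readable.
if [ -r /proc/modules ]; then
    for mod in acpi_cpufreq ipmi_devintf ipmi_si ipmi_msghandler; do
        if module_loaded "$mod"; then
            echo "$mod: loaded"
        else
            echo "$mod: not listed (may be built in or missing)"
        fi
    done
fi
```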

Starting Services

The recommended way to run the EAR daemon components (EARD, EARDBD, EARGM) is through systemd unit services.

NOTE: EAR uses a MariaDB/MySQL server. The server must be started before the EAR services are executed.

The unit service files for the EAR Daemon, EAR Global Manager Daemon and EAR Database Daemon are generated and installed in $(EAR_ETC)/systemd. Copy those unit service files to your operating system's systemd folder and then use the systemctl command to run the daemons. Check the EARD, EARDBD, EARGMD pages to find the precise execution commands.

When using systemctl commands, you can check the messages reported to stderr using journalctl. For instance: journalctl -u eard -f. Note that if NodeUseLog is set to 1 in ear.conf, the messages will not be printed to stderr but to $EAR_TMP/eard.log instead. The DBDaemonUseLog and GlobalmanagerUseLog options in ear.conf specify the output for EARDBD and EARGM, respectively.

Additionally, services can be started, stopped or reloaded in parallel using parallel commands such as pdsh. As an example: sudo pdsh -w nodelist systemctl start eard.
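
When several services and node sets are involved, it can help to build the pdsh command line in one place and review it before running it. The following dry-run sketch only prints the command; ear_pdsh_command is a hypothetical helper and node[01-04] is a made-up node list.

```shell
#!/bin/sh
# Print the pdsh command line for a systemctl action on a set of nodes.
# $1: pdsh node list, $2: action, $3: service name
ear_pdsh_command() {
    case "$2" in
        start|stop|reload|restart|status) ;;
        *) echo "unsupported action: $2" >&2; return 1 ;;
    esac
    echo "sudo pdsh -w $1 systemctl $2 $3"
}

# Dry run: show the command that would start EARD on four nodes.
ear_pdsh_command "node[01-04]" start eard
```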

Updating EAR with a new installation

In some cases, it might be a good idea to create a new installation instead of updating your current one, for example when trying new configurations or when a big update is released.

The steps to do so are:

  • Install EAR in the new folder.
  • Replicate the old etc (including ear.conf and coefficients) in the new one and update ear.conf with the new ETC path and whatever other changes may be needed.
  • Update the EAR services in the /etc/systemd/system folder (or equivalent, depending on your OS). Service files include the ETC path and the absolute path for binaries.
  • Update /etc/slurm/plugstack.conf with the new paths.
  • Create a new EAR module with the updated paths.

Once all that is done, you should have two complete EAR installations that can be switched by changing the binaries executed by the services and changing the path in plugstack.conf.

Next steps

For a better overview of the installation process, return to the installation guide. To continue the installation, visit the configuration page to properly set up the EAR configuration file and the EAR SLURM plugin stack file.
