Version 2
The HappyFace Project (HF), and thus this documentation, is a work in progress. We try to keep the documentation up to date; however, if you find any errors or inconsistencies, or if you have suggestions for improvements, feel free to write an email to the HappyFace development list (happyface-dev@…).
The HappyFace Project is a meta-monitoring system. It was initially introduced to improve the regular monitoring of Grid sites of the Worldwide LHC Computing Grid (WLCG) but can also be used for any (complex) computing system, e.g. local or distributed computing resources.
The main goal of HappyFace is to provide fast access to all relevant monitoring information within "three mouse clicks" and to get a complete overview of the status of all available sub-systems of a computing site on one page.
All relevant monitoring information for a site is queried automatically by the HappyFace core system. Its adaptable configuration allows a site-specific rating of the data, and thanks to the modular framework, written in Python, test modules can easily be plugged in. The development of your own modules is also possible and explicitly foreseen by the design of HappyFace. All information gathered by HappyFace is visualized on a single website.
The following main features significantly help in administering computing sites:
- The HappyFace framework is written in Python.
- A modular design allows individual test modules to be plugged in and configured.
- Creation of your own modules is explicitly foreseen by design.
- Easy rating mechanism available to classify the obtained information.
- Visualization on one webpage with a simple and transparent "traffic light" logic.
- Modules can be combined into categories, e.g. with a weighted overall status calculation.
- History functionality: all information is stored in a database and can be accessed at any time.
- Possibility to create overview plots for a specified period in time.
- XML output for further usage of the category and module status information.
- Status bar plugin available: Firefox
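The rating, weighting and "traffic light" features listed above can be illustrated with a small sketch. It is not HappyFace code: the function names, the weighted-average rule and the colour thresholds are assumptions chosen for illustration, and the algorithms actually configured in HF may differ.

```python
# Hypothetical sketch: names and thresholds are illustrative, not HappyFace's API.

def category_status(modules):
    """Weighted average of module statuses; modules reporting -1 are skipped."""
    rated = [(status, weight) for status, weight in modules if status >= 0.0]
    total_weight = sum(weight for _, weight in rated)
    if total_weight == 0:
        return -1.0  # no usable information in this category
    return sum(status * weight for status, weight in rated) / total_weight


def traffic_light(status, warn_below=1.0, critical_below=0.5):
    """Map a status value to a colour; the thresholds are assumptions."""
    if status < 0.0:
        return "grey"
    if status < critical_below:
        return "red"
    if status < warn_below:
        return "yellow"
    return "green"


modules = [(1.0, 1.0), (0.5, 2.0), (-1.0, 1.0)]  # (status, weight) pairs
overall = category_status(modules)
print(round(overall, 3), traffic_light(overall))
```

Skipping modules without information keeps a single failed download from dragging the whole category down; whether that is desirable depends on the site.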
HappyFace has the following requirements for the host system:
- Python (version 2.4 or newer)
- SQLite (database backend)
- Webserver with PHP support (for the visualization)
- Webserver certificate and enabled ssl/https if access control is required
- PHP (version 5.2 or newer)
- Cronjob support (for the HappyFace execution)
In some environments it may be necessary to run HF and the webserver visualization on different hosts. In that case the only requirement is that the database backend and the HF output directory are accessible from both hosts.
Here are two preparation examples for the SLC/RedHat and Ubuntu/Debian OS:
CentOS 5.4 / RHEL 5.4 / Scientific Linux 5.4 (Python 2.4.3)
- install httpd (webserver)
- install sqlite (database backend)
- install subversion (for the code management)
- for php support install: php, php-devel, php-pear, php-pdo, php-gd
- add two PHP extensions to /etc/php.ini: extension=pdo.so extension=pdo_sqlite.so
- install python-setuptools
- install SQLObject (for SQLite access via python): easy_install -U SQLObject
- update PHP to version 5.2 or newer, e.g.:
  wget http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-3.noarch.rpm
  wget http://rpms.famillecollet.com/enterprise/remi-release-5.rpm
  rpm -Uvh remi-release-5.rpm epel-release-5-3.noarch.rpm
  yum --enablerepo=remi update php
Debian 5.0 / Ubuntu 8.04 LTS or newer (Python 2.5.2)
- install apache2 (webserver)
- install sqlite3 (database backend)
- install subversion (for the code management)
- for php support install: php5, php5-dev, php-pear, php5-gd
- install the two php extensions pdo and pdo_sqlite
- In newer Debian and Ubuntu releases, you only have to install the package php5-sqlite. This satisfies all dependencies. Important Remark: Be aware that in such releases, installing PDO via pecl breaks compatibility and requires a clean purge of all PHP packages and a re-installation afterwards!
- If the php5-sqlite package is not available in your release or does not install the PDO driver, then install it via pecl:
  sudo pecl install pdo
  sudo pecl install pdo_sqlite
- add the PHP extensions to /etc/php5-apache2/php.ini: extension=pdo.so extension=pdo_sqlite.so
- install python-sqlobject
Additional requirements (OS independent):
Some available HF modules may need additional packages:
- python-lxml
- python-matplotlib
If certificate-based access control to the HF webpage is required, please refer to the corresponding section.
It is also important to apply the correct timezone settings within PHP to prevent warnings in your webserver logs. If the timezone is not yet set, add a line such as the following to your php.ini configuration file:
date.timezone = Europe/Berlin
For installing HF it is required to check out the source code from the HF subversion repository:
svn co https://ekptrac.physik.uni-karlsruhe.de/public/HappyFace/trunk myHFinstance
Make sure that the destination directory myHFinstance is accessible from the webserver and from the machine on which HF itself is executed.
HappyFace is designed so that a cronjob executes the main script run.py at regular intervals, for example every 15 minutes. The following line could be put in your crontab:
# min hour dom mon dow command
*/15 * * * * cd /path/to/myHFinstance/HappyFace && ./run.py >/dev/null 2>&1
A detailed introduction concerning the implementation of cronjobs can be found here: http://www.selflinux.org/selflinux/html/cron.html, http://de.wikipedia.org/wiki/Cronjob
Of course, HF can also be executed manually by invoking run.py from within your myHFinstance/HappyFace directory, for example during testing, debugging or development cycles.
A description of the web page and the HF user interface can be found here.
If you check out HF, not all of the files and directories described below are available immediately. Some of them are created during the first execution of the main HF script.
- ./examplefiles - This directory contains some example XML or plot files for various modules which rely on external information for processing.
- ./externals - This directory contains scripts, executables or additional information for modules which require an external XML or data provider.
- ./HappyFace - Here you find the complete source code of the HappyFace framework. This directory contains the following sub-directories and files:
  - ./HappyFace/happycore - This directory contains all core classes of HF. Each child module inherits its functionality and configuration from its parent up to the final top-level module class ModuleBase. All .py files contain the module source, .cfg files contain module-specific configuration and .css files contain module-specific settings for the website visualization.
  - ./HappyFace/local - If configurations need to be adapted for a specific site, this directory needs to be created in order to host local config files or local self-developed modules. Further information about this directory is given in the next section.
  - ./HappyFace/modules - This directory contains all available modules which can be activated in the HF configuration file. A good starting point for a deeper insight into how to set up HF individually is the get_started module with its source (.py), config (.cfg) and style (.css) file.
  - ./HappyFace/output - The subdirectory web contains all necessary files for the webserver visualization. In the future, other visualization frontends can be added. The scripts and routines placed here create the webpage output which can be found in ./webpage.
  - ./HappyFace/run.py - This is the HF executable. It is the only script which needs to be run in order to process all configured modules, to store the results in the database and to create the final web visualization output.
  - ./HappyFace/run.cfg - This file contains the main configuration parameters of HappyFace. All framework-specific settings such as module execution, time limits, category setup or icon themes are stored here. All module-specific settings are stored inside the individual module configuration files as described below.
  - ./HappyFace/access.cfg - This file contains the configuration parameters for setting up specific access rights. It is possible to restrict access to single modules or complete categories based on browser certificates. For more details, refer to the section below.
- ./tmp - This directory contains temporary data (XML input files, plots, etc.) required during a HF run. An automatic cleanup procedure removes all unnecessary files after a HF run has finished successfully.
- ./tools - This directory contains some helper scripts which, for example, aid in the maintenance of a HF instance or create a new skeleton for module development. More information about the tools can be found in the appropriate section at the end of this document.
- ./webpage - This directory contains the results of HappyFace and all visualization-specific files (final HTML output, style sheets, images). The most important files and directories are:
  - ./webpage/archive - This directory contains all downloaded binary data (e.g. plot images).
  - ./webpage/cache - This directory contains a cached XML output file. Make sure the webserver can write to it.
  - ./webpage/HappyFace.db - SQLite database which contains all stored data of HF.
  - ./webpage/index.php - This file displays HF in a browser window and accesses the HF database. It is created during each HF run to ensure that the webpage always reflects the latest version and configuration of HF.
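Because some of these files only appear after the first run and the cache directory must be writable, a quick sanity check of an instance can be useful. The following is a minimal sketch, not part of HF: the function name check_instance is hypothetical, and os.access tests the permissions of the user running the script, so it would have to be run as the webserver user to be meaningful for ./webpage/cache.

```python
import os
import tempfile


def check_instance(root):
    """Return a list of problems found under a HappyFace-like instance root."""
    problems = []
    cache = os.path.join(root, "webpage", "cache")
    if not os.path.isdir(cache):
        problems.append("missing directory: " + cache)
    elif not os.access(cache, os.W_OK):
        problems.append("not writable: " + cache)
    db = os.path.join(root, "webpage", "HappyFace.db")
    if not os.path.isfile(db):
        problems.append("missing database (created on first run): " + db)
    return problems


# Demonstration on a throw-away directory tree instead of a real instance
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "webpage", "cache"))
    demo_problems = check_instance(root)
print(demo_problems)
```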
HappyFace requires two types of configurations:
- One is the configuration of the HF core system, which is handled by the file ./HappyFace/run.cfg.
- Each class or module in HF has its own module-specific configuration file ./HappyFace/modules/module.cfg.
Due to the inheritance functionality of HF, configuration parameters that are valid for several modules are always set in the configuration file of the parent class. This makes it possible to change the behavior of a whole module group by modifying the parent configuration file instead of the individual configuration file of each child. However, a parameter set in the parent configuration file can be overridden individually in any configuration file further down the inheritance chain.
Since each HF instance differs from the version in the repository, the concept of .local configuration files was introduced. All default configuration parameters are shipped with the central HF repository, and all site- or instance-specific variations are put in such a .local file. The local configuration can be managed with a second, site-internal repository, which should contain the following directories:
./HappyFace/local
./HappyFace/local/modules
./HappyFace/local/cfg
The directory ./HappyFace/local/modules should contain modules (.py, .cfg, .css) which are run exclusively on the local site. This structure can also be used for module development. Modules declared stable should at some point be migrated to the official repository (./HappyFace/modules) so that they are available in the central HF repository.
The directory ./HappyFace/local/cfg should contain the file run.local (which overrides the settings of the core configuration) as well as all .local files specific to core modules.
This separation of the global HappyFace source code and the local configuration eases development and maintenance of an instance significantly.
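The layering of parent, child and .local settings can be pictured with Python's standard configparser. This is a generic illustration, not the HF implementation: the three inline strings stand in for separate files, and the exact merge order HF uses is an assumption here.

```python
import configparser

# Hypothetical file contents; in HF these would live in separate files.
parent_cfg = """
[setup]
weight = 1.0
mod_type = rated
"""
child_cfg = """
[setup]
mod_title = Quick Start Test Module
weight = 2.0
"""
local_cfg = """
[setup]
mod_title = Quick Start (my site)
"""

config = configparser.ConfigParser()
# Later reads override earlier ones key by key; untouched keys are kept.
for text in (parent_cfg, child_cfg, local_cfg):
    config.read_string(text)

print(dict(config["setup"]))
```

After the three reads, mod_type still comes from the parent, weight from the child and mod_title from the local file, which is the behavior the text describes.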
It is possible to restrict access rights to the HappyFace webpage based on browser certificates. The access rights are configured in
./HappyFace/access.cfg
./HappyFace/local/cfg/access.local
It is possible to restrict rights for single modules or whole categories. For more details, refer to the access.cfg configuration file. More information about access rights and their influence on the web interface can be found here.
Remark: Your webserver has to be set up to allow SSL/HTTPS connections, e.g. on port 443. Otherwise browser certificates cannot be read. Make sure that the required CAs for the certificates in use are properly installed on your system. In addition, the SSL variables have to be made accessible to PHP. Below is an example configuration for the apache2 webserver:
<VirtualHost my-hf-webserver.mysite.com:443>
    SSLEngine on
    SSLCertificateFile /etc/apache2/ssl/mywebserver.pem
    SSLCertificateKeyFile /etc/apache2/ssl/mywebserver.key
    SSLCertificateChainFile /etc/apache2/ssl/mycachainfile.pem
    SSLCACertificatePath /etc/ssl/certs/
    SSLVerifyDepth 10
    SSLVerifyClient optional
    <FilesMatch "\.(cgi|shtml|phtml|php)$">
        SSLOptions +StdEnvVars
    </FilesMatch>
    <Directory /my/path/to/happyface/>
        SSLVerifyClient optional
        Options Indexes FollowSymLinks MultiViews
        SSLOptions +StdEnvVars +ExportCertData
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>
</VirtualHost>
The Framework:
- The HappyFace framework organises the test execution based on a given time interval.
- Each test is represented by a module, which can be easily plugged in.
- Each module can be activated / arranged in the global configuration.
- Different modules can be combined to different categories.
- The HF core and all main modules are available on a central subversion repository.
HF Modules:
- They can specify files to be downloaded for evaluation.
- They process the gathered information.
- The output of each module is stored in the database.
- Each module provides a php fragment for the final visualization on a web page.
Configuration
- The default configuration (which works out of the box) of the core system and some example modules can be found as .cfg file inside the SVN repository.
- Each configuration parameter can be adapted for local purposes by a .local file.
HappyFace: Available Modules
A list of all available HappyFace modules can be found here.
HappyFace: Module Development
Module Development: Basic Structure
Global modules reside in the directory ./HappyFace/modules. Local modules have to be put in ./HappyFace/local/modules. Remark: The name of a module used by the framework has to be lowercase to prevent problems with the SQLite database. Each module is represented by a Python class (.py), a config file (.cfg) and, if necessary, an additional style sheet file (.css) for the specific HTML/PHP output. You can find some example modules in the following directory: ./HappyFace/modules/examples.
A good starting point for learning how to write your own modules is the module get_started.py. Have a look at the source code to learn how the general setup of HF and its modules works.
Starting from that knowledge, we now write our own module called quick_start. For that purpose, we create the following files in our local repository. If the module is to become part of the general HF repository at some point, it has to be migrated (once it has been tested and works stably) to the directory ./HappyFace/modules.
./HappyFace/local/modules/quick_start.py
./HappyFace/local/modules/quick_start.cfg
./HappyFace/local/modules/quick_start.css
Let us begin with the configuration file quick_start.cfg:
############################
# quick_start.cfg
# five variables have to be set!!!:
[setup]
# name of the module in the output
mod_title = Quick Start Test Module
# the type of the module can be "rated", "unrated" or "plots"
# this is important for the category status calculation algorithm
mod_type = rated
# some category algorithms include the weight of a module to calculate their status
weight = 1.0
# this string should describe the test logic
definition = Definition of the test logic
# here the shifter should find instructions what to do in a critical situation
instruction = What to do in case of troubles
############################
# of course one can create some further variables which are used inside the module
number_1 = 4
string_1 = free
The Python class should have three basic functions (more can be implemented if necessary):
- __init__() The initialisation of the class; here you can read the config file of the module and define variables.
- process() This function is executed in a multi-threaded environment and contains the complete test logic.
- output() After the execution of the test logic, the framework collects the PHP fragments produced by this function.
Our simple module with the name quick_start.py could for example look like this:
from ModuleBase import *

class quick_start(ModuleBase):

    def __init__(self, module_options):
        # inherits from the ModuleBase Class
        ModuleBase.__init__(self, module_options)

        # read additional config settings
        self.string_1 = self.configService.get('setup', 'string_1')
        self.number_1 = self.configService.get('setup', 'number_1')

        # definition of the database keys and pre-defined values
        # possible format: StringCol(), IntCol(), FloatCol(), ...
        self.db_keys["message"] = StringCol()
        self.db_values["message"] = ""

    def process(self):
        # run the "test"
        message = "this software is " + str(self.number_1) + " " + self.string_1

        # define module status 0.0..1.0 or -1 for error
        self.status = 1.0  # always happy

        # define the output value for the database
        self.db_values["message"] = message

    def output(self):
        # create output string, all data stored in DB is available via a $data[key] call
        module_content = """
        <?php
        printf('<h5><central>' . $data["message"] . '</central></h5><br />');
        ?>
        """
        return self.PHPOutput(module_content)
This little "test module" inherits directly from the top-level module class ./HappyFace/happycore/ModuleBase.py. To avoid code duplication among similar test modules, it is advantageous to create specific module classes that provide properties and functions used by more than one test module. Take a look at the currently available test modules and their corresponding classes in ./HappyFace/happycore.
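To try out the call order of the three functions without a full HF installation, a stand-in base class can be used. The sketch below is only an illustration: FakeModuleBase is hypothetical and mimics just enough of the attributes used above (status, db_keys, db_values, PHPOutput), while the real ModuleBase in happycore does considerably more.

```python
# Stand-in for illustration only; not the real happycore ModuleBase.
class FakeModuleBase:
    def __init__(self, module_options):
        self.status = -1          # "no information / error" until process() runs
        self.db_keys = {}
        self.db_values = {}
        self.options = module_options

    def PHPOutput(self, content):
        return content.strip()


class quick_start(FakeModuleBase):
    def __init__(self, module_options):
        FakeModuleBase.__init__(self, module_options)
        self.number_1 = module_options["number_1"]
        self.string_1 = module_options["string_1"]
        self.db_values["message"] = ""

    def process(self):
        self.db_values["message"] = (
            "this software is " + str(self.number_1) + " " + self.string_1
        )
        self.status = 1.0

    def output(self):
        return self.PHPOutput("<?php print($data['message']); ?>")


# The call order described in the text: construct, process, then collect output.
module = quick_start({"number_1": 4, "string_1": "free"})
module.process()
fragment = module.output()
print(module.status, module.db_values["message"])
```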
Each module stores a number of variables in the database. The following general variables are pre-defined:
- module - This is the name of the module, in our case "quick_start".
- category - Each module must be associated with a category; see the main config file ./HappyFace/run.cfg for the configured categories.
- status - Pre-defined with the value "-1". This stands for "no information / error" and should be changed by the test logic inside the process() function of the module to a float value between 0.0 (critical situation) and 1.0 (everything is fine).
- timestamp - On every call the framework creates a global timestamp (rounded to minutes) for all modules.
- error_message - This variable is empty and can be filled with error information (e.g. download error, failed xml parsing, etc.) in case something goes wrong during module execution. The information will be highlighted on the output web page.
- datasource - This variable is used by the download service described further below.
- mod_title - Name of the module on the output web page.
- mod_type - The type of the module can be "rated", "unrated" and "plots", important for the module symbol and the category status algorithm.
- weight - Some category status calculation algorithms include the weight value of the modules to calculate their general status.
- definition - This string should describe the test logic for a user. It is displayed on the output web page.
- instruction - Here the shifter/user should find instructions on what to do in a critical situation when the module gives a warning or error.
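As a rough picture of the pre-defined variables, the sketch below builds a dictionary of defaults and a timestamp rounded to the full minute. The function names are hypothetical, the category name "examples" is made up, and rounding down (rather than to the nearest minute) is an assumption; the framework's own implementation may differ.

```python
import time


def run_timestamp(now=None):
    """Unix time rounded down to the full minute (rounding mode is an assumption)."""
    now = time.time() if now is None else now
    return int(now) - int(now) % 60


def default_values(module, category, timestamp):
    """Illustrative defaults for some of the general variables listed above."""
    return {
        "module": module,
        "category": category,
        "status": -1,           # no information / error until process() sets it
        "timestamp": timestamp,
        "error_message": "",
    }


values = default_values("quick_start", "examples", run_timestamp(1300000123))
print(values["timestamp"])
```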
To create further variables which should also be stored in the database, you have to create a key/value pair:
self.db_keys["answer"] = IntCol()
self.db_values["answer"] = 42
The output of each module is a PHP fragment which is included in the final ./webpage/index.php output file. The PHP fragment should print a valid and self-contained piece of HTML code, in which the information in the database is available via a $data['key'] call (depending on the timestamp, which can be changed via an interface on the output website).
If the module encounters an error while executing the test it can simply throw an exception (for example if a download failed, or if the download data is not well-formed). The exception's message will be shown in the web interface as an error message and the module status will be set to -1.
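The behavior described here, an exception turning into status -1 plus an error message, can be sketched as follows. The wrapper run_module and the class FailingModule are hypothetical; how the framework actually catches exceptions is not shown in this document.

```python
def run_module(module):
    """Run process() and translate any exception into status -1 plus a message."""
    try:
        module.process()
    except Exception as exc:
        module.status = -1
        module.error_message = str(exc)


class FailingModule:
    status = -1
    error_message = ""

    def process(self):
        # e.g. a download failed or the downloaded data is not well-formed
        raise RuntimeError("download of content.xml failed")


failing = FailingModule()
run_module(failing)
print(failing.status, failing.error_message)
```

For a module author this means that raising an exception with a descriptive message is usually enough; no extra bookkeeping of the status is needed in the error path.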
Usually every module needs data that is analyzed or simply summarized on the HappyFace output website. Before processing can start, the data has to be downloaded or otherwise made available from its source. The HappyFace framework provides unified functionality for this.
A simple download request can be configured in one of the module .cfg files (or in the .cfg file of a parent class). The download service makes sure that source files used by several modules are downloaded only once, which speeds up the run and saves bandwidth. To use the download service, a section [downloadservice] needs to be created inside a config file. Currently the following download mechanisms are implemented:
Download information via a wget command from e.g. a remote webserver (in this case a remote .xml file):
[downloadservice]
somedata_xml = wget|xml|options|http://path/to/source/content.xml
Get data from locally accessible files (in this case a local .png file):
[downloadservice]
somedata_local = local|png|options|/absolute/or/relative/path/to/source.png
The options field is currently only interpreted by the wget command and can be used to hand over individual command-line switches to be executed.
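The entry format and the "download only once" idea can be sketched in a few lines. This is not the HF download service: the function names are hypothetical, the options field is left empty here (whether HF accepts an empty field is an assumption), and no download is performed.

```python
def parse_download_entry(value):
    """Split 'method|filetype|options|source' into its four fields."""
    # maxsplit=3 keeps any '|' inside the source path or URL intact
    method, filetype, options, source = value.split("|", 3)
    return {"method": method, "filetype": filetype,
            "options": options, "source": source}


def unique_sources(entries):
    """Collect each (method, source) pair once, however many entries ask for it."""
    seen = {}
    for name, value in entries:
        entry = parse_download_entry(value)
        seen.setdefault((entry["method"], entry["source"]), []).append(name)
    return seen


entries = [
    ("somedata_xml", "wget|xml||http://path/to/source/content.xml"),
    ("same_xml_again", "wget|xml||http://path/to/source/content.xml"),
    ("somedata_local", "local|png||/absolute/or/relative/path/to/source.png"),
]
downloads = unique_sources(entries)
print(len(downloads))
```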
The CSS service takes care of copying module style sheet files to the web page output directory. It also links them into the final index.php file and cleans up unused, outdated .css files. The service is used automatically, without any configuration, if a file mymodule.css exists.
Module Development: Subtables
In certain cases a HappyFace module needs to store a variable amount of data. Adding further variables to the main module table is then not an appropriate solution. Instead, a so-called subtable should be used, which is a separate database table for the additional data of the module.
To do so, we first add a string column to the main table which contains the name of the subtable. By convention the column name contains the word database, so that automatic scripts crawling the database can detect subtables correctly. The table name should be unique, so we recommend prefixing it with self.__module__.
self.db_keys["subtable_database"] = StringCol()
self.db_values["subtable_database"] = self.__module__ + "_some_subtable"
The next step is to initialize the table. To specify the table structure, a dictionary which maps each column name to one of IntCol(), FloatCol() or StringCol() (specifying the type of the column) needs to be set up. Then self.table_init() is called with the table name and the dictionary of columns.
sub_keys = {}
sub_keys["number"] = IntCol()
subtable_class = self.table_init(self.db_values["subtable_database"], sub_keys)
Note that common columns will be added automatically for the subtable. This includes a unique ID primary key and a timestamp column that is automatically synchronized with the main table timestamp.
To fill the table with data we can use self.table_fill with a dictionary mapping from column name to a value of the correct type. Alternatively self.table_fill_many can be used which takes a list of dictionaries to insert many rows at once. This brings a significant performance benefit when adding many rows because the database will not be synced to disk for each row individually.
sub_value_list = []
for i in range(0,10):
    sub_value_list.append({'number': i})
self.table_fill_many(subtable_class, sub_value_list)
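The performance argument for inserting many rows at once can be illustrated with Python's built-in sqlite3 module. This is a stand-in, not what table_fill_many does internally (HF accesses SQLite through SQLObject); the table name and column layout are assumptions modelled on the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quick_start_some_subtable ("
    "id INTEGER PRIMARY KEY, timestamp INTEGER, number INTEGER)"
)

timestamp = 1300000080
rows = [(timestamp, i) for i in range(10)]

# One transaction for all rows: the database is committed once, not per row.
with conn:
    conn.executemany(
        "INSERT INTO quick_start_some_subtable (timestamp, number) VALUES (?, ?)",
        rows,
    )

row_count = conn.execute(
    "SELECT COUNT(*) FROM quick_start_some_subtable"
).fetchone()[0]
print(row_count)
```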
Finally we need to make sure the subtable is cleaned up if a holdback time is specified in the global HappyFace configuration. To do so, a single call to self.table_clear() is enough.
self.table_clear(subtable_class, [], self.holdback_time)
The second argument to self.table_clear() is a list of column names in the table which specify filenames of downloaded plots or other files that need to be removed from the archive when the corresponding row in the database is cleared. Normally subtables do not reference archive files so this can stay empty most of the time.
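What such a clean-up amounts to on the database level can be sketched with plain sqlite3. The function clear_old_rows is hypothetical and is not the implementation of self.table_clear(); in particular, treating the holdback time as a number of days is an assumption, and removal of archive files is left out.

```python
import sqlite3


def clear_old_rows(conn, table, holdback_days, now):
    """Delete rows older than the holdback time; returns the number removed."""
    cutoff = now - holdback_days * 24 * 3600
    # Table names cannot be bound as parameters; only pass trusted names here.
    cursor = conn.execute(
        "DELETE FROM " + table + " WHERE timestamp < ?", (cutoff,)
    )
    conn.commit()
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_subtable ("
    "id INTEGER PRIMARY KEY, timestamp INTEGER, number INTEGER)"
)
now = 1300000080
conn.executemany(
    "INSERT INTO demo_subtable (timestamp, number) VALUES (?, ?)",
    [(now - 10 * 86400, 1), (now - 3600, 2), (now, 3)],
)
removed = clear_old_rows(conn, "demo_subtable", holdback_days=7, now=now)
remaining = conn.execute("SELECT COUNT(*) FROM demo_subtable").fetchone()[0]
print(removed, remaining)
```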
To access the subtable in the PHP output of the module the global object $dbh can be used to query the database for the subtable. Here is an example:
$query = 'SELECT * FROM ' . $data['subtable_database'] . ' WHERE timestamp = ' . $data['timestamp'];
foreach ($dbh->query($query) as $info)
    print($info['number'] . '<br />');
A complete example module which uses a subtable is available in HappyFace/modules/examples/subtable_example.py.
Currently the following external plugins are available:
HF Firefox Plugin
The ./tools directory contains various helper scripts to aid in the administration and maintenance of a HappyFace instance. A detailed description of these scripts can be found here.
- Introductory Slides: The HappyFace Project
- CMS Offline Week, April 8, 2011: HappyFace for CMS Tier-1 local job monitoring
- CMS FacOps meeting, Sep 20, 2010: "HappyFace" view of uniform CMS local job monitoring
- CMS Offline Week, Sep 29, 2010: CMS Tier-1 Local Job Monitoring Project
- Review of HF and Some Selected Modules
- DPG Fruehjahrstagung 28. Maerz - 1. April 2011: Das HappyFace Meta-Monitoring Framework
- CHEP 2009 Conference Proceedings: Site Specific Monitoring of Multiple Information Systems
- CHEP 2010 Conference Proceedings: The HappyFace Project
- CHEP 2010 Poster
- CHEP 2010 conference report: CMS Distributed Computing Integration in the LHC sustained operations era
- Bachelor Thesis of Georg Jahn (2010): Meta-Monitoring of the Grid Resource Centre GoeGrid with HappyFace
If you discover any problems with the execution of the HF core, please:
- have a look at the list of known problems.
- contact the HappyFace development list (happyface-dev@…).
Module specific problems should be reported to the corresponding module developer. You can find proper contact information for each module here.
