EDCMON
The Energy Data Center Monitor (EDCMON) is a new EAR service for Data Center monitoring. It targets elements other than the compute nodes, which are already monitored by the EARD running on each of them. Whereas the EARDs monitor (among other things) DC node power, the EDCMON service targets AC power, although it is not limited to it. For that reason, the main goal of EDCMON is to cover all the power-consuming components in a Data Center (compute nodes, network, storage, management).
EDCMON is fully configurable and extensible since it is built on an EAR framework named Plugin Manager, which allows loading as many plugins as needed, each with its own calling frequency, declaring dependencies among them, and sharing data between them. These plugins communicate with each other through a tag (naming) system. The tag is free text specified in the plugin code and is used as a reference to specify dependencies, data sharing, etc.
EDCMON parameters are:
Usage: ./edcmon [OPTIONS]
Options:
--plugins List of comma separated plugins to load.
--paths List of comma separated priority paths to search plugins.
--verbose Show how the things are going internally.
--silence Hide messages returned by plugins.
--monitor Period at which the plugin wake ups for monitoring. Def=100 ms
--relax Period to be used during low monitoring periods. Def=100 ms
--help If you see it you already typed --help.
This is an example of the executable arguments and their format:
edcmon --monitor=1000 --relax=1000 --plugins=nodesensors.so:30000+nodesensors_report.so:30000:nodesensor_log+nodesensors_alerts.so:30000:log --verbose
This example shows the default configuration used by EAR when the edcmon service is deployed. It configures a monitoring period of 1 second and loads three plugins (separated by the + character):
- nodesensors: monitoring plugin based on Lenovo Confluent software. Reads specified sensors every 30 seconds. Exposes "nodesensors" tag.
- nodesensors_report: a reporting plugin for nodesensors plugin. Depends on "nodesensors" tag and uses its data. It is executed every 30 secs. The "nodesensor_log" is a parameter indicating the report plugin to use. Exposes "nodesensors_report" tag.
- nodesensors_alerts: An alerting plugin depending on "nodesensors" tag and using the data produced by it. Executed every 30 secs. Exposes the "nodesensors_alerts" tag. The argument "log" indicates the approach to report alerts.
Plugins are installed in the $EAR_INSTALL_PATH/lib/plugins/monitoring folder.
./edcmon --plugins=metrics.so:2000+periodic_metrics.so:4000 --paths=path/to/plugins1:path/to/plugins2
The list of plugins to load also contains their calling time in milliseconds. A plugin's main periodic action (PA) function is called each time that period elapses. The calling time is not mandatory: some plugins do not define a PA function and only act as receivers of other plugins' data, in which case their receiving functions are called once the shared data of the other plugin is ready. You may also simply not want Plugin Manager to call your PA function periodically.
Also, additional colons can be provided to pass information to a plugin during its initialization:
./edcmon --plugins=metrics.so:2000+periodic_metrics.so:4000:config_message1:config_message2 --paths=path/to/plugins1:path/to/plugins2
You can send N configuration messages to your plugin initialization function to alter its behaviour. You can omit the time value or write 0 instead:
./edcmon --plugins=metrics.so:2000+periodic_metrics.so:0000:config_message1:config_message2 --paths=path/to/plugins1:path/to/plugins2
./edcmon --plugins=metrics.so:2000+periodic_metrics.so:config_message1:config_message2 --paths=path/to/plugins1:path/to/plugins2
Caution: if config_message1 is a number, the Plugin Manager may interpret it as a time value instead of an argument. Set the calling time to 0 explicitly to avoid that misunderstanding.
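The entry format described above (file name, optional calling time, optional configuration messages, all separated by colons) can be illustrated with a small parser. This is only a sketch and not the Plugin Manager's actual code: the type plugin_entry_t and the functions is_number() and parse_plugin_entry() are hypothetical names, and the rule "a numeric second field is the calling time" is assumed from the caution above.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 8

/* Hypothetical container for one parsed --plugins entry. */
typedef struct {
    char file[64];            /* e.g. "periodic_metrics.so" */
    long period_ms;           /* 0 when no calling time was given */
    char args[MAX_ARGS][64];  /* configuration messages */
    int  n_args;
} plugin_entry_t;

/* Returns 1 if the string is non-empty and made only of digits. */
static int is_number(const char *s)
{
    if (*s == '\0') return 0;
    for (; *s != '\0'; s++) {
        if (!isdigit((unsigned char) *s)) return 0;
    }
    return 1;
}

/* Parses "file[:time][:msg1[:msg2...]]". Returns 0 on success, -1 on error. */
static int parse_plugin_entry(const char *spec, plugin_entry_t *out)
{
    const char *p = spec;
    int field = 0;

    if (spec == NULL || out == NULL || *spec == '\0') return -1;
    memset(out, 0, sizeof(*out));

    while (p != NULL) {
        const char *colon = strchr(p, ':');
        size_t len = (colon != NULL) ? (size_t) (colon - p) : strlen(p);
        char tok[64];

        if (len >= sizeof(tok)) return -1;
        memcpy(tok, p, len);
        tok[len] = '\0';

        if (field == 0) {
            strcpy(out->file, tok);
        } else if (field == 1 && is_number(tok)) {
            /* A numeric second field is taken as the calling time. */
            out->period_ms = strtol(tok, NULL, 10);
        } else if (out->n_args < MAX_ARGS) {
            strcpy(out->args[out->n_args++], tok);
        }
        field++;
        p = (colon != NULL) ? colon + 1 : NULL;
    }
    return 0;
}
```

With this sketch, "p.so:42" yields a calling time of 42 ms and no messages, while "p.so:0:42" yields a calling time of 0 and the single message "42", which is the ambiguity the caution warns about.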
Plugins also have dependencies: a plugin may depend on the actions or data shared by other plugins. Dependencies are written in a string in the compiled binary itself, so you don't have to load them manually. A dependency is loaded automatically, and its calling time can be inherited from the dependent plugin (if specified in the binary). If you want a specific calling time, load the dependency manually and set the time you want. If a dependency is hard (which is specified in the string), a failure in the required plugin disables the dependent plugin.
Plugins also have priorities. If plugin A is a dependency of plugin B, plugin A is called first. This holds even if plugin B is written before plugin A in the --plugins parameter, because the dependency system accounts for these cases.
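One way to obtain a call order in which every dependency precedes its dependents, regardless of the order on the command line, is a depth-first traversal of the dependency graph. The sketch below illustrates that idea only; it is not claimed to be the algorithm used by the Plugin Manager, and plugin_t, find_plugin(), visit() and call_order() are hypothetical names.

```c
#include <string.h>

#define MAX_PLUGINS 16

/* Hypothetical description of a loaded plugin: its tag and the tags it depends on. */
typedef struct {
    const char *tag;
    const char *deps[MAX_PLUGINS];
    int n_deps;
} plugin_t;

/* Returns the index of the plugin with the given tag, or -1 if it is not loaded. */
static int find_plugin(const plugin_t *p, int n, const char *tag)
{
    for (int i = 0; i < n; i++) {
        if (strcmp(p[i].tag, tag) == 0) return i;
    }
    return -1;
}

/* Depth-first visit: dependencies are appended to the order before the plugin itself. */
static void visit(const plugin_t *p, int n, int i, int *state, int *order, int *count)
{
    if (state[i] != 0) return;   /* already placed, or in progress (guards against cycles) */
    state[i] = 1;
    for (int d = 0; d < p[i].n_deps; d++) {
        int j = find_plugin(p, n, p[i].deps[d]);
        if (j >= 0) visit(p, n, j, state, order, count);
    }
    order[(*count)++] = i;
}

/* Fills order[] with plugin indices so that every dependency precedes its dependents. */
static int call_order(const plugin_t *p, int n, int *order)
{
    int state[MAX_PLUGINS] = {0};
    int count = 0;

    if (n > MAX_PLUGINS) return -1;
    for (int i = 0; i < n; i++) visit(p, n, i, state, order, &count);
    return count;
}
```

For plugins given as B (depending on A) followed by A, call_order() places A's index first and B's index second.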
Although the plugins folder contains other plugins (listed at the end of this page), these are the plugins specific to Data Center monitoring:
| Plugin | Information |
|---|---|
| nodesensors | Reads Confluent power sensors |
| nodesensors_report | Reports power readings exposed by nodesensors |
| nodesensors_alerts | Checks limits and executes actions based on nodesensors |
The plugin functions must have specific names. These function names and arguments are the following:
void up_get_tag (cchar **tag, cchar **tags_deps)
char *up_action_init (cchar *tag, void **data_alloc, void *data)
char *up_action_periodic (cchar *tag, void *data)
char *up_post_data (cchar *msg, void *data)
The function up_get_tag is in charge of returning the plugin's own tag and the tags of the plugins it depends on.
A tag must match the name of the shared object file without the extension, i.e., a plugin named my_plug.so has the tag my_plug.
Thus, dependency tags allow the Plugin Manager to search for and open the tagged plugins automatically.
The format is a list of tags separated by plus (+) signs.
For example:
void up_get_tag(cchar **tag, cchar **tags_deps)
{
*tag = "some_test";
*tags_deps = "dependency1+!dependency2";
}
You can prepend some symbols to each dependency tag, which are used to set additional information about the dependency. This is the current set of dependency modifier symbols:
| Symbol | Description |
|---|---|
| ! | The dependency is mandatory. |
| < | Inherit the timing of the dependent plugin. |
In the example above dependency2 is set with the exclamation mark "!" prefixed, which means that the dependency is mandatory for the loading plugin, and in case it is not resolved the loading plugin will be disabled.
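To make the dependency string format concrete, the following sketch splits a string such as "dependency1+!dependency2" into tags and records the modifier symbols from the table above. It is an illustration rather than the Plugin Manager's parser: dep_t and parse_deps() are hypothetical names, and accepting both symbols on the same tag is an assumption.

```c
#include <string.h>

#define MAX_DEPS 8

/* Hypothetical result of parsing one dependency tag. */
typedef struct {
    char tag[64];
    int mandatory;      /* set by the '!' prefix */
    int inherit_time;   /* set by the '<' prefix */
} dep_t;

/* Splits "dep1+!dep2+<dep3" into entries. Returns the number of deps, or -1 on error. */
static int parse_deps(const char *deps, dep_t *out, int max)
{
    int n = 0;
    const char *p = deps;

    if (deps == NULL) return 0;   /* a NULL string means no dependencies */
    while (*p != '\0' && n < max) {
        const char *plus = strchr(p, '+');
        size_t len = (plus != NULL) ? (size_t) (plus - p) : strlen(p);
        dep_t *d = &out[n];

        memset(d, 0, sizeof(*d));
        /* Consume the leading modifier symbols (assumed combinable). */
        while (len > 0 && (*p == '!' || *p == '<')) {
            if (*p == '!') d->mandatory = 1;
            else d->inherit_time = 1;
            p++;
            len--;
        }
        if (len == 0 || len >= sizeof(d->tag)) return -1;
        memcpy(d->tag, p, len);
        d->tag[len] = '\0';
        n++;
        p += len;
        if (*p == '+') p++;   /* skip the separator */
    }
    return n;
}
```

Applied to the string from the example, the sketch returns two entries, and only dependency2 has the mandatory flag set.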
The function up_action_periodic(), or PA, is the core function to perform actions and share data. It receives a tag and a pointer to the data associated with that tag. The received tag can be the plugin's own tag or the tag of another plugin. A plugin's PA function is called with its own tag and data when the time specified in the --plugins argument has passed, or with another plugin's tag and data after that plugin has called its own PA function with its own tag.
Examples of PA function types:
char *up_action_periodic(cchar *tag, void *data)
{
if (is_tag("tag2")) {
type2_t *d = (type2_t *) data;
// work
}
return NULL;
}
char *up_action_periodic_tag1(cchar *tag, void *data)
{
type1_t *d = (type1_t *) data;
// work
return NULL;
}
As you can see, you can define a generic up_action_periodic() function or one with a suffixed tag.
A suffixed function will be called only when a plugin's tag matches the function tag suffix.
If you define just the generic version of the function, take into account that you have to distinguish between tags. The macro is_tag will help you do this and keep your code clean and readable.
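The choice between a suffixed and a generic function can be pictured as a name lookup: try the name with the tag suffix first, then fall back to the generic name. The sketch below shows that selection rule only. The array of symbol names stands in for a real symbol lookup in the shared object, and suffixed_name() and resolve_periodic() are hypothetical helpers, not part of the Plugin Manager API.

```c
#include <stdio.h>
#include <string.h>

/* Builds the suffixed symbol name, e.g. "up_action_periodic_tag1". Returns 0 on success. */
static int suffixed_name(const char *tag, char *out, size_t size)
{
    int n = snprintf(out, size, "up_action_periodic_%s", tag);
    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}

/* Picks the function name to call for a tag among the symbols a plugin defines.
 * symbols[] is a stand-in for looking symbols up in the loaded shared object. */
static const char *resolve_periodic(const char *tag, const char **symbols, int n)
{
    char wanted[128];
    const char *generic = NULL;

    if (suffixed_name(tag, wanted, sizeof(wanted)) != 0) return NULL;
    for (int i = 0; i < n; i++) {
        if (strcmp(symbols[i], wanted) == 0) return symbols[i];   /* suffixed version wins */
        if (strcmp(symbols[i], "up_action_periodic") == 0) generic = symbols[i];
    }
    return generic;   /* NULL when neither version is defined */
}
```

A plugin that defines both up_action_periodic and up_action_periodic_tag1 would, under this rule, handle tag1 in the suffixed function and every other tag in the generic one.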
The returned string is a message that Plugin Manager prints if it is not NULL. You can add one of these modifiers at the beginning of the message:
- [D] disables the plugin. It also re-activates the dependency system and could disable dependent plugins.
- [=] pauses the periodic call.
- [X] closes the Plugin Manager main thread.
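The handling of these modifiers can be sketched as a small classifier that inspects the first three characters of the returned message. This is an illustration of the convention, not the Plugin Manager's code: msg_action_t and classify_message() are hypothetical names, and the sketch assumes that the modifier is followed directly by the message text.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical actions derived from a message returned by a plugin function. */
typedef enum {
    MSG_NONE,      /* NULL: nothing to print */
    MSG_PLAIN,     /* message without modifier */
    MSG_DISABLE,   /* [D] */
    MSG_PAUSE,     /* [=] */
    MSG_CLOSE      /* [X] */
} msg_action_t;

/* Classifies a returned message and, if text is not NULL, points it past the modifier. */
static msg_action_t classify_message(const char *msg, const char **text)
{
    msg_action_t action = MSG_PLAIN;
    const char *body = msg;

    if (msg == NULL) {
        if (text != NULL) *text = NULL;
        return MSG_NONE;
    }
    if (strncmp(msg, "[D]", 3) == 0) {
        action = MSG_DISABLE;
        body = msg + 3;
    } else if (strncmp(msg, "[=]", 3) == 0) {
        action = MSG_PAUSE;
        body = msg + 3;
    } else if (strncmp(msg, "[X]", 3) == 0) {
        action = MSG_CLOSE;
        body = msg + 3;
    }
    if (text != NULL) *text = body;
    return action;
}
```

A periodic function that detects an unrecoverable error could, for instance, return "[D]sensor unreachable" to have itself disabled while still reporting the reason.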
The up_action_init() function works the same way: it can receive the plugin's own tag or another plugin's tag. It is called once before any PA function is called and can be used to allocate and initialise data.
static mydata_t mydata;
char *up_action_init_mytag (cchar *tag, void **data_alloc, void *data)
{
*data_alloc = &mydata;
return "I have been initialized and mydata will be shared among the loaded plugins";
}
char *up_action_init_tag2 (cchar *tag, void **data_alloc, void *data)
{
tag2_type_t *var = (tag2_type_t *) data;
return "Now I know that tag2 plugin has been initialized";
}
When an initialisation function is called with its own plugin tag, the data_alloc double pointer is how the plugin publishes the data it wants to share with other plugins, so the plugin is responsible for allocating the data and setting the pointer to its address. When an initialisation function is called with another plugin's tag, data_alloc is NULL and the data parameter points to the shared data newly initialised by that plugin, which is the same data referenced in the PA function.
The configuration messages passed on the EDCMON command line are also received, through the data parameter, when the initialisation function is called with the plugin's own tag. They can be retrieved as a list of arguments:
static mydata_t mydata;
char *up_action_init_mytag (cchar *tag, void **data_alloc, void *data)
{
char **args = (char **) data;
*data_alloc = &mydata;
if (args != NULL) {
if (strcmp(args[0], "i_want_to_say_hello") == 0) {
printf("Hello!\n");
}
}
return "I have been initialized and mydata will be shared among the loaded plugins";
}
The final pipeline is:
1) up_get_tag (all plugins)
2) up_action_init (all plugins)
3) up_action_periodic (all plugins)
4) up_action_periodic (the plugins whose time has passed, and then the plugins which depends on their data)
PA example, if B and C depend on A in --plugins=A.so:4000+B.so:3000+C.so:4000:
1) A up_action_periodic will be called with 'A' tag.
2) B up_action_periodic will be called with 'A' tag.
3) C up_action_periodic will be called with 'A' tag.
4) B up_action_periodic will be called with 'B' tag.
5) C up_action_periodic will be called with 'C' tag.
6) After 3 seconds, B up_action_periodic will be called with 'B' tag.
7) After 4 seconds, A up_action_periodic will be called with 'A' tag.
8) After 4 seconds, B up_action_periodic will be called with 'A' tag.
9) After 4 seconds, C up_action_periodic will be called with 'A' tag.
10) After 6 seconds, B up_action_periodic will be called with 'B' tag.
11) After 8 seconds, A up_action_periodic will be called with 'A' tag.
12) And so on...
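The sequence above can be approximated with a small simulation. The sketch assumes that a plugin's own call fires whenever the elapsed time is a multiple of its period, and that each own call is immediately followed by calls to the plugins depending on that tag. It is not the Plugin Manager's scheduler, and sched_plugin_t, append_call() and calls_at() are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

#define N_PLUGINS 3

/* Hypothetical description of the example: tag, period and the tag it depends on. */
typedef struct {
    const char *tag;
    long period_ms;
    const char *depends_on;   /* NULL when the plugin has no dependency */
} sched_plugin_t;

static const sched_plugin_t plugins[N_PLUGINS] = {
    { "A", 4000, NULL },
    { "B", 3000, "A"  },
    { "C", 4000, "A"  },
};

/* Appends "plugin(tag) " to out, truncating if the buffer is full. */
static void append_call(char *out, size_t size, const char *plugin, const char *tag)
{
    size_t used = strlen(out);
    snprintf(out + used, size - used, "%s(%s) ", plugin, tag);
}

/* Writes the calls made at time now_ms into out and returns how many there are. */
static int calls_at(long now_ms, char *out, size_t size)
{
    int calls = 0;

    out[0] = '\0';
    for (int i = 0; i < N_PLUGINS; i++) {
        if (now_ms % plugins[i].period_ms != 0) continue;   /* period not elapsed */
        append_call(out, size, plugins[i].tag, plugins[i].tag);   /* own call first */
        calls++;
        /* Then every plugin depending on this tag receives its data. */
        for (int j = 0; j < N_PLUGINS; j++) {
            const char *dep = plugins[j].depends_on;
            if (dep != NULL && strcmp(dep, plugins[i].tag) == 0) {
                append_call(out, size, plugins[j].tag, plugins[i].tag);
                calls++;
            }
        }
    }
    return calls;
}
```

Under these assumptions, calls_at(0, ...) produces the first five steps of the list in the same order, and calls_at(3000, ...) produces only B's own call. Note that in this sketch C's own call also fires at 4000 ms, because its period is 4000 ms.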
Finally, up_post_data() allows plugins to receive external data. For example, if Plugin Manager is being used by the EARD, when a job starts you could send a message to the framework containing the job and step IDs. For now, you can distinguish the messages with the is_msg macro. The suffix system may be implemented for this function in the future.
The following helper macros let you declare these functions and keep your plugins up to date if a function signature changes. They can be found in plugin_manager.h:
#define declr_up_get_tag() void up_get_tag (cchar **tag, cchar **tags_deps)
#define declr_up_action_init(suffix) char * up_action_init##suffix (cchar *tag, void **data_alloc, void *data)
#define declr_up_action_periodic(suffix) char * up_action_periodic##suffix (cchar *tag, void *data)
#define declr_up_post_data() char * up_post_data (cchar *msg, void *data)
Examples of up_action_periodic() functions declared with these macros:
declr_up_action_periodic(_tag1)
{
type1_t *d = (type1_t *) data;
// work
return NULL;
}
declr_up_action_periodic()
{
if (is_tag("tag2")) {
type2_t *d = (type2_t *) data;
// work
}
return NULL;
}
Currently, these are the functions of the framework:
// Init as main binary function
int plugin_manager_main(int argc, char *argv[]);
// Init as a component of a binary
int plugin_manager_init(char *files, char *paths);
// Closes Plugin Manager main thread.
void plugin_manager_close();
// Wait until Plugin Manager exits.
void plugin_manager_wait();
// Asking for an action. Intended to be called from plugins.
void *plugin_manager_action(cchar *tag);
// Passing data to plugins. Intended to be called outside PM.
void plugin_manager_post(cchar *msg, void *data);
The plugin_manager_main() function receives the program arguments (argc and argv), which include --plugins. You can also call plugin_manager_init() if you prefer to pass the list of plugins and the search paths separately (in the same format). plugin_manager_action() calls the PA function of the plugin whose tag is referenced, which can be useful when a plugin prefers to call a required plugin manually. Finally, plugin_manager_wait() waits until the Plugin Manager main thread is closed.
| Plugin | Information |
|---|---|
| conf | Reads ear.conf and shares its data with other plugins. |
| dummy | Just an example. |
| eardcon | Connects with EARD. Saves other plugins to do that. |
| kernel_cl | An OpenCL kernel test. |
| kernel_cuda | A CUDA kernel test. |
| keyboard | A keyboard input. |
| management | Initializes all management APIs. |
| management_viewer | Views all management information. |
| metrics | Initializes and reads all metrics APIs. |
| metrics_viewer | Views all metrics readings. |
| periodic_metrics | Receives metrics and computes a periodic_metric. |
| test_cpufreq | A CPUFreq test. |
| test_gpu | Initializes and reads the GPU API. |
- Can I load the same plugin twice? No.
- Is the tag a mandatory value? Yes, all plugins require a tag.
- And the dependency tags? They can be set to NULL if the plugin does not have any dependency.
- Do I have to specify the time of a plugin in the dependency list? No, it is not recommended. A plugin which is loaded through the dependency list instead of the --plugins parameter inherits the dependent plugin's time if the special character '<' is placed at the beginning of its tag.
- If none of the dependencies are resolved, will the plugin periodic function be called anyway? It depends on whether any of the dependencies are mandatory, as specified by the exclamation mark (!).
- What happens if a plugin has a periodic time specified but has not defined a periodic function? If there is no periodic function, nothing will be called.
- Do I have to define all the API functions in the plugin? No, only those necessary for the correct plugin functionality. The get_tag function is the exception because the tag is a mandatory value.
- Can the action_init function be defined but action_periodic not? Yes. Sometimes you want to perform an action just once and you can do it in the init function. For example, the job of the plugin conf.so is to read ear.conf and pass the configuration structure to the rest of the loading plugins.
- For a plugin which does not allocate data, is its periodic function called? Yes, if it's defined. But the NULL value in the allocated data pointer disallows any information exchange, so the periodic functions of other plugins won't be called.
- If a plugin has defined a function action_periodic_tagX for the tag tagX, but also the general action_periodic, which of the two would be called? If a suffixed function is defined, that tagged version will be called. For the rest of the tags, the general action_periodic is called.