A hook is an executable file that Shell-operator runs when some event occurs. It can be a script or a compiled program written in any programming language. For illustrative purposes, we will use bash scripts. An example with a hook in the form of a Python script is available here: 002-startup-python.
The hook receives the data and returns the result via files. Paths to files are passed to the hook via environment variables.
At startup Shell-operator initializes the hooks:
- The recursive search for hook files is performed in the hooks directory. You can specify it with the --hooks-dir command-line argument or with the SHELL_OPERATOR_HOOKS_DIR environment variable (the default path is /hooks).
  - Every executable file found in the path is considered a hook.
- Found hooks are sorted alphabetically according to the directories’ and hooks’ names. Then they are executed with the --config flag to get bindings to events in YAML or JSON format.
- If hook's configuration is successful, the working queue named "main" is filled with onStartup hooks.
- Then, the "main" queue is filled with kubernetes hooks with Synchronization binding context type, so that each hook receives all existing objects described in hook's configuration.
- After executing kubernetes hook with Synchronization binding context, Shell-operator starts a monitor of Kubernetes events according to configured kubernetes binding.
  - Each monitor stores a snapshot: a refreshable list of all Kubernetes objects that match a binding definition.
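The discovery step above can be pictured with a short Python sketch. It is an illustration of the described behavior, not Shell-operator's own implementation: the function name `find_hooks` is hypothetical, "executable" is approximated by checking the file's execute permission bits, and "sorted alphabetically" is approximated by a plain string sort over relative paths.

```python
import os
import stat
from pathlib import Path

EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def find_hooks(hooks_dir: str) -> list[str]:
    """Return executable files under hooks_dir as sorted relative paths."""
    root = Path(hooks_dir)
    found = []
    # os.walk descends into subdirectories, which gives the recursive search.
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = Path(dirpath) / filename
            # Assumption: a file counts as a hook if any execute bit is set.
            if path.is_file() and path.stat().st_mode & EXEC_BITS:
                found.append(path.relative_to(root).as_posix())
    # Assumption: a plain string sort stands in for the alphabetical order.
    return sorted(found)
```

Calling `find_hooks("/hooks")` would list candidate hook files under the default path in the order this sketch assumes they are queried with --config.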
Next, the main cycle is started:
- Event handler adds hooks to the named queues on events:
  - kubernetes hooks are added to the queue when the desired WatchEvent is received from Kubernetes,
  - schedule hooks are added according to the schedule,
  - kubernetes and schedule hooks are added to the "main" queue or to the named queue if the queue field was specified.
- Each named queue has its own queue handler, which executes hooks strictly sequentially. If a hook fails with an error (non-zero exit code), Shell-operator restarts it (every 5 seconds) until it succeeds. While a hook keeps failing, new tasks are still added to the queue when other events occur, but their execution is blocked until the failing hook succeeds.
  - You can change this behavior for a specific hook by adding allowFailure: true to the binding configuration (not available for onStartup hooks).
- Each hook is executed with a binding context that describes an event that has already occurred:
  - kubernetes hook receives Event binding context with an object related to the event.
  - schedule hook receives the name of the triggered schedule binding.
- If there is a sequence of executions of the same hook in a queue, the hook is executed once with an array of binding contexts.
  - If a binding contains the group key, then a sequence of binding contexts with the same group key is compacted into one binding context.
- Several metrics are available for monitoring the activity of the queues and hooks: queues size, number of execution errors for specific hooks, etc. See METRICS for more details.
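The queue handler behavior described above (strictly sequential execution, retry every 5 seconds, allowFailure as an opt-out) can be sketched as follows. This is a simplified model under stated assumptions, not Shell-operator's code: `run_queue`, the task dictionaries and the injectable `sleep` are hypothetical names introduced for illustration.

```python
import time
from typing import Callable


def run_queue(
    tasks: list[dict],
    run_hook: Callable[[dict], int],
    retry_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Run tasks one by one and return task names in completion order."""
    finished = []
    for task in tasks:
        while True:
            exit_code = run_hook(task)
            if exit_code == 0:
                break
            if task.get("allowFailure", False):
                # The error is skipped and the queue moves on.
                break
            # Later tasks stay blocked until this one succeeds.
            sleep(retry_delay)
        finished.append(task["name"])
    return finished
```

Passing a fake `sleep` makes it possible to observe the retry pattern without waiting for real delays.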
Shell-operator runs the hook with the --config flag. In response, the hook should print its event binding configuration to stdout. The response can be in YAML format:
configVersion: v1
onStartup: ORDER
schedule:
- {SCHEDULE_PARAMETERS}
- {SCHEDULE_PARAMETERS}
kubernetes:
- {KUBERNETES_PARAMETERS}
- {KUBERNETES_PARAMETERS}
or in JSON format:
{
"configVersion": "v1",
"onStartup": STARTUP_ORDER,
"schedule": [
{SCHEDULE_PARAMETERS},
{SCHEDULE_PARAMETERS}
],
"kubernetes": [
{KUBERNETES_PARAMETERS},
{KUBERNETES_PARAMETERS}
]
}
configVersion field specifies a version of configuration schema. The latest schema version is v1 and it is described below.
Event binding is an event type ("onStartup", "schedule" or "kubernetes") plus parameters required for a subscription.
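As an illustration of the --config exchange, here is a minimal sketch of a hook written in Python that prints a JSON configuration. The binding values (`onStartup: 10`, the `every-5-min` schedule) and the names `HOOK_CONFIG` and `run` are assumptions made up for this example.

```python
import json
import sys

# Hypothetical configuration: run at startup and every five minutes.
HOOK_CONFIG = {
    "configVersion": "v1",
    "onStartup": 10,
    "schedule": [{"name": "every-5-min", "crontab": "*/5 * * * *"}],
}


def run(argv: list[str]) -> int:
    """Print the binding configuration when called with --config."""
    if "--config" in argv:
        json.dump(HOOK_CONFIG, sys.stdout)
        sys.stdout.write("\n")
        return 0
    # Normal execution: a real hook would handle the event here.
    return 0
```

A real hook file would end by calling `sys.exit(run(sys.argv[1:]))` and would need the executable bit set.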
Use this binding type to execute a hook at Shell-operator startup.
Syntax:
configVersion: v1
onStartup: ORDER
Parameters:
ORDER — an integer value that specifies an execution order. When added to the "main" queue, the hooks will be sorted by this value and then alphabetically by file name.
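The ordering rule can be expressed as a two-part sort key. The sketch below assumes a hypothetical mapping from hook file name to its ORDER value; it only illustrates the rule as stated and is not taken from Shell-operator.

```python
def startup_order(hooks: dict[str, int]) -> list[str]:
    """Sort hook file names by ORDER first, then alphabetically by name."""
    return [name for name, _order in sorted(hooks.items(), key=lambda kv: (kv[1], kv[0]))]
```

For example, two hooks with ORDER 1 run before a hook with ORDER 5, and the two are ordered between themselves by file name.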
Scheduled execution. You can bind a hook to any number of schedules.
Syntax:
configVersion: v1
schedule:
- crontab: "*/5 * * * *"
  allowFailure: true|false
- name: "Every 20 minutes"
  crontab: "*/20 * * * *"
  allowFailure: true|false
- name: "every 10 seconds"
  crontab: "*/10 * * * * *"
  allowFailure: true|false
  queue: "every-ten"
  includeSnapshotsFrom: ["monitor-pods"]
- name: "every minute"
  crontab: "* * * * *"
  allowFailure: true|false
  group: "pods"
  ...
Parameters:
- name: an optional identifier. It is used to distinguish between multiple schedules during runtime. For more information see binding context.
- crontab: a mandatory schedule in the regular crontab syntax with 5 fields. The 6-field crontab style is also supported; for more information see the documentation of the robfig/cron.v2 library.
- allowFailure: if true, Shell-operator skips the hook execution errors. If false or the parameter is not set, the hook is restarted after a 5-second delay in case of an error.
- queue: a name of a separate queue. It can be used to execute long-running hooks in parallel with other hooks.
- includeSnapshotsFrom: a list of names of kubernetes bindings. When specified, all monitored objects will be added to the binding context in a snapshots field.
- group: a key that defines a group of schedule and kubernetes bindings. See grouping.
Run a hook when Kubernetes objects change.
Syntax:
configVersion: v1
kubernetes:
- name: "Monitor pods in cache tier"
  apiVersion: v1
  kind: Pod  # required
  executeHookOnEvent: [ "Added", "Modified", "Deleted" ]
  executeHookOnSynchronization: true|false # default is true
  fullObjectInSnapshot: true|false # default is true
  nameSelector:
    matchNames:
    - pod-0
    - pod-1
  labelSelector:
    matchLabels:
      myLabel: myLabelValue
      someKey: someValue
    matchExpressions:
    - key: "tier"
      operator: "In"
      values: ["cache"]
    # - ...
  fieldSelector:
    matchExpressions:
    - field: "status.phase"
      operator: "Equals"
      value: "Pending"
    # - ...
  namespace:
    nameSelector:
      matchNames: ["somenamespace", "proj-production", "proj-stage"]
    labelSelector:
      matchLabels:
        myLabel: "myLabelValue"
        someKey: "someValue"
      matchExpressions:
      - key: "env"
        operator: "In"
        values: ["production"]
      # - ...
  jqFilter: ".metadata.labels"
  includeSnapshotsFrom:
  - "Monitor pods in cache tier"
  - "monitor Pods"
  - ...
  allowFailure: true|false  # default is false
  queue: "cache-pods"
  group: "pods"
- name: "monitor Pods"
  kind: "pod"
  # ...
Parameters:
- name is an optional identifier. It is used to distinguish different bindings during runtime. See also binding context.
- apiVersion is an optional group and version of object API. For example, it is v1 for core objects (Pod, etc.), rbac.authorization.k8s.io/v1beta1 for ClusterRole and monitoring.coreos.com/v1 for prometheus-operator.
- kind is the type of a monitored Kubernetes resource. This field is required. CRDs are supported, but the resource should be registered in the cluster before Shell-operator starts. This can be checked with the kubectl api-resources command. You can specify a case-insensitive name, kind or short name in this field. For example, to monitor a DaemonSet these forms are valid:
  "kind": "DaemonSet"
  "kind": "Daemonset"
  "kind": "daemonsets"
  "kind": "DaemonSets"
  "kind": "ds"
- executeHookOnEvent: the list of events that lead to a hook's execution. By default, all events are used to execute a hook: "Added", "Modified" and "Deleted". Docs: Using API WatchEvent. An empty array can be used to prevent hook execution, which is useful when the binding is used only to define a snapshot.
- executeHookOnSynchronization: if false, Shell-operator skips the hook execution with Synchronization binding context. See binding context.
- nameSelector: selector of objects by their name. If this selector is not set, then all objects of a specified kind are monitored.
- labelSelector: standard selector of objects by labels (examples of use). If the selector is not set, then all objects of a specified kind are monitored.
- fieldSelector: selector of objects by their fields, works like the --field-selector='' flag of kubectl. Supported operators are Equals (or =, ==) and NotEquals (or !=), and all expressions are combined with AND. Also note that a fieldSelector on the 'metadata.name' field is mutually exclusive with nameSelector. There are limits on fields, see Note.
- namespace: filters to choose namespaces. If omitted, events from all namespaces will be monitored.
- namespace.nameSelector: this filter can be used to monitor events from objects in a particular list of namespaces.
- namespace.labelSelector: this filter works like labelSelector but for namespaces, and Shell-operator dynamically subscribes to events from matched namespaces.
- jqFilter: an optional parameter that specifies event filtering using jq syntax. The hook will be triggered on the "Modified" event only if the filter result has changed since the last event. See example 102-monitor-namespaces.
- allowFailure: if true, Shell-operator skips the hook execution errors. If false or the parameter is not set, the hook is restarted after a 5-second delay in case of an error.
- queue: a name of a separate queue. It can be used to execute long-running hooks in parallel with hooks in the "main" queue.
- includeSnapshotsFrom: an array of names of kubernetes bindings in a hook. When specified, a list of monitored objects from those bindings will be added to the binding context in a snapshots field. Self-include is also possible.
- fullObjectInSnapshot: if not set or true, dumps of Kubernetes resources are cached for this binding and the snapshot includes them as object fields. Set to false if the hook does not rely on full objects, to reduce the memory footprint.
- group: a key that defines a group of schedule and kubernetes bindings. See grouping.
Example:
configVersion: v1
kubernetes:
# Trigger on labels changes of Pods with myLabel:myLabelValue in the "default" namespace
- name: "label-changes-of-mylabel-pods"
  kind: pod
  executeHookOnEvent: ["Modified"]
  labelSelector:
    matchLabels:
      myLabel: "myLabelValue"
  namespace:
    nameSelector:
      matchNames: ["default"]
  jqFilter: .metadata.labels
  allowFailure: true
  includeSnapshotsFrom: ["label-changes-of-mylabel-pods"]
This hook configuration will execute the hook on each change in labels of pods labeled with myLabel=myLabelValue in the "default" namespace. The binding context will contain all pods with myLabel=myLabelValue from the "default" namespace.
Unlike kubectl you should explicitly define namespace.nameSelector to monitor events from default namespace.
namespace:
  nameSelector:
    matchNames: ["default"]
Shell-operator requires a ServiceAccount with the appropriate RBAC permissions. See examples with RBAC: monitor-pods and monitor-namespaces.
This filter is used to ignore superfluous "Modified" events, and to exclude an object from event subscription. For example, if the hook should track changes of an object's labels, jqFilter: ".metadata.labels" can be used to ignore changes in other properties (.status, .metadata.annotations, etc.).
The result of applying the filter to the event's object is passed to the hook in a filterResult field of a binding context.
You can use JQ_LIBRARY_PATH environment variable to set a path with jq modules. Also, Shell-operator uses jq release 1.6 so you can check your filters with a binary of that version.
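The effect of a jqFilter on "Modified" events can be modelled in a few lines. The sketch below is a conceptual illustration only: `labels_filter` is a Python stand-in for the jq expression `.metadata.labels`, and `ModifiedEventGate` is a hypothetical helper, not part of Shell-operator.

```python
import json


def labels_filter(obj: dict):
    """Python stand-in for the jq expression `.metadata.labels`."""
    return obj.get("metadata", {}).get("labels")


class ModifiedEventGate:
    """Remembers the last filter result per object and reports changes."""

    def __init__(self, filter_fn):
        self.filter_fn = filter_fn
        self.last_results = {}

    def changed(self, key: str, obj: dict) -> bool:
        # Serialize the result so later in-place edits of obj cannot alter it.
        result = json.dumps(self.filter_fn(obj), sort_keys=True)
        is_changed = self.last_results.get(key) != result
        self.last_results[key] = result
        return is_changed
```

With this gate, an update that only touches `.status` leaves the filter result unchanged, so the hook would not be triggered, while a label change would trigger it.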
Consider that the "Added" event is not always equal to "Object created" if labelSelector, fieldSelector or namespace.labelSelector is specified in the binding. If objects and/or namespace are updated in Kubernetes, the binding may suddenly start matching them, with the "Added" event. The same with "Deleted" event: "Deleted" is not always equal to "Object removed", the object can just move out of a scope of selectors.
Filtering by an arbitrary field is not supported for either core resources or custom resources (see issue#53459). Only metadata.name and metadata.namespace fields are commonly supported.
However, fieldSelector can be useful for some resources with an extended set of supported fields:
| kind | fieldSelector | source (Kubernetes version) |
|---|---|---|
| Pod | spec.nodeName, spec.restartPolicy, spec.schedulerName, spec.serviceAccountName, status.phase, status.podIP, status.nominatedNodeName | 1.16 |
| Event | involvedObject.kind, involvedObject.namespace, involvedObject.name, involvedObject.uid, involvedObject.apiVersion, involvedObject.resourceVersion, involvedObject.fieldPath, reason, source, type | 1.16 |
| Secret | type | 1.16 |
| Namespace | status.phase | 1.16 |
| ReplicaSet | status.replicas | 1.16 |
| Job | status.successful | 1.16 |
| Node | spec.unschedulable | 1.16 |
Example of selecting Pods by 'Running' phase:
kind: Pod
fieldSelector:
  matchExpressions:
  - field: "status.phase"
    operator: Equals
    value: Running
Objects should match all expressions defined in fieldSelector and labelSelector, so, for example, multiple fieldSelector expressions with metadata.name field and different values will not match any object.
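The AND semantics can be checked with a small local model. The real filtering is assumed to happen on the Kubernetes side; the helpers `get_field` and `matches` below are hypothetical and only mirror the operators listed for fieldSelector.

```python
def get_field(obj: dict, path: str):
    """Resolve a dotted path such as 'status.phase'; None if it is missing."""
    current = obj
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(obj: dict, expressions: list[dict]) -> bool:
    """Return True only if every expression holds (expressions are ANDed)."""
    for expr in expressions:
        actual = get_field(obj, expr["field"])
        operator = expr["operator"]
        if operator in ("Equals", "=", "=="):
            ok = actual == expr["value"]
        elif operator in ("NotEquals", "!="):
            ok = actual != expr["value"]
        else:
            raise ValueError(f"unsupported operator: {operator}")
        if not ok:
            return False
    return True
```

Two Equals expressions on metadata.name with different values can never both hold for one object, which is why such a selector matches nothing.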
When an event associated with a hook is triggered, Shell-operator executes the hook without arguments. The information about the event that led to the hook execution is called the binding context and is written in JSON format to a temporary file. The path to this file is available to hook via environment variable BINDING_CONTEXT_PATH.
Temporary files have unique names to prevent collisions between queues and are deleted after the hook run.
Binding context is a JSON array of structures with the following fields:
- binding: a string from the name or group parameters. If these parameters have not been set in the binding configuration, then strings "schedule" or "kubernetes" are used. For a hook executed at startup, this value is always "onStartup".
- type: "Schedule" for schedule bindings. "Synchronization" or "Event" for kubernetes bindings. "Synchronization" or "Group" if group is defined.
The hook receives "Event"-type binding context on a Kubernetes event, and it contains more fields:
- watchEvent: one of the values you can use with the executeHookOnEvent parameter: "Added", "Modified" or "Deleted".
- object: a JSON dump of the full object related to the event. It contains an exact copy of the corresponding field in the WatchEvent response, so it is the object state at the moment of the event (not at the moment of the hook execution).
- filterResult: the result of jq execution with the specified jqFilter on the above-mentioned object. If jqFilter is not specified, then filterResult is omitted.
The hook receives existing objects on startup for each binding with "Synchronization"-type binding context:
- objects: a list of existing objects that match selectors in binding configuration. Each item of this list contains object and filterResult fields. If the list is empty, the value of objects is an empty array.
If group or includeSnapshotsFrom are defined, the hook receives binding context with an additional field:
- snapshots: a map that contains a list of objects for each binding name from includeSnapshotsFrom or for each kubernetes binding in a group. If the includeSnapshotsFrom list is empty, the field is omitted.
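A hook written in Python could read and dispatch on the binding context roughly as follows. This is a minimal sketch based on the fields described above; the function names `load_binding_contexts` and `describe` and the returned summary strings are made up for illustration.

```python
import json
import os


def load_binding_contexts(env=os.environ) -> list[dict]:
    """Read the JSON array stored at the path in BINDING_CONTEXT_PATH."""
    with open(env["BINDING_CONTEXT_PATH"], encoding="utf-8") as f:
        return json.load(f)


def describe(context: dict) -> str:
    """Summarize one binding context according to its type."""
    binding = context["binding"]
    ctx_type = context.get("type")
    if ctx_type == "Synchronization":
        count = len(context.get("objects", []))
        return f"{binding}: synchronization with {count} object(s)"
    if ctx_type == "Event":
        meta = context["object"]["metadata"]
        return f"{binding}: {context['watchEvent']} {meta['namespace']}/{meta['name']}"
    if ctx_type == "Schedule":
        return f"{binding}: schedule fired"
    if ctx_type == "Group":
        return f"{binding}: group run"
    # Assumption: a context without a type comes from an onStartup binding.
    return f"{binding}: startup"
```

Because the file holds an array, a hook should loop over every element instead of assuming a single context.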
Hook with this configuration:
configVersion: v1
onStartup: 1
will be executed with this binding context at startup:
[{"binding": "onStartup"}]
For example, if you have the following configuration in a hook:
configVersion: v1
schedule:
- name: incremental
  crontab: "0 2 */3 * * *"
  allowFailure: true
then at 12:02, it will be executed with the following binding context:
[{ "binding": "incremental", "type":"Schedule"}]
A hook can monitor Pods in all namespaces with this simple configuration:
configVersion: v1
kubernetes:
- kind: Pod
During startup, the hook receives all existing objects with "Synchronization"-type binding context:
[
{
"binding": "kubernetes",
"type": "Synchronization",
"objects": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
}
},
{
"object": {
"kind": "Pod",
"metadata":{
"name":"kube-proxy-...",
"namespace":"kube-system",
...
},
}
},
...
]
}
]
If pod pod-321d12 is then added into namespace 'default', then the hook will be executed with the "Event"-type binding context:
[
{
"binding": "kubernetes",
"type": "Event",
"watchEvent": "Added",
"object": {
"apiVersion": "v1",
"kind": "Pod",
"metadata": {
"name": "pod-321d12",
"namespace": "default",
...
},
"spec": {
...
},
...
}
}
]
Shell-operator caches a list of resources for each kubernetes binding. Other bindings can access this list via the includeSnapshotsFrom parameter. Also, there is a group parameter to automatically get all snapshots from multiple bindings and deduplicate executions.
Snapshot is a list of cached kubernetes objects and corresponding jqFilter results. To access the snapshot from particular binding, there is a map snapshots in the binding context where the key is a binding name and the value is the snapshot.
snapshots format:
"snapshots": {
"binding-name-1": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
]
}
- object: a JSON dump of the Kubernetes object.
- filterResult: a JSON result of applying jqFilter to the Kubernetes object.
Keeping dumps for object fields can take a lot of memory. There is a parameter keepFullObjectsInMemory: false to disable full dumps.
Note that disabling full objects makes sense only if jqFilter is defined, as it disables full objects in the snapshots field, the objects field of "Synchronization" binding context and the object field of "Event" binding context.
For example, this binding configuration will execute hook with empty items in objects field of "Synchronization" binding context:
kubernetes:
- name: pods
  kind: Pod
  keepFullObjectsInMemory: false
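When a hook walks a snapshot, it should tolerate items that carry no full object. The following sketch shows one way to do that; `snapshot_object_names` is a hypothetical helper, and the assumption is that items look like the snapshots format shown earlier.

```python
def snapshot_object_names(context: dict, binding_name: str) -> list[str]:
    """List namespace/name for cached objects in one binding's snapshot."""
    names = []
    for item in context.get("snapshots", {}).get(binding_name, []):
        obj = item.get("object")
        if not obj:
            # Full objects may be absent when full dumps are disabled.
            continue
        meta = obj.get("metadata", {})
        names.append(f"{meta.get('namespace', '')}/{meta.get('name', '')}")
    return names
```

A hook that relies only on filterResult can skip the object field in the same way and read `item.get("filterResult")` instead.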
To illustrate the includeSnapshotsFrom parameter, consider a hook that monitors label changes of all Pods and does something interesting on schedule:
configVersion: v1
schedule:
- name: incremental
  crontab: "0 2 */3 * * *"
  includeSnapshotsFrom: ["monitor-pods"]
kubernetes:
- name: monitor-pods
  kind: Pod
  jqFilter: '.metadata.labels'
  includeSnapshotsFrom: ["monitor-pods"]
During startup, the hook will be executed with the "Synchronization" binding context with snapshots JSON object:
[
{
"binding": "monitor-pods",
"type": "Synchronization",
"objects": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
"labels": { ... },
...
},
},
"filterResult": {
"label1": "value",
...
}
},
{
"object": {
"kind": "Pod",
"metadata":{
"name":"kube-proxy-...",
"namespace":"kube-system",
...
},
},
"filterResult": {
"label1": "value",
...
}
},
...
],
"snapshots": {
"monitor-pods": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
]
}
}
]
If pod pod-321d12 is then added into the "default" namespace, then the hook will be executed with the "Event" binding context with object and filterResult fields:
[
{
"binding": "monitor-pods",
"type": "Event",
"watchEvent": "Added",
"object": {
"apiVersion": "v1",
"kind": "Pod",
"metadata": {
"name": "pod-321d12",
"namespace": "default",
...
},
"spec": {
...
},
...
},
"filterResult": { ... },
"snapshots": {
"monitor-pods": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
]
}
}
]
At 12:02, the hook will be executed with the following binding context:
[
{
"binding": "incremental",
"type": "Schedule",
"snapshots": {
"monitor-pods": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
]
}
}
]
The group parameter defines a named group of bindings. A group is used when the source of the event is not important and the data in snapshots is enough for the hook. When a binding with group is triggered by an event, the hook receives snapshots from all bindings with the same group name. Also, adjacent tasks with the same group in the same queue are "compacted" and the hook is executed only once. So it is wise to use the same queue for all bindings in a group.
executeHookOnSynchronization, executeHookOnEvent and keepFullObjectsInMemory can be used with group.
group parameter is compatible with includeSnapshotsFrom parameter. includeSnapshotsFrom can be used to include additional snapshots into binding context.
Binding context for group contains:
- binding field with group name.
- type field with "Synchronization" or "Group" string.
- snapshots field if there is at least one kubernetes binding in the group or in includeSnapshotsFrom.
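The compaction of adjacent tasks with the same group can be illustrated with a short sketch. It models a single queue holding tasks for one hook; the task dictionaries and the function `compact_group_tasks` are assumptions for illustration and do not reflect Shell-operator internals.

```python
def compact_group_tasks(tasks: list[dict]) -> list[dict]:
    """Collapse runs of adjacent tasks that share the same group key."""
    compacted: list[dict] = []
    for task in tasks:
        group = task.get("group")
        same_as_previous = (
            bool(compacted)
            and group is not None
            and compacted[-1].get("group") == group
        )
        if same_as_previous:
            # One execution with fresh snapshots covers the whole run.
            continue
        compacted.append(task)
    return compacted
```

Tasks without a group are never merged in this model, and a group task separated from its run by another task starts a new execution.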
Consider the hook that is executed on changes of labels of all Pods, changes in ConfigMap and also on schedule:
configVersion: v1
schedule:
- name: incremental
  crontab: "* * * * *"
  group: "pods"
kubernetes:
- name: monitor_pods
  apiVersion: v1
  kind: Pod
  jqFilter: '.metadata.labels'
  group: "pods"
- name: monitor_configmap
  apiVersion: v1
  kind: ConfigMap
  jqFilter: '.data'
  group: "pods"
During startup, the hook will be executed with the "Synchronization" binding context with snapshots JSON object:
[
{
"binding": "pods",
"type": "Synchronization",
"snapshots": {
"monitor_pods": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
],
"monitor_configmap": [
{
"object": {
"kind": "ConfigMap",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
]
}
}
]
If pod pod-dfbd12 is then added into the "default" namespace, then the hook will be executed with the "Group" binding context:
[
{
"binding": "pods",
"type": "Group",
"snapshots": {
"monitor_pods": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
],
"monitor_configmap": [
{
"object": {
"kind": "ConfigMap",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
]
}
}
]
Every minute it will be executed with the same binding context with fresh snapshots:
[
{
"binding": "pods",
"type": "Group",
"snapshots": {
"monitor_pods": [
...
],
"monitor_configmap": [
...
]
}
}
]