Notifications
Introduction to Notifications
When a ClusterProfile is instantiated using Sveltos, it automatically watches for clusters that match the ClusterProfile clusterSelector field. When a match is found, Sveltos deploys all of the referenced add-ons, such as Helm charts or Kubernetes resources.
Once the necessary add-ons are deployed, there might be a need to perform other operations on the cluster, such as running a CI/CD pipeline. However, it is important to ensure that the cluster is healthy, i.e., all necessary add-ons are deployed, before proceeding. Sveltos can be configured to assess the cluster health and send notifications if any changes are detected.
The notifications can be used by other tools to perform additional actions or trigger workflows. Sveltos will ensure the necessary Kubernetes add-ons are deployed and managed while ensuring the health and stability of the clusters.
ClusterHealthCheck
ClusterHealthCheck is the CRD that can be used to:
- Define the cluster health checks;
- Instruct Sveltos when and how to send notifications.
Cluster Selection
The clusterSelector field is a Kubernetes label selector. Sveltos uses it to select the clusters whose health it assesses and for which it sends notifications.
LivenessChecks
The livenessChecks field is a list of cluster liveness checks to evaluate.
The supported types are:
- Addons: instructs Sveltos to evaluate the state of add-on deployment in the matching clusters;
- HealthCheck: lets you define a custom health check for any Kubernetes resource type.
Notifications
The notifications field is a list of all notifications to send when the liveness check state changes.
The supported types are:
- Slack
- Webex
- Teams
- Discord
- Kubernetes events (reason=ClusterHealthCheck)
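To tie the three fields together, here is a minimal sketch of a ClusterHealthCheck that uses the Addons liveness check and a Slack notification. It mirrors the structure of the Webex example later on this page. The resource name, the label, and the Secret name and namespace are hypothetical, and this section does not say whether an Addons check needs any further fields, so verify the result against the installed CRD before relying on it.

```yaml
apiVersion: lib.projectsveltos.io/v1beta1
kind: ClusterHealthCheck
metadata:
  name: production-addons          # hypothetical name
spec:
  # Label selector: every cluster carrying this label is assessed
  clusterSelector:
    matchLabels:
      env: production              # hypothetical label
  # Evaluate the state of add-on deployment in the matching clusters
  livenessChecks:
  - name: addons                   # hypothetical name
    type: Addons
  # Sent whenever the liveness check state changes
  notifications:
  - name: slack
    type: Slack
    notificationRef:               # assumed to follow the same Secret reference shape as the Webex example
      apiVersion: v1
      kind: Secret
      name: slack                  # hypothetical Secret name
      namespace: default
```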
HealthCheck CRD
To define a custom health check, simply create a HealthCheck instance.
The HealthCheck specification can contain the following fields:
- Spec.Group, Spec.Version, and Spec.Kind indicate which Kubernetes resources the HealthCheck applies to. Sveltos watches these resources and evaluates them whenever a change occurs;
- Spec.Namespace can be used to filter resources by namespace;
- Spec.LabelFilters can be used to filter resources by labels;
- Spec.Script can contain a Lua script that defines a custom health check.
The Lua script must contain a function evaluate() that returns a table with a status field (Healthy/Progressing/Degraded/Suspended) and an optional message field.
When given a Lua script, Sveltos expects the following format:
1. It must contain a function evaluate(). The function is invoked directly and is passed a Kubernetes resource (inside the function, obj represents that resource);
2. It must return a Lua table with the following fields:
  - status: one of Healthy, Progressing, Degraded, or Suspended;
  - ignore: a boolean indicating whether Sveltos should ignore the resource. If hs.ignore is set to true, Sveltos ignores the resource that produced that result;
  - message: an optional string. If it is set, Sveltos prints it.
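As an illustration of the per-resource contract described above, the following sketch checks whether a Deployment has all of its replicas available. It assumes that the Spec.Group, Spec.Version, Spec.Kind and Spec.Script fields map to the lowercase YAML keys shown here and that they belong to the v1alpha1 API version. Both are assumptions, since the v1beta1 example later on this page uses resourceSelectors and evaluateHealth instead. The resource name is hypothetical.

```yaml
apiVersion: lib.projectsveltos.io/v1alpha1   # assumed API version for the Spec.Script form
kind: HealthCheck
metadata:
  name: deployment-replicas                  # hypothetical name
spec:
  group: "apps"
  version: v1
  kind: Deployment
  script: |
    function evaluate()
      -- hs is the table returned to Sveltos
      local hs = {}
      hs.status = "Progressing"
      hs.ignore = false
      hs.message = ""
      -- obj is the Deployment passed in by Sveltos
      if obj.status ~= nil and obj.status.availableReplicas ~= nil then
        if obj.status.availableReplicas == obj.spec.replicas then
          hs.status = "Healthy"
        else
          hs.message = "expected replicas: " .. obj.spec.replicas ..
            " available: " .. obj.status.availableReplicas
        end
      end
      return hs
    end
```

The status starts as Progressing so that a Deployment with no reported replicas yet is not treated as healthy.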
Example: ConfigMap HealthCheck
In the following example [1], we create a HealthCheck that watches all ConfigMap resources.
hs is the health status object we return to Sveltos. It must contain a status attribute, which indicates whether the resource is Healthy, Progressing, Degraded or Suspended. By default, the status is set to Healthy and hs.ignore is set to true, as we do not want to interfere with the status of other, non-OPA ConfigMaps. Optionally, the health status object may also contain a message.
In this example, we want to identify whether the ConfigMap is an OPA policy or another kind of ConfigMap. If it is an OPA policy, we retrieve the value of the openpolicyagent.org/policy-status annotation. The annotation is set to {"status":"ok"} if the policy loaded successfully. If errors occurred during loading (e.g., the policy contained a syntax error), the cause is reported in the annotation. Depending on the value of the annotation, we set the status and message attributes appropriately.
At the end, we return the hs object to Sveltos.
Example - HealthCheck Definition
---
apiVersion: lib.projectsveltos.io/v1beta1
kind: HealthCheck
metadata:
  name: opa-configmaps
spec:
  resourceSelectors:
  - group: ""
    version: v1
    kind: ConfigMap
  evaluateHealth: |
    function evaluate()
      local statuses = {}
      local opa_annotation = "openpolicyagent.org/policy-status"
      -- Iterate over the ConfigMaps selected by resourceSelectors
      for _, resource in ipairs(resources) do
        if resource.metadata.annotations ~= nil then
          -- Only ConfigMaps carrying the OPA annotation are reported
          if resource.metadata.annotations[opa_annotation] ~= nil then
            local status = "Healthy"
            local message = ""
            if resource.metadata.annotations[opa_annotation] == '{"status":"ok"}' then
              message = "Policy loaded successfully"
            else
              status = "Degraded"
              message = resource.metadata.annotations[opa_annotation]
            end
            table.insert(statuses, {resource = resource, status = status, message = message})
          end
        end
      end
      local hs = {}
      if #statuses > 0 then
        hs.resources = statuses
      end
      return hs
    end
The ClusterHealthCheck resource below sends a Webex message as a notification when a ConfigMap with an incorrect OPA policy is detected.
---
apiVersion: lib.projectsveltos.io/v1beta1
kind: ClusterHealthCheck
metadata:
  name: hc
spec:
  clusterSelector:
    matchLabels:
      env: fv
  livenessChecks:
  - name: deployment
    type: HealthCheck
    livenessSourceRef:
      kind: HealthCheck
      apiVersion: lib.projectsveltos.io/v1beta1
      name: opa-configmaps
  notifications:
  - name: webex
    type: Webex
    notificationRef:
      apiVersion: v1
      kind: Secret
      name: webex
      namespace: default
Tip
When writing a HealthCheck in Lua, it can be handy to validate the definition before use.
This can be achieved by cloning the sveltos-agent repository. In the pkg/evaluation/healthchecks directory, create a directory for the deployed resources if it does not exist already. If a directory already exists, create a subdirectory instead.
In the directory or the subdirectory, create the files below, then run the tests:
1. A file named healthcheck.yaml containing the HealthCheck instance with the Lua script;
2. A file named healthy.yaml containing a Kubernetes resource expected to be Healthy for the Lua script from step 1 (optional);
3. A file named progressing.yaml containing a Kubernetes resource expected to be Progressing for the Lua script from step 1 (optional);
4. A file named degraded.yaml containing a Kubernetes resource expected to be Degraded for the Lua script from step 1 (optional);
5. A file named suspended.yaml containing a Kubernetes resource expected to be Suspended for the Lua script from step 1 (optional);
6. Run make test.
As mentioned above, once the resources are verified, one of the following statuses is returned: Healthy, Progressing, Degraded or Suspended.
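For the OPA example above, the healthy.yaml and degraded.yaml fixtures could look like the sketch below. The two files are shown in one block, separated by ---. The ConfigMap names, the namespace, the policy content and the error text in the second annotation are hypothetical placeholders. The only values taken from this page are the annotation key and the {"status":"ok"} value, and the script treats any other annotation value as Degraded.

```yaml
# healthy.yaml: the annotation reports a successfully loaded policy
apiVersion: v1
kind: ConfigMap
metadata:
  name: opa-policy-ok              # hypothetical name
  namespace: opa                   # hypothetical namespace
  annotations:
    openpolicyagent.org/policy-status: '{"status":"ok"}'
data:
  example.rego: |
    package example
    default allow := false
---
# degraded.yaml: any annotation value other than {"status":"ok"} is reported as Degraded
apiVersion: v1
kind: ConfigMap
metadata:
  name: opa-policy-broken          # hypothetical name
  namespace: opa                   # hypothetical namespace
  annotations:
    # placeholder error payload; the real format is set by OPA
    openpolicyagent.org/policy-status: '{"status":"error","error":"placeholder error text"}'
data:
  example.rego: |
    package example
    default allow := false
```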
Notifications and multi-tenancy
If the below label is set on the HealthCheck instance created by the tenant admin
Sveltos ensures that the tenant admin can define notifications only for the resources that the platform admin has authorized the tenant admin to access.
Sveltos suggests using the below Kyverno ClusterPolicy, which takes care of adding proper labels to each HealthCheck at creation time.
---
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-labels
  annotations:
    policies.kyverno.io/title: Add Labels
    policies.kyverno.io/description: >-
      Adds projectsveltos.io/admin-name label on each HealthCheck
      created by tenant admin. It assumes each tenant admin is
      represented in the management cluster by a ServiceAccount.
spec:
  background: false
  rules:
  - exclude:
      any:
      - clusterRoles:
        - cluster-admin
    match:
      all:
      - resources:
          kinds:
          - HealthCheck
    mutate:
      patchStrategicMerge:
        metadata:
          labels:
            +(projectsveltos.io/serviceaccount-name): '{{serviceAccountName}}'
            +(projectsveltos.io/serviceaccount-namespace): '{{serviceAccountNamespace}}'
    name: add-labels
  validationFailureAction: enforce
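To make the effect of the policy concrete, the fragment below sketches what a HealthCheck might look like after admission when it was created by a tenant ServiceAccount. The ServiceAccount name and namespace are hypothetical. The +( ) anchors in the policy add each label only if it is not already present, so a label set beforehand would be left unchanged.

```yaml
apiVersion: lib.projectsveltos.io/v1beta1
kind: HealthCheck
metadata:
  name: opa-configmaps
  labels:
    # Added by the Kyverno policy from the identity of the requesting ServiceAccount
    projectsveltos.io/serviceaccount-name: tenant-a-admin    # hypothetical ServiceAccount
    projectsveltos.io/serviceaccount-namespace: tenant-a     # hypothetical namespace
spec:
  resourceSelectors:
  - group: ""
    version: v1
    kind: ConfigMap
  # evaluateHealth omitted for brevity
```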
[1] Credit for this example: https://blog.cubieserver.de/2022/argocd-health-checks-for-opa-rules/