In this post, Matthew Clemente discusses FusionReactor for Docker Swarm (Part 2, Alerting). This article originally appeared on his blog.
Alerting in FusionReactor for Docker Swarm
In earlier posts, I covered installing and deploying FusionReactor Cloud to monitor CFML applications on Docker Swarm. The next step is to configure its updated alerting system to let you know if anything is amiss.
FusionReactor Cloud’s advanced alerting capability allows you to respond quickly to application errors or performance issues by creating metric threshold or status alerts.
FusionReactor Cloud features integrated support for many standard alerting tools, such as PagerDuty, Slack, HipChat, and VictorOps, and it also supports alerts via email or webhook.
Basics of Alerting in FusionReactor for Docker Swarm
Alerting is one of the primary three tabs within the FusionReactor Cloud portal, with sub-tabs for Alerts, Checks, and Subscriptions.
- Subscriptions are the who and how of Cloud alerts – that is, who should get alerted and how should they receive the alert? Subscriptions also provide control over when – you can limit them to certain days and/or times. To be of any use, a Subscription must be attached to a Check.
- Checks refer to the circumstances necessary to trigger an Alert. Broadly speaking, a Check is tied to either the status of a server or a defined set of data points (Memory Usage, Response Time, etc). Subscriptions are attached to these Checks and triggered when the threshold or status is met. These are the why – as in, why should an Alert be sent.
- Alerts, finally, are the what. Alerts are the consequence of Checks and Subscriptions – when a Check triggers a Subscription, an Alert is sent and logged.
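To make the relationship between the three concrete, here is a minimal sketch in Python. It is only an illustration of the who/why/what split described above, not FusionReactor's actual data model; every class, field, and function name here is hypothetical, and the email address is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Subscription:
    """The who and how: a recipient reached through an integration."""
    name: str
    integration: str  # e.g. "email" or "slack" (illustrative values)
    target: str       # e.g. an email address


@dataclass
class Check:
    """The why: the condition that has to be met before anyone is told."""
    name: str
    subscriptions: list[Subscription] = field(default_factory=list)


@dataclass
class Alert:
    """The what: the record produced when a Check triggers a Subscription."""
    check_name: str
    subscription_name: str


def fire(check: Check) -> list[Alert]:
    """Return one Alert per Subscription attached to the triggered Check."""
    return [Alert(check.name, sub.name) for sub in check.subscriptions]


web_team = Subscription("Web Team Email Alerts", "email", "web-team@example.com")
memory_check = Check("High memory usage", [web_team])

alerts = fire(memory_check)
# A Check with no Subscriptions attached produces no Alerts at all.
orphan_alerts = fire(Check("Unattached check"))
```

The empty result for the unattached Check mirrors the point above: a Subscription is only useful once it is attached to a Check.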
So, taking them one at a time, in a little more depth:
Setting Subscriptions – Alerting in FusionReactor for Docker Swarm
As you’ll see, FusionReactor Cloud offers far more granular control than the on-premise edition – this comes at the cost of significantly more configuration necessary to get it up and running.
In order to set up a Subscription, you’ll first need to enable one of the many available integrations. An integration is a way of contacting someone; options include Slack, PagerDuty, custom webhooks, and more. We’ll just stick with Email, as it’s the easiest and most obvious choice.
Now, I’m of the opinion that the Email integration should be enabled by default, but it’s not. So, once you’re logged into your account, click the Account button in the upper right, and select the Configuration option.
Click the Configure button for Email, then all you need to do is choose the Save option… and the Email integration is enabled.
On to the actual Subscription setup. Proceed to the Alerting tab and the Subscriptions sub-menu. In the upper right, the +Subscription button opens a tab for setting up who gets alerts (and when).
Here, you start to see the level of control that Alerting in FusionReactor Cloud provides. You can restrict this Subscription to certain days of the week, or hours of the day. This may be helpful, for example, if you have a separate team for after-hours. Or, you can have separate Subscriptions for Warning vs. Error states – the latter being set to higher priority, or sent via a different integration.
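As a rough illustration of the day and hour restrictions, the sketch below routes an alert to one of two Subscriptions depending on when it fires. This is my own model of the idea, not how FusionReactor implements it: the `SubscriptionWindow` class, the 09:00 to 17:00 weekday window, and the "After-Hours Team Alerts" name are all assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class SubscriptionWindow:
    """Days and hours during which a Subscription should receive alerts."""
    days: frozenset  # weekday numbers, Monday == 0
    start: time      # inclusive
    end: time        # exclusive

    def is_active(self, moment: datetime) -> bool:
        return moment.weekday() in self.days and self.start <= moment.time() < self.end


# Assumed business hours: Monday to Friday, 09:00 to 17:00.
BUSINESS_HOURS = SubscriptionWindow(frozenset({0, 1, 2, 3, 4}), time(9), time(17))


def route(moment: datetime) -> str:
    """Pick the Subscription name that should handle an alert at this moment."""
    if BUSINESS_HOURS.is_active(moment):
        return "Web Team Email Alerts"
    return "After-Hours Team Alerts"


monday_morning = route(datetime(2024, 1, 8, 10, 30))   # a Monday
monday_night = route(datetime(2024, 1, 8, 22, 0))
saturday_morning = route(datetime(2024, 1, 13, 10, 30))  # a Saturday
```

In the portal you would get the same effect by creating two Subscriptions with complementary day and time restrictions and attaching both to the Check.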
We’re setting up a basic email subscription, so we’ll name it “Web Team Email Alerts”, or something comparable, and choose the Email Service.
The default subject provided, “FusionReactor alert”, is fine; in the actual alert email this will be followed by the name of the Check that was triggered. Finally, we’ll add the email address we want to send the alert to, and click “Save”.
Your Subscription is now saved; you’ll see its details listed in the main Subscriptions panel, with the ability to duplicate/edit/delete, as well as test it. It hasn’t been linked to any Checks yet, so let’s do that next.
How to configure Checks – Alerting in FusionReactor for Docker Swarm
In the Alerting section, within the Checks sub-menu, click +Check in the upper right to open the tab for building out a new Check.
Within this panel there’s a lot going on. I’ll walk through the options one step at a time, but we’ll save actually creating a Check for the next blog post.
- There are two types of Checks, Threshold and Status.
Status checks are the simpler of the two – they are triggered if a server or group of servers goes offline for a period of time.
Threshold checks, as explained in the documentation, “are used to alert when a metric value crosses a defined threshold.” Basically, FusionReactor is continually monitoring hundreds of metrics from your servers/applications (CPU Usage, Active Requests, etc.). Threshold checks enable you to select one of these metrics and, based on a percentage or frequency of its occurrence, trigger an Alert.
Note that the Name entered here for the Check will be included in Subscription notifications that get sent, so it should be descriptive and clear. Optionally, you can also fill in a Description, for internal reference.
- The Check is further refined by selecting the type of entity it should be applied to: Server Instance, Group, or Application.
Server Instance: Could be a physical server, VM, or a container instance. If you’ve got three container replicas for a Swarm service, each is a server instance. Generally, for Swarm deployments, you won’t be choosing this option.
Group: Server instances can be added to one or more groups – this is done on the startup of the server, via Java properties passed to FusionReactor. The FusionReactor module for CommandBox makes these easy to set, via server.json. I’ve found Groups very helpful when deploying to Swarm; they make it easy to monitor multiple container replicas as a single entity.
Application: The name says it; your applications are listed here. Even if the application is replicated across nodes, you can apply a Check to it.
- Checks are built by first selecting a metric you’d like monitored and then defining a threshold at which you’d consider it an error – for example, if the metric Memory Usage exceeds the threshold of 80%. Three additional selectors, formatted as a sentence, can then be used to further specify when the Check should be triggered. This “sentence” format is meant to be an intuitive way to define your Checks. Let’s tackle its options in reverse order:
- Check timeframe: The Cloud alerting engine runs once every 60 seconds. In order to allow time for ingesting/synchronizing data, the smallest window you can select here is 5 minutes. This is the timeframe within which FusionReactor monitors your error threshold, as well as the period of time it will take for any erroring Checks to return to an OK state. While I’m sure longer timeframes have their uses, I want my error notifications to be as close to realtime as possible, so I’ve exclusively used the “5 minutes” option here.
- Greater/Less Than: Set the error state as greater or less than the threshold.
- Data point type: There are four possible options here, three of which are self-explanatory: single, all, and average. I found the count of option to be less intuitive; it brings up an additional option displayed as ( 1/5 ), where you select the first number. The second number, I found out, refers to the timeframe of your check. Because the engine runs every minute, within a 5 minute timeframe there are 5 data points (so longer timeframes increase the second number). Selecting 1/5 would be the same as single, and selecting 5/5 is the same as all – the count of option gives you the ability to choose everything in between.
- Pick the subscription(s) that this check should trigger. Checks can trigger more than one subscription alert.
- Finally, a preview graph is provided underneath, illustrating how your Check compares to the metric historically, so that you can see when it would be triggered.
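The four data point types are easier to compare side by side in code. The following Python sketch is my reading of the options described above, not FusionReactor's implementation: the function names are hypothetical, the sample memory readings are invented, and I am assuming that single means "at least one data point" and that count of N means "at least N data points" within the timeframe.

```python
from statistics import mean


def breaches(value: float, threshold: float, comparison: str) -> bool:
    """Apply the Greater/Less Than selector to one data point."""
    if comparison == "greater":
        return value > threshold
    if comparison == "less":
        return value < threshold
    raise ValueError(f"unknown comparison: {comparison!r}")


def check_triggered(points, threshold, comparison="greater", mode="single", count=1):
    """Decide whether a Threshold check fires for one timeframe of data points.

    With the engine running once a minute, a 5 minute timeframe
    yields 5 data points, so `points` would have length 5.
    """
    if not points:
        raise ValueError("at least one data point is required")
    if mode == "average":
        return breaches(mean(points), threshold, comparison)

    hits = sum(breaches(p, threshold, comparison) for p in points)
    if mode == "single":
        required = 1
    elif mode == "all":
        required = len(points)
    elif mode == "count of":
        required = count
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return hits >= required


# Invented Memory Usage readings (percent) for one 5 minute timeframe.
memory = [72.0, 85.0, 78.0, 91.0, 79.0]

single = check_triggered(memory, 80, mode="single")               # 2 of 5 exceed 80
every = check_triggered(memory, 80, mode="all")
average = check_triggered(memory, 80, mode="average")             # mean is 81.0
two_of_five = check_triggered(memory, 80, mode="count of", count=2)
three_of_five = check_triggered(memory, 80, mode="count of", count=3)
```

With these readings, single, average, and count of 2/5 would fire, while all and count of 3/5 would not, which shows how count of fills the gap between the two extremes.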
After clicking Save to add the Check, you’ll see its details listed in the main Checks panel. As with Subscriptions, from here you can duplicate, edit, or delete it. You can also temporarily disable Checks from this page.
Configuration of Alerting in FusionReactor for Docker Swarm
Finally, we’ll move on to Alerts; fortunately, there’s nothing we need to configure here. The Alerts panel provides logged reporting of your Checks; any time that a Check changes status, it’s recorded here.
Basic logging here includes when Checks are added, paused, or deleted. These changes are recorded, but do not trigger Subscriptions. When a Check moves between the OK, WARNING, and ERROR statuses, the Subscriptions that are triggered will also be recorded here. The “View” option will open a panel with more information about the Alert/Check/Subscription.
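The logging behavior can be sketched as a small state tracker. Again, this is only my approximation of what the Alerts panel records, with hypothetical names throughout. I am assuming that a new Check starts in the OK state and that every status change notifies its Subscriptions; the real rules may well be more nuanced.

```python
STATUSES = {"OK", "WARNING", "ERROR"}
LIFECYCLE_EVENTS = {"added", "paused", "deleted"}


class AlertLog:
    """Records Check events as (check name, event, subscriptions notified?)."""

    def __init__(self):
        self.entries = []
        self._last_status = {}

    def record_lifecycle(self, check_name: str, event: str) -> None:
        """Log an added/paused/deleted event; these never notify anyone."""
        if event not in LIFECYCLE_EVENTS:
            raise ValueError(f"unknown lifecycle event: {event!r}")
        self.entries.append((check_name, event, False))

    def record_status(self, check_name: str, status: str) -> bool:
        """Log a status only when it changes; return True if it was logged."""
        if status not in STATUSES:
            raise ValueError(f"unknown status: {status!r}")
        # Assumption: a Check that has never reported is treated as OK.
        if self._last_status.get(check_name, "OK") == status:
            return False
        self._last_status[check_name] = status
        self.entries.append((check_name, status, True))
        return True


log = AlertLog()
log.record_lifecycle("High memory usage", "added")
log.record_status("High memory usage", "OK")     # no change, nothing logged
log.record_status("High memory usage", "ERROR")  # change, logged and notified
log.record_status("High memory usage", "ERROR")  # still erroring, nothing new
log.record_status("High memory usage", "OK")     # recovery, logged and notified
```

The key distinction is the third element of each entry: lifecycle events show up in the log but never reach a Subscription, while status changes do both.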
Get free trial: Alerting in FusionReactor for Docker Swarm
If you want to see the power of alerting in FusionReactor for Docker Swarm for yourself, you can get a free trial of FusionReactor Cloud: just create your account and download the software.
For more information about setting up Alerting in FusionReactor for Docker Swarm, contact the FusionReactor Cloud team.