Canarytrace Listener
#
What you’ll learn
- What a Canarytrace Listener is
- How Listener works
Canarytrace Listener is an additional component that helps pick out incidents and threshold exceedances from the large volume of data that Canarytrace collects. You have graphs, dashboards and a lot of collected data, but because of tunnel vision an incident or a crossed threshold can easily be overlooked.
#
How does it work?
- Listener brings automated monitoring of threshold exceedances and alerting according to a set of rules.
- Listener provides a set of built-in rules with recommended values for metrics such as ResponseTimes, LoadEventEnd, LongTasks, Web Vitals, image size and format, etc.
- Listener evaluates problems with what the browser downloads from the server, such as JavaScript or CSS files served without compression.
- Listener works with a list of rules; each rule has a threshold, a reporter and a score. If the threshold is exceeded, Listener sends an alert via the selected reporter.
- Each threshold has a score. The score table lets you define the severity of an incident in fine detail. The score ranges from 0 to 100, and a lower score marks a more severe incident.
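The rule, threshold, reporter and score relationship described above can be sketched in a few lines of Python. This is only an illustration, not Listener's actual implementation: the `Rule` class, the `dispatch_alerts` function and the shape of the `measurements` and `reporters` arguments are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    # Hypothetical rule shape: the source only says a rule has
    # a threshold, a reporter and a score.
    metric: str
    threshold: float
    score: int
    reporters: List[str]


def dispatch_alerts(
    rules: List[Rule],
    measurements: Dict[str, float],
    reporters: Dict[str, Callable[[str], None]],
) -> List[str]:
    """Send an alert through each selected reporter when a threshold is exceeded.

    Returns the metrics whose rule fired.
    """
    fired = []
    for rule in rules:
        value = measurements.get(rule.metric)
        if value is None or value <= rule.threshold:
            continue  # no data, or threshold not exceeded
        message = f"{rule.metric}: {value} > {rule.threshold} (score {rule.score})"
        for name in rule.reporters:
            reporters[name](message)
        fired.append(rule.metric)
    return fired
```

For example, with `reporters={"slack": print}` and a rule `Rule("responseTime", 3000, 40, ["slack"])`, a measurement of `3500` prints one alert, while `100` prints nothing.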
#
Rule example
Send a notification to our Slack channel when any response has a responseTime greater than 3000 ms.
#
Score table
Description | Score | Color |
---|---|---|
needs fix! | 0-30 | red |
needs improvement! | 31-70 | orange |
good job! | 71-100 | green |
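As a small illustration, the score table can be expressed as a lookup function. The function name `classify_score` is hypothetical; the ranges, descriptions and colors are taken from the table above.

```python
from typing import Tuple


def classify_score(score: int) -> Tuple[str, str]:
    """Return (description, color) for a score, following the score table."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be in the range 0..100, got {score}")
    if score <= 30:
        return ("needs fix!", "red")
    if score <= 70:
        return ("needs improvement!", "orange")
    return ("good job!", "green")
```

A built-in rule with score `40` would therefore be shown as "needs improvement!" in orange.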
#
Rules
Rules are the cornerstone of the Canarytrace Listener and there are three types:
Internal rules
- built-in health checks for Canarytrace and Elasticsearch.
Default rules
- built-in rules for monitoring thresholds of metrics and recommendations.
Client rules
- in preparation: the user will be able to set rules, thresholds and reporters themselves.
match
#
Rule
- Search for an exact value in a label
- Condition: index `c.report` in the last hour must contain `false` in the label `passed`, with a minimum of `2` finds.
- If the condition is met, send the report to the `slack` and `events` reporters.
- You can use the operators `must` and `must_not`.
- `min` is optional.
- `reporters` can be one or more.
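The logic of a `match` rule can be sketched as follows. This is an assumption-laden illustration, not Listener's rule format or code: the function name, its parameters, the default of `1` for an omitted `min`, and the reading of `must_not` as "count documents whose label differs from the value" are all hypothetical. The documents are assumed to be already restricted to the last hour.

```python
from typing import Any, Dict, List


def match_rule_fires(
    docs: List[Dict[str, Any]],
    label: str,
    value: Any,
    operator: str = "must",
    min_finds: int = 1,  # assumed default when `min` is omitted
) -> bool:
    """Return True when enough documents satisfy an exact-value condition."""
    if operator not in ("must", "must_not"):
        raise ValueError(f"unsupported operator: {operator}")
    if operator == "must":
        finds = sum(1 for doc in docs if doc.get(label) == value)
    else:
        finds = sum(1 for doc in docs if doc.get(label) != value)
    return finds >= min_finds
```

With the condition described above, `match_rule_fires(docs, "passed", False, "must", 2)` is true once two documents from `c.report` have `passed` set to `false`.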
contains
#
Rule
- First condition: get a group of data by `field` and `value`; second condition: search for exactly `expression.field` and `expression.values`
- Condition: index `c.response` in the last hour with the label `headers.content-type` which contains `javascript` and does not contain `gzip` or `br` in the field `headers.content-encoding`, with a minimum of `10` finds.
- If the condition is met, send the report to the `slack` and `events` reporters.
- You can use the operators `must` and `must_not`.
- `min` is optional.
- `expression.value` can be one or more.
- `reporters` can be one or more.
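The two-step condition of a `contains` rule might look like this in code. It is a sketch under assumptions, not Listener's implementation: the function and parameter names are hypothetical, documents are assumed to be flat dictionaries keyed by names such as `headers.content-type`, matching is a plain substring test, and an omitted `min` is assumed to mean `1`.

```python
from typing import Any, Dict, List


def contains_rule_fires(
    docs: List[Dict[str, Any]],
    field: str,
    value: str,
    expr_field: str,
    expr_values: List[str],
    operator: str = "must_not",
    min_finds: int = 1,  # assumed default when `min` is omitted
) -> bool:
    """Group documents by field/value, then test the expression on each one."""
    if operator not in ("must", "must_not"):
        raise ValueError(f"unsupported operator: {operator}")

    # First condition: keep documents whose `field` contains `value`.
    group = [doc for doc in docs if value in str(doc.get(field, ""))]

    # Second condition: does `expr_field` contain any of the expected values?
    def has_any(doc: Dict[str, Any]) -> bool:
        text = str(doc.get(expr_field, ""))
        return any(v in text for v in expr_values)

    if operator == "must":
        finds = sum(1 for doc in group if has_any(doc))
    else:
        finds = sum(1 for doc in group if not has_any(doc))
    return finds >= min_finds
```

For the JavaScript compression rule above, the call would be `contains_rule_fires(docs, "headers.content-type", "javascript", "headers.content-encoding", ["gzip", "br"], "must_not", 10)`.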
range
#
Rule
- Search for a greater or lesser value
- Condition: index `c.performance-entries` in the last hour must have `responseTime` greater than `3000`, with a minimum of `10` finds.
- If the condition is met, send the report to the `slack` and `events` reporters.
- You can use the operators `gte` and `lte`.
- `min` is optional.
- `reporters` can be one or more.
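A `range` rule can be sketched in the same spirit. Everything named here is hypothetical (function, parameters, the default of `1` for an omitted `min`), and the documents are assumed to be already limited to the last hour. Note that `gte` and `lte` are read here as "greater than or equal" and "less than or equal", so a value exactly at the threshold counts as a find.

```python
from typing import Any, Dict, List


def range_rule_fires(
    docs: List[Dict[str, Any]],
    field: str,
    threshold: float,
    operator: str = "gte",
    min_finds: int = 1,  # assumed default when `min` is omitted
) -> bool:
    """Return True when enough documents are above (gte) or below (lte) a threshold."""
    if operator not in ("gte", "lte"):
        raise ValueError(f"unsupported operator: {operator}")
    finds = 0
    for doc in docs:
        value = doc.get(field)
        if not isinstance(value, (int, float)):
            continue  # skip documents without a numeric value
        if operator == "gte" and value >= threshold:
            finds += 1
        elif operator == "lte" and value <= threshold:
            finds += 1
    return finds >= min_finds
```

The response-time condition above corresponds to `range_rule_fires(docs, "responseTime", 3000, "gte", 10)`.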
#
Default rules
- Latest version is `1.6`
- This table of rules is still under development
The latest version of Listener contains these built-in rules:
# | Title | Index | Condition | Min count / hour | Score |
---|---|---|---|---|---|
1 | Failed check your page! | c.report | test step failed | 2 | 10 |
2 | Encoding of response with Javascript files must contains gzip or brotli compression. | c.response | gzip or br missing in headers.content-encoding | 10 | 40 |
3 | Encoding of response with CSS files must contains gzip or brotli compression. | c.response | gzip or br missing in headers.content-encoding | 10 | 40 |
4 | Higher response time. | c.performance-entries | > 3000ms | 10 | 40 |
5 | WebVitals LCP exceeded. | c.audit | > 2500ms | 5 | 40 |
6 | WebVitals TTI exceeded. | c.audit | > 5000ms | 5 | 40 |
7 | WebVitals CLS exceeded. | c.audit | > 0.1 | 5 | 40 |
8 | LoadEventEnd exceeded. | c.performance-entries | > 4000ms | 5 | 40 |
- `title`: name of the rule
- `index`: where Listener looks for incidents
- `condition`: if the condition is met, a report will be sent
- `min count / hour`: a report is sent only if at least this many incidents are found
- `score`: the reporter assigns a label and color according to the score
#
Internal rules
Internal check | Score | Description |
---|---|---|
⚙️ Canarytrace is down | 0 | Canarytrace did not send data to c.report-* index |
⚙️ Elasticsearch health is yellow | 50 | Need improvements |
⚙️ Elasticsearch health is red | 0 | Elasticsearch is probably down |
⚙️ Java Heap is higher than 70% | 50 | Close to overloaded and needs improvement |
⚙️ Java Heap is higher than 85% | 50 | Elasticsearch nodes are overloaded |
- `canaryIsLive`: Canarytrace run
- `healthCheck`: Elasticsearch status and pending tasks
- `nodesCheck`: Elasticsearch Java heap, max RAM, max disk used, max CPU
#
Reporters
- You can use one or more reporters
#
Slack reporter
#
Email reporter
#
Events reporter
#
CLI
- Canarytrace Listener is configured through environment variables
Elasticsearch
- `ELASTIC_REQUEST_TIMEOUT` default `1000` ms
- `ELASTIC_CLUSTER` default `http://localhost:9200`
- `ELASTIC_HTTP_AUTH` default no
- `INDEX_PREFIX` default is `c.`
- `QUERY_SIZE` default is `20`, lower is better for performance
- `KIBANA_ENDPOINT` link to Kibana
Slack
- `SLACK_WEBHOOK_URL` for the client's Slack
Email (all params are required)
- `EMAIL_SMTP_SERVER`
- `EMAIL_SMTP_PORT` default `465`
- `EMAIL_SMTP_USER`
- `EMAIL_SMTP_PASS`
Healthcheck
- `CANARY_HEALTHCHECK` default `allow`
- `ELASTIC_HEALTHCHECK` default `allow`
- `ELASTIC_NODES_HEALTHCHECK` default `allow`
Tags
- `TAGS` example `eshop;monitoring`
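To make the defaults concrete, here is a minimal sketch of how a program could read some of these variables with the documented default values. It is not Listener's own code: the `load_settings` function and the dictionary keys are hypothetical, and splitting `TAGS` on `;` is an assumption based on the example value.

```python
import os
from typing import Any, Dict, Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Read selected Listener variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "elastic_request_timeout_ms": int(env.get("ELASTIC_REQUEST_TIMEOUT", "1000")),
        "elastic_cluster": env.get("ELASTIC_CLUSTER", "http://localhost:9200"),
        "index_prefix": env.get("INDEX_PREFIX", "c."),
        "query_size": int(env.get("QUERY_SIZE", "20")),
        "email_smtp_port": int(env.get("EMAIL_SMTP_PORT", "465")),
        # Assumption: tags are separated by semicolons, as in `eshop;monitoring`.
        "tags": [tag for tag in env.get("TAGS", "").split(";") if tag],
    }
```

Calling `load_settings({"TAGS": "eshop;monitoring"})` yields the tags `eshop` and `monitoring` and the default values for everything else.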
#
How to run Canarytrace Listener
- Canarytrace is a dockerized component with complete Kubernetes objects.
- Kubernetes deployments are not public.
- `namespace.yaml` - creates a `canarytrace` namespace.
- `config-rules.yaml` - collection of rules.
- `secret.yaml` - contains all sensitive configuration (Elasticsearch, Slack webhook, SMTP user, etc.)
- `listener.yaml` - main deployment script of Canarytrace Listener / CronJob.
- Found a mistake or have a question? Please create an issue, thanks 👍
- Have more questions? Contact us.