#Agent side alert to Slack configured and tested but not sent if triggered

coarse pier
#

Hi. I've configured the agent to send alerts to Slack and tested it with the alarm-notify.sh script, and it works. I've also configured the filecheck plugin to check a file. That works too: the charts are present in the web UI. Finally, I have an alert configured for filecheck like this:

# cat health.d/filecheck.conf
template: filecheck_file_size_template
on: filecheck.file_size
lookup: max -1m absolute foreach *
calc: $this / 1000000000
units: GB
every: 60s
crit: $this > (($status == $CRITICAL) ? (3)) : (5)
warn: $this > (($status >= $WARNING) ? (1)) : (3)
info: Head dump file too big ($this GB)
red: 5
green: 1

This alert is present in alarms?all and alarm_log (with statuses UNDEFINED, UNINITIALIZED and REMOVED, but never CRITICAL or WARNING).
What am I doing wrong?

#

It is so complicated when you can't see in the web interface whether an alert is configured and whether it works ((

sharp grove
#

Hey, you are right, it is complicated. We are working on configuring Netdata and its components via the UI. Alerts and data collection jobs will be the first things to support it.

#

I think it is not possible to apply a template with foreach to the filecheck charts, because the dimension names are file paths and contain /. I will update the filecheck collector tomorrow so that it creates a chart for each file instead of a dimension, as it does now. That will make alerts easier to configure. I will provide an alert example for you.

#

This change will be in Friday's nightly release.

coarse pier
#

Many thanks! I'll be waiting for the update!

#

Oh, I forgot: I'm using the stable release, so I can't get the latest changes from nightly. Can I just change the template to an alarm? There is actually just a single file that I want to monitor...

sharp grove
#

Yep, an alarm will do. I will give you an example alarm tomorrow.

coarse pier
#

Hi. Any updates?

coarse pier
#

I've rewritten it myself as the following sample (ignore the values, they are experimental):

alarm: filecheck_file_size
on: filecheck.file_size
type: Storage
lookup: max -5m
calc: $this / 1000000000
units: GB
every: 1m
crit: $this > 1
warn: $this > 0.5
info: Head dump file too big ($this GB)
red: 1
green: 0.2

And this time the alarm is not present in the alarms?all output at all... ( What the hell happened? How do I debug this?
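One way to check whether the agent loaded an alarm at all is to read the alarms?all output programmatically. The following is a minimal sketch, not an official tool. It assumes the agent listens on the default port 19999 and that the response is a JSON object with an "alarms" map whose entries carry "name", "chart" and "status" fields. The helper names fetch_alarms and alarms_for_chart_prefix are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_alarms(base_url="http://localhost:19999"):
    """Download the full alarm list from a local agent (assumed default port)."""
    with urlopen(f"{base_url}/api/v1/alarms?all", timeout=5) as resp:
        return json.load(resp)


def alarms_for_chart_prefix(payload, prefix):
    """Return (name, chart, status) for every alarm whose chart id starts with prefix.

    The payload shape is an assumption: {"alarms": {key: {"name", "chart", "status"}}}.
    """
    rows = []
    for key, alarm in sorted(payload.get("alarms", {}).items()):
        chart = alarm.get("chart", "")
        if chart.startswith(prefix):
            rows.append((alarm.get("name", key), chart, alarm.get("status", "?")))
    return rows


# Typical use against a running agent:
#   for name, chart, status in alarms_for_chart_prefix(fetch_alarms(), "filecheck"):
#       print(name, chart, status)
```

If the list comes back empty for the filecheck prefix, the alarm definition was not attached to any chart, which points at the config rather than at the notification setup.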

sharp grove
#

Because this alarm is wrong.

#

Make sure:

  • the alarm key is unique if you plan to have multiple alarms
  • the on value is the chart id, not the context (so not filecheck.file_size)
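The two checks above can be sketched as a small lint script. This is only an illustration under stated assumptions: the parser handles plain "key: value" lines and nothing else, and the sets of known chart ids and contexts have to be supplied by the caller (for example, copied from the agent's chart list). The names parse_health_config and lint are hypothetical and not part of Netdata.

```python
def parse_health_config(text):
    """Split health config text into entities; each starts at an alarm: or template: line."""
    entities = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")  # split on the first colon only
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key in ("alarm", "template"):
            current = {"kind": key, "name": value, "fields": {}}
            entities.append(current)
        elif current is not None:
            current["fields"][key] = value
    return entities


def lint(entities, chart_ids, contexts):
    """Report duplicate alarm names and alarms whose 'on' is not a known chart id."""
    problems = []
    seen = set()
    for entity in entities:
        if entity["kind"] != "alarm":
            continue  # templates attach to contexts, so they are not checked here
        name = entity["name"]
        if name in seen:
            problems.append(f"duplicate alarm name: {name}")
        seen.add(name)
        target = entity["fields"].get("on", "")
        if target not in chart_ids:
            hint = " (this is a context, not a chart id)" if target in contexts else ""
            problems.append(f"alarm {name}: 'on: {target}' is not a known chart id{hint}")
    return problems
```

Run against the alarm from the earlier message, with filecheck_files_example.file_size as the only known chart id and filecheck.file_size as a known context, it reports that the on value is a context.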
#

The following alarm works for me:

 alarm: filecheck_file_size_myfile
    on: filecheck_files_example.file_size
  calc: ${/opt/sample.txt} / 1000000000
 units: GB
 every: 10s
  warn: $this > (($status >= $WARNING)  ? (1) : (3))
  crit: $this > (($status == $CRITICAL) ? (3) : (5))
  info: Head dump file too big
    to: sysadmin
#

/opt/sample.txt is the dimension (file path)
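The warn and crit expressions in the alarm above use $status to apply hysteresis: the alarm is raised at a high threshold and only clears at a lower one. The sketch below simulates that logic outside Netdata. It assumes crit is checked before warn and that only the ordering CLEAR < WARNING < CRITICAL matters; the numeric values in the enum are illustrative, and evaluate and simulate are hypothetical helper names.

```python
from enum import IntEnum


class Status(IntEnum):
    # Only the ordering matters for this sketch; the numbers are illustrative.
    CLEAR = 1
    WARNING = 3
    CRITICAL = 4


def evaluate(value_gb, status):
    """Apply the crit and warn expressions from the alarm to one value in GB."""
    # crit: $this > (($status == $CRITICAL) ? (3) : (5))
    crit_threshold = 3 if status == Status.CRITICAL else 5
    # warn: $this > (($status >= $WARNING) ? (1) : (3))
    warn_threshold = 1 if status >= Status.WARNING else 3
    if value_gb > crit_threshold:
        return Status.CRITICAL
    if value_gb > warn_threshold:
        return Status.WARNING
    return Status.CLEAR


def simulate(sizes_gb, status=Status.CLEAR):
    """Feed a sequence of file sizes through the alarm and record each resulting status."""
    history = []
    for size in sizes_gb:
        status = evaluate(size, status)
        history.append(status)
    return history


if __name__ == "__main__":
    for size, state in zip([0.5, 3.5, 2.0, 0.8], simulate([0.5, 3.5, 2.0, 0.8])):
        print(f"{size} GB -> {state.name}")
```

With these thresholds a file has to grow past 3 GB to raise a warning, but the warning only clears once it shrinks to 1 GB or below, which avoids flapping when the size hovers around a single threshold.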

coarse pier
#

Thank you, I'll try

#

So the variable in calc should be the file path, right?

sharp grove
#

yes

coarse pier
#

@sharp grove thank you! This works!

sharp grove
#

Keep in mind that I am going to change this collector, and this alarm will then stop working. Keep an eye on the deprecation notes of the upcoming releases.

coarse pier
#

Interesting thing: there is no recovery notification in Slack, but there is one by email... It happened on only one of my two servers.