#Questions about EMS Event Subscription

prisma steeple
#

I have read through the EMS documentation, as well as the following KB articles, but I am still confused about EMS messages:
https://docs.netapp.com/us-en/ontap/error-messages/index.html
https://kb.netapp.com/onprem/ontap/OS/What_is_EMS_and_what_is_the_difference_between_the_messages_in_etc_messages_and_etc_log_ems_files
https://kb.netapp.com/mgmt/AIQUM/How_to_configure_and_receive_alerts_from_ONTAP_EMS_Event_Subscription_in_Active_IQ_Unified_Manager

From what I understand, if we forward our EMS alerts to AIQUM, it really does nothing unless we specifically subscribe to each message type to get the resulting alert from AIQUM. In ONTAP 9.8, there are 7162 different EMS messages (according to "event route show"). Even if I were to subscribe only to ERROR and above, that is still 3270 different messages that we would need to subscribe to individually. There does not appear to be a way to subscribe to all EMS messages ERROR and above in AIQUM: "Multiple events (regardless of delimiter) or 'wildcard' events (example: snapmirror.status*) cannot be used in the 'EMS event name' field or the subscription process will fail."

So, how do you make sure that you are getting important EMS events about your clusters? If we are not subscribed to all of these events, are we missing important alerts about our clusters?

vestal dune
#

@glacial sand you around for some feedback/Q&A?

prisma steeple
#

Sorry, just realized I posted this in "ontapi-api". Meant to post it in just "ontap". Don't suppose there is a way to move it?

As far as the events I'm concerned about, that is part of the question... how do I know which events I SHOULD be concerned about? I think I would just want to be alerted on anything that is ERROR or above and then filter out any that turn out to be just noise. That way we can be sure we're not missing anything important.
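To make the idea concrete, here is a rough sketch of "alert on ERROR and above, then suppress known noise" in Python. It is not an AIQUM or ONTAP API. The record shape, the `select_alerts` helper, and the `example.*` event names are all made up for illustration, and the severity order is assumed to be the ONTAP 9 EMS one (EMERGENCY, ALERT, ERROR, NOTICE, INFORMATIONAL, DEBUG).

```python
# Assumed EMS severity order, most severe first.
SEVERITIES = ["EMERGENCY", "ALERT", "ERROR", "NOTICE", "INFORMATIONAL", "DEBUG"]
RANK = {name: i for i, name in enumerate(SEVERITIES)}


def select_alerts(events, min_severity="ERROR", suppressed=frozenset()):
    """Keep events at min_severity or worse whose names are not suppressed.

    `events` is a list of dicts with hypothetical keys "name" and "severity".
    """
    threshold = RANK[min_severity]
    return [
        e for e in events
        if RANK[e["severity"].upper()] <= threshold
        and e["name"] not in suppressed
    ]


# Hypothetical sample data; these are not real EMS message names.
events = [
    {"name": "example.disk.failed", "severity": "EMERGENCY"},
    {"name": "example.noisy.error", "severity": "ERROR"},
    {"name": "example.volume.info", "severity": "INFORMATIONAL"},
    {"name": "example.link.down", "severity": "ALERT"},
]

# Names we have reviewed and decided are just noise.
noise = {"example.noisy.error"}

alerts = select_alerts(events, suppressed=noise)
print([e["name"] for e in alerts])
```

The suppression set would start empty and grow as each noisy event is reviewed, so nothing at ERROR or above is dropped without someone having looked at it first.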

forest cloak
#

It may be a challenge to filter it out. What will happen is that UM will generate an event called 'Error EMS Received'. You will have to click on the event to go into the details.

#

Bear in mind that not every error level event may be important or applicable.

prisma steeple
# forest cloak Here is the list of automatically subscribed events https://docs.netapp.com/us-e...

Thanks, @forest cloak, that link is helpful, because I had never seen that before. I was never clear on what exactly AIQUM was monitoring out of the box.

We have subscribed to some events that we know are important (off the top of my head, I know that inodes nearly full is one of them), but how do we know what other alerts might be important if we have never gotten alerted on them? My concern is that something could be going wrong on the system that we aren't getting alerts on, and we wouldn't know about it unless we go in and look at the event logs.