#EMS collector vifmgr.cluscheck.hwerrors


formal reef
We see errors like this in the event log:
8/15/2023 03:10:55 flc1-04-noprod-ash-storage ALERT vifmgr.cluscheck.hwerrors: Port a4a-80 on node flc1-04-noprod-ash-storage is reporting a high number (at least 1 per 1000 packets) of observed hardware errors (CRC, length, alignment, dropped).
We added "- name: vifmgr.cluscheck.hwerrors" to our custom_ems.yaml file, but Harvest does not collect this event. We have other custom events, and Harvest collects them. Our custom_ems.yaml file is attached. We use Harvest 23.05. The ONTAP version is 9.10.1P12. Thanks!

plush trench
Hi @formal reef, I added "- name: vifmgr.cluscheck.hwerrors" to my conf/ems/9.6.0/ems.yaml file, then generated that event on a cluster that Harvest is monitoring. Harvest exports the following metric:
ems_events{cluster="umeng-aff300-01-02",cluster_uuid="cbd1757b-0580-11e8-bd9d-00a098d39e12",datacenter="dc-1",index="51388803",message="vifmgr.cluscheck.hwerrors",node="umeng-aff300-03",node_uuid="28e14eab-0580-11e8-bd9d-00a098d39e12",severity="alert"} 1

Is that what you want? Perhaps the event has not happened again since you updated your custom_ems.yaml?
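
If it helps to check for the exported metric programmatically, here is a minimal Python sketch that parses a metric line like the one above and looks for a given EMS message name. The helper names (`parse_metric_line`, `has_ems_event`) are made up for illustration and are not part of Harvest. The label parsing is deliberately simple and assumes label values contain no escaped quotes.

```python
import re

# Sample line copied from the metric shown in this thread.
SAMPLE = (
    'ems_events{cluster="umeng-aff300-01-02",'
    'cluster_uuid="cbd1757b-0580-11e8-bd9d-00a098d39e12",'
    'datacenter="dc-1",index="51388803",'
    'message="vifmgr.cluscheck.hwerrors",node="umeng-aff300-03",'
    'node_uuid="28e14eab-0580-11e8-bd9d-00a098d39e12",severity="alert"} 1'
)

# Matches key="value" pairs; assumes values have no escaped quotes.
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metric_line(line):
    """Split one exposition-format line into (name, labels, value)."""
    name, rest = line.split("{", 1)
    label_text, value = rest.rsplit("}", 1)
    labels = dict(LABEL_RE.findall(label_text))
    return name, labels, float(value)


def has_ems_event(lines, message):
    """Return True if any ems_events line carries the given message label."""
    for line in lines:
        if not line.startswith("ems_events{"):
            continue
        _, labels, value = parse_metric_line(line)
        if labels.get("message") == message and value > 0:
            return True
    return False


if __name__ == "__main__":
    print(has_ems_event([SAMPLE], "vifmgr.cluscheck.hwerrors"))
```

You would feed `has_ems_event` the lines scraped from wherever your Prometheus exporter is listening. That endpoint depends on your own harvest.yml, so it is left out here.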

formal reef
Yeah, that's what we want. I guess "event log show" still shows the alert, but those entries are not new? Maybe I misunderstood the EMS collector: it collects new events when they trigger, not existing events.

plush trench
That's right, the EMS collector collects ONTAP EMS events that happen after the poller is started. It does not query ONTAP for historical events.
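
To make the "only events after the poller starts" behaviour concrete, here is a small Python sketch of the idea. It is only an illustration of the semantics, not Harvest's actual implementation. The poller start time and the second event are hypothetical, and the timestamp format is taken from the event log line earlier in this thread.

```python
from datetime import datetime

# Timestamp format as it appears in the event log line in this thread.
TIME_FORMAT = "%m/%d/%Y %H:%M:%S"


def events_after_start(events, poller_start):
    """Keep only events whose timestamp is later than the poller start."""
    return [e for e in events if e["time"] > poller_start]


# First event is the one from the thread; the second is hypothetical.
events = [
    {"name": "vifmgr.cluscheck.hwerrors",
     "time": datetime.strptime("8/15/2023 03:10:55", TIME_FORMAT)},
    {"name": "vifmgr.cluscheck.hwerrors",
     "time": datetime.strptime("8/17/2023 09:00:00", TIME_FORMAT)},
]

# Hypothetical poller start, after the first event but before the second.
poller_start = datetime.strptime("8/16/2023 12:00:00", TIME_FORMAT)

collected = events_after_start(events, poller_start)

if __name__ == "__main__":
    for event in collected:
        print(event["time"], event["name"])
```

In this sketch the 8/15 alert is skipped because it predates the poller start, which matches what you saw: the alert is still listed in the event log, but nothing is exported until the event fires again.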