#EMS setting ignored
Hello @serene saddle, short question about EMS. They were implemented with version 3.2 and enabled by default, right?
Correct
Thanks 😊 I need to take a deeper look at that
Do you have an example command to send test events?
@ruby frost You can try below from CLI
set diag
event generate -message-name arw.volume.state -values 1 2 3 4 5 6 7 8 9 10 11 12
Thanks @alpine heart just testing with sk.panic
Executed the command but did not see any entry in Prometheus on the NAbox 🤔
@ruby frost I've tested this locally (not on NAbox) and it works. Could you please share your logs (https://netapp.github.io/harvest/23.08/help/log-collection/)? Also, check whether the record is in the Prometheus history by running the query ems_events[24h]. Running ems_events alone only shows instant records.
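The difference between the two queries can be sketched against the Prometheus HTTP API. This is a minimal sketch, not part of Harvest or NAbox: the helper names `build_query_url` and `fetch_series` are hypothetical, and the Prometheus address is whatever your own setup uses.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url: str, expr: str) -> str:
    """Build the URL of an instant query against the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": expr})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def fetch_series(base_url: str, expr: str) -> list:
    """Run the query and return the list of matching series.

    With a range selector such as ems_events[24h], each series carries a
    "values" list holding the samples of the last 24 hours. With the bare
    metric name, only the current sample is returned.
    """
    with urllib.request.urlopen(build_query_url(base_url, expr), timeout=10) as resp:
        payload = json.load(resp)
    return payload["data"]["result"]
```

For example, `fetch_series("http://localhost:9090", "ems_events[24h]")` would list every EMS series that had a sample in the last day, assuming Prometheus listens on that address.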
event generate -message-name sk.panic -values 1
Hi @alpine heart, as requested, I've tested it again and sent the NAbox logs to the support address
@ruby frost Thanks. Can you check whether the Ems collector is enabled? I don't see any logs for the Ems collector in the shared logs.
Thanks for checking. The checkbox in the nabox configuration for ems is enabled. Could it be that the config file also needs to be adjusted? @serene saddle is there something more to do?
I tried enabling Ems collector in NAbox 3.3 (2023-07-25) and it shows me EMS logs as below.
nabox-harvest2 | 2023-08-28T09:19:42Z INF collector/collector.go:483 > Collected Poller=A250-41-42-43 apiMs=1062 calcMs=0 collector=Ems:Ems instances=0 metrics=0 parseMs=0 pluginMs=0
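When scanning poller logs for lines like the one above, the key=value fields can be pulled out with a small parser. This is a rough sketch that assumes the log format shown in this thread; `parse_collected` and `collected_nothing` are hypothetical helper names, not Harvest functions.

```python
import re

# Matches key=value pairs such as "collector=Ems:Ems" or "instances=0".
FIELD_RE = re.compile(r"(\w+)=(\S+)")


def parse_collected(line: str) -> dict:
    """Extract the key=value fields from a Harvest 'Collected' log line."""
    return dict(FIELD_RE.findall(line))


def collected_nothing(fields: dict) -> bool:
    """True when the collector ran but reported no instances and no metrics."""
    return fields.get("instances") == "0" and fields.get("metrics") == "0"
```

A line with `collector=Ems:Ems` shows that the Ems collector is running, even when `instances=0` because no tracked event has occurred yet. If no such line exists at all, the collector never started.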
@ruby frost From the logs you've provided, it appears that the EMS collector is set up in the Harvest config, but it doesn't seem to be running. We can try by restarting the pollers.
I've just restarted the whole NAbox and generated a new EMS event; it still looks the same
Could you share the output of the command below?
dc exec -w /conf nabox-harvest2 /netapp-harvest/bin/harvest doctor --print
<cluster>:
  addr: -REDACTED-
  autostart: '1'
  collectors:
    - Zapi
    - ZapiPerf
    - Rest
    - Ems
  datacenter: ZOI
  password: -REDACTED-
  prometheus_port: 12994
  use_insecure_tls: true
  username: -REDACTED-
default:
  send_autosupport_stats: '0'
It looks like the EMS collector is enabled
Yes, it looks enabled. Could you share your NAbox and Harvest versions?
Nabox 3.3 and harvest 23.08
@ruby frost Let's examine the startup logs for any potential issues that might not have been evident in the logs you previously shared. Below are the steps.
- Restart the Harvest service using the dc down command.
- Wait for approximately 10 minutes to allow the service to fully restart and generate new logs.
- Share the new logs at ng-harvest-files@netapp.com.
Additionally, let's try running the Ems collector in debug mode. To do this, please execute the following commands, replacing POLLERNAME as applicable, and share the output.
docker exec -it nabox-harvest2 sh
/netapp-harvest/bin/poller --poller POLLERNAME --loglevel=2 --collectors Ems
@alpine heart The poller log contains the following error: init collector-object error="auth failed => 401 Unauthorized" Poller=<cluster> collector=Ems object=Ems
That is interesting, because the user we have defined has the readonly role for everything
Thanks. Okay we have found the problem. Let's verify your role permissions, as EMS Collector requires permissions for REST calls. Could you run the following CLI command, replacing ROLE with your role name? Permissions details are available here https://netapp.github.io/harvest/23.08/prepare-cdot-clusters/
security login show -role ROLE
security login role show -role ROLE
User/Group                 Authentication                 Acct   Authentication
Name           Application Method        Role Name        Locked Method
nabox          ontapi      password      readonly         no     none

           Role          Command/                                        Access
Vserver    Name          Directory                                 Query Level
<cluster>  readonly      DEFAULT                                         readonly
                         security                                        readonly
                         security login password                         all
                         security login publickey                        all
                         security login role show-user-capability        all
                         set                                             all
we currently use the default readonly role
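The output above lists only the ontapi application for the user, while REST-based collectors such as Ems also need the http application. That check can be sketched as follows. This is illustrative only: `login_applications` and `missing_applications` are hypothetical helpers, and the required set is an assumption drawn from this thread rather than an official list.

```python
# Assumption from this thread: Zapi collectors log in through "ontapi",
# while Rest and Ems collectors log in through "http".
REQUIRED_APPS = frozenset({"ontapi", "http"})


def login_applications(output: str, user: str) -> set:
    """Collect the applications listed for a user in 'security login show' output."""
    apps = set()
    for line in output.splitlines():
        cols = line.split()
        # Data rows start with the user name, followed by the application.
        if len(cols) >= 2 and cols[0] == user:
            apps.add(cols[1])
    return apps


def missing_applications(output: str, user: str, required=REQUIRED_APPS) -> set:
    """Return the required applications the user has no login entry for."""
    return set(required) - login_applications(output, user)
```

Run against the output above, `missing_applications(output, "nabox")` returns `{"http"}`, which matches the 401 Unauthorized error seen on REST calls.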
Okay. Can you give HTTP permission to the user nabox via System Manager, as shown in the attached screenshot?
You can do the same via CLI as below, replace USER/ROLE as applicable
security login create -user-or-group-name USER -application http -authentication-method password -role ROLE
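Once the http login exists, REST access can be checked directly before restarting the pollers. The sketch below makes several assumptions: that the EMS events are served from `/api/support/ems/events` (the path used as the query in the Harvest EMS template), that the account uses password authentication, and that `build_ems_probe` and `probe_status` are hypothetical helper names.

```python
import base64
import ssl
import urllib.error
import urllib.request


def build_ems_probe(addr: str, username: str, password: str) -> urllib.request.Request:
    """Build a basic-auth GET request for a single EMS event record."""
    url = f"https://{addr}/api/support/ems/events?max_records=1"
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def probe_status(req: urllib.request.Request, insecure_tls: bool = False) -> int:
    """Send the request and return the HTTP status code (401 if auth fails)."""
    ctx = ssl.create_default_context()
    if insecure_tls:
        # Mirrors use_insecure_tls: true in the poller config; skips certificate checks.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(req, context=ctx, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

A status of 200 would suggest the user can make REST calls, while 401 would point back at the login or role configuration.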
The Unauthorized error is gone, but now I see the following in the log:
Issue while reading prometheus port Poller=<cluster> exporter=nabox-prometheus
Unable to init exporter error="missing parameter => port" Poller=iefszoisc10 name=nabox-prometheus
exporter (nabox-prometheus) requested by (Ems:Ems) not available Poller=<cluster>
You can ignore this error; it only appears because we were running the collector in debug mode.
Let's restart Harvest Pollers now and try running your ems test generation event. It should work this time.
Now it works, thank you @alpine heart for your great support 😊
Gr8!
Hello, I enabled the following collectors:
- Zapi
- ZapiPerf
- Rest
- Ems
I have the right role and applications:
                                                                 Second
User/Group                 Authentication                 Acct   Authentication
Name           Application Method        Role Name        Locked Method
harvest2       http        password      harvest2-role    no     none
harvest2       ontapi      password      harvest2-role    no     none
and I also have an event with severity EMERGENCY:
Time                Node     Severity    Event
9/7/2023 14:10:00   <NODE>   EMERGENCY   monitor.globalStatus.critical: Power Supply Status Critical: PSU2. Disk shelf fault.
but I still don't see the EMS event in the Health dashboard
@woven fulcrum By default, Harvest is configured to track EMS events as specified in this template: https://github.com/NetApp/harvest/blob/main/conf/ems/9.6.0/ems.yaml. Are you not seeing these events within Harvest? If an event is missing from this template, it will need to be added.
Thank you! I thought every EMERGENCY event was tracked, because the panel description says: "The EMS collector gathers EMS events as defined in your ems.yml file. This panel displays events with emergency severity that occurred within the selected time range."
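The rule described above (an event is collected only if its name appears in the template) can be sketched as a lookup. This is a simplified illustration: it assumes the template has a top-level `events` list whose entries carry a `name` key, and the dictionary below is a made-up sample holding two event names mentioned in this thread, not the real contents of ems.yaml. Loading the actual file would need a YAML parser, which is left out here.

```python
# Hypothetical, already-parsed template; the real ems.yaml lists many more events.
SAMPLE_TEMPLATE = {
    "name": "EMS",
    "events": [
        {"name": "arw.volume.state"},
        {"name": "sk.panic"},
    ],
}


def tracked_event_names(template: dict) -> set:
    """Return the event names listed under the template's 'events' key."""
    return {event["name"] for event in template.get("events", [])}


def is_tracked(template: dict, event_name: str) -> bool:
    """True when the event name appears in the template, regardless of severity."""
    return event_name in tracked_event_names(template)
```

With this sample, `is_tracked(SAMPLE_TEMPLATE, "monitor.globalStatus.critical")` is false, so such an event would stay invisible until its name is added to the template, even at EMERGENCY severity.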